
THE UNIVERSITY OF CALGARY

A Psychovisually-Based Objective Image Quality

Evaluator for DCT-Based Lossy Data Compression

by

Ruby Wai-Shan Chan

A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN

PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

DEPARTMENT OF MECHANICAL AND MANUFACTURING

ENGINEERING

CALGARY, ALBERTA

August, 2001

© Ruby Wai Shan Chan 2001

National Library of Canada / Bibliothèque nationale du Canada

Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


ABSTRACT

In this thesis, we propose an algorithm for evaluating the quality of DCT-based

compressed images, called the Psychovisually-Based Objective Image Quality Evaluator

(POIQE). The POIQE evaluates the image quality using two psychovisually-based

fidelity indexes: blockiness and similarity. Blockiness measures the patterned square

artifact created as a by-product of the lossy DCT-based compression technique used by

JPEG and MPEG, while similarity measures the perceivable detail remaining after

compression. The blockiness and similarity are combined into a single POIQE index

used to assess quality. The POIQE model is tuned using subjective assessment results

from five subjects evaluating six sets of images. Then, the capability of the model is

verified by validation experiments involving four new subjects and five new sets of

images.

ACKNOWLEDGEMENT

This section is probably the most difficult but enjoyable section to write. Difficult for the

fear that I might miss anyone but enjoyable for the joy that I am finally done. Starting off

this section with a contradicting and complicated feeling, I believe the first person that I

will have to thank is my supervisor Dr. Peter Goldsmith. Being a mentor and a

supervisor, Dr. Goldsmith provided me with all the valuable advice and guidance that a

student could ever ask for. Also, I would like to thank Dr. R. Rangayyan for teaching me

the first course in digital image processing and for introducing me to my favorite book

"Digital Image Processing" by Gonzalez and Woods.

Special thanks to my pals and James Mykytiuk for their support during the course of my

thesis work. Great appreciation goes to my family. I wish to express my gratefulness to

my sisters, Vicky and Ivy, and my brother-in-law Herbert for all their encouragement,

care and companionship. Finally, deeply from my heart, I would like to say a million

thanks to my parents for all their love and support.

To my parents, Joan and Daniel

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACRONYM

1.1 THE NEED FOR A PSYCHOVISUALLY-BASED OBJECTIVE IMAGE QUALITY EVALUATOR (POIQE)
1.1.1 The Growth of Digital Data Communication
1.1.2 Digital Data Compression and its Quality Criteria
1.1.3 Psychovisually-based Objective Image Quality Evaluator (POIQE)
1.2.1 Hypothesis
1.2.2 Objective
1.2.3 Scope
1.4.1 Related Research on Psychological Studies of the Human Visual Perception
1.4.2 Related Research on Quality Measurement Tools

2.1.1 The Nature and Physics of Optics
2.1.2 The Human Visual Perception
2.1.3 HVS as a Shift-Invariant Linear System
2.1.4 Brightness Perception
2.1.5 Color Perception
2.1.6 Dark Adaptation and Motion Perception
2.1.7 Sequential Perception
2.2 DIGITAL IMAGE COMPRESSION
2.2.1 Data Redundancy
2.2.2 Fundamentals of Image Compression
2.2.3 Lossless Compression/Encoding
2.2.4 Lossy Compression/Encoding
2.2.5 Lossy DCT-based Compression

FIDELITY ASSESSMENT AND CRITERIA
3.1.1 Subjective Evaluation
3.1.2 Subjective Assessment Methodology
3.3.1 Effect of Patterned/Structured Artifact
3.3.2 Using MSE as a Quality Indicator for Lossy DCT-based Compressed Image

POIQE MODEL DESIGN
4.1.1 Blockiness Evaluator
4.1.2 Similarity Identifier
4.1.3 Merger
4.2 THE CHARACTERISTIC OF THE MODEL

EXPERIMENTAL TUNING AND VALIDATION OF THE POIQE MODEL
5.1.1 Purpose
5.1.2 Method and Procedure
5.1.3 Experimental Images
5.1.4 Subjective Quality Evaluation Data
5.2 ANALYSIS
5.2.1 Blockiness Index
5.2.2 Similarity Index
5.2.3 POIQE Index
5.2.4 Errors and Observations
5.3.1 Purpose
5.3.2 Method and Procedure
5.3.3 Validation Result

References
Appendix A Lossless Image Compression Statistics
Appendix B Tuning and Validation Experiments' Image Sets
Appendix C Video Format and Color Spaces of Composite and Component TV Systems

LIST OF FIGURES

ANATOMY OF THE HUMAN EYE
ROD AND CONE'S SPATIAL PATTERNS
VISUAL ANGLE
IMAGE TRANSFORMATIONS
JUST DETECTABLE CONTRAST THRESHOLD
SINE-WAVE GRATING
ADELSON'S DIAMOND SHAPED PATTERNS
CONTRAST SENSITIVITIES FOR LUMINANCE AND CHROMINANCE [HPFL96]
CRITICAL FLICKER FREQUENCIES [R0~83]
IMAGE FORMAT
BLOCK DIAGRAM OF JPEG COMPRESSION PROCEDURES
GROUP OF PICTURE, SLICE, MACROBLOCKS & SUBIMAGE BLOCKS
BLOCK DIAGRAM OF MPEG COMPRESSION PROCEDURES
SUBJECTIVE ASSESSMENT METHODOLOGY
EXAMPLE OF OBJECTIVE ASSESSMENT
STRUCTURED ARTIFACT
STRUCTURED ARTIFACT CAUSED BY LOSSY DCT BASED COMPRESSION
BLOCK DIAGRAM OF POIQE
BLOCKINESS ARTIFACT
COMPUTATION OF BLOCKINESS INDEX
BLOCK DIAGRAM OF BLOCKINESS EVALUATOR
BLOCK DIAGRAM OF SIMILARITY IDENTIFIER
BLOCK DIAGRAM OF MERGER
SATURATION OF THE HUMAN SUBJECTIVE EVALUATION RESULT
SORTING AND GROUPING OF SET #1 IMAGES
QUALITY MATCHING OF SET #1 AND SET #2 IMAGES
ASSIGNING GROUP RANGE
PLOT OF COMPRESSED DATA SIZE
PLOT OF COMPRESSION RATIO
PLOT OF SUBJECTIVE QUALITY EVALUATION
PLOT OF STANDARD DEVIATION FOR SUBJECTIVE QUALITY EVALUATION
PLOT OF BLOCKINESS INDEX
PLOT OF SIMILARITY INDEX
PLOT OF MODIFIED BLOCKINESS INDEX
PLOT OF MODIFIED SIMILARITY INDEX
POIQE INDEX
POIQE INDEX FOR 'CLIFFORD'
POIQE INDEX FOR 'KEYS'
POIQE INDEX FOR 'GIRL & APPLE'
POIQE INDEX FOR 'LENS COVER'
POIQE INDEX FOR 'ROSE'
POIQE INDEX FOR 'SUNGLASSES'
POIQE INDEX FOR 'BUS'
POIQE INDEX FOR 'CARS'
POIQE INDEX FOR 'COMPUTER'
POIQE INDEX FOR 'TABLE'
POIQE INDEX FOR 'MOUSE'
'CLIFFORD' IMAGES
'KEYS' IMAGES
'GIRL & APPLE' IMAGES
'LENS COVER' IMAGES
'ROSE' IMAGES
'SUNGLASSES' IMAGES
'BUS' IMAGES
'CARS' IMAGES
'COMPUTER' IMAGES
'TABLE' IMAGES
'MOUSE' IMAGES
CCIR-601 DIGITAL VIDEO FORMAT

LIST OF TABLES

TABLE 2-1 DIGITAL DATA REDUNDANCIES
TABLE 2-2 MPEG VERSION LIST
TABLE C-1 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'CLIFFORD'
TABLE C-2 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'KEYS'
TABLE C-3 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'GIRL & APPLE'
TABLE C-4 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'LENS COVER'
TABLE C-5 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'ROSE'
TABLE C-6 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'SUNGLASSES'
TABLE C-7 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'BUS'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'CARS'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'COMPUTER'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'TABLE'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'MOUSE'
TABLE OF BLOCKINESS INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF SIMILARITY INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF MODIFIED BLOCKINESS INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF MODIFIED SIMILARITY INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF POIQE INDEX FOR THE VALIDATION EXPERIMENT

ACRONYM

ASCII        American Standard Code for Information Interchange
AC           Alternate Current
AVI          Audio Video Interleave
bmp or BMP   Bitmap
bps          Bits per second
BPS          Bytes per second
CCITT        Consulting Committee for International Telegraphs and Telephones
CD ROM       Compact Disk - read-only-memory
CIE          Commission International De l'Eclairage
CODEC        Compressor/Decompressor
cpi          cycles per inch
cpd          cycles per visual degree
CPS          cycles per second
DC           Direct Current
DCT          Discrete Cosine Transformation
DIP          Digital Image Processing
fps          Frames per second
GIF          Graphics Interchange Format
GOP          Group of Picture
HAV          High quality Audio/Video
HDTV         High-Definition Television
HVS          Human Visual (or Vision) System
ISO          International Organization for Standards
IEC          International Electrotechnical Commission
IEEE         Institute of Electrical and Electronic Engineers
ITU-R        International Telecommunications Union-Radio Sector
ITU-T        International Telecommunications Union-Telecommunication Sector
JPEG         Joint Photographic Experts Group
LZW          Lempel-Ziv-Welch compression
MAD          Mean Absolute Distortion
MSE          Mean Square Error
MPEG         Moving Picture Experts Group
NTSC         National Television System Committee
PAL          Phase Alternate Line
PCX          ZSoft PC Exchange image
ppi          pixel per inch
ppd          pixel per visual degree
PSTN         Public Switch Telephone Network
RLE          Run Length Encoding
SECAM        Sequentiel Couleur avec Memoire
SNR          Signal to Noise Ratio
TGA          Truevision Targa Image File Format
TIFF         Tagged Image File Format
VLE          Variable Length Encoding

Chapter 1 Introduction

1.1 The Need for a Psychovisually-based Objective Image

Quality Evaluator (POIQE)

1.1.1 The Growth of Digital Data Communication

Digital communication technology has grown aggressively in the past two decades. In

the telecommunication industry, the digital data transmission system is slowly taking

over the analog system, due to its robustness in controlling signal transmission.

However, digitally represented signals require far greater bandwidth and storage space

than traditional analog signals [HPN97].

As technology related to digital video transmission (such as HDTV, video conferencing, and

Internet communication) grows, so does the need for faster, higher quality digital

data transmission. In the United States, approximately one in every four households has

access to the Internet¹. This information superhighway is extremely busy. The

transmission of high quality digital video results in a massive amount of data clogging the

transmission media. However, the bitrate of the transmission media is limited. For

example, a PSTN modem has a transmission bitrate of up to 30 kbps. For applications

such as NetMeeting, the 'real-time' transmission of a 3 x 8-bit uncompressed color video

¹ According to the U.S. Department of Commerce Census Bureau (December 1998), 42.1% of American households owned computers, and 26.2% of all households had Internet access.

under the NTSC standard² will require a data rate of 168 Mbps. This problem motivates

the development of digital data compression technology.
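
The size of the gap between raw video and modem capacity can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the frame size and frame rate quoted in the NTSC footnote (480 pels by 480 lines at 29.97 frames per second) and 3 x 8 bits per pixel, and the function name is hypothetical. Under those assumptions the raw rate comes out near 166 Mbps, the same order as the 168 Mbps quoted above; the small difference presumably reflects details not stated in the text.

```python
def raw_video_bitrate_bps(width, height, bits_per_pixel, frames_per_second):
    """Bits per second needed to send uncompressed video."""
    return width * height * bits_per_pixel * frames_per_second

# Figures taken from the text and its NTSC footnote (assumed here to be
# the whole story): 480 x 480 samples, 3 colour channels of 8 bits each,
# 29.97 frames per second.
ntsc_bps = raw_video_bitrate_bps(480, 480, 3 * 8, 29.97)

# PSTN modem rate as quoted in the text.
modem_bps = 30_000

print(f"raw NTSC video: {ntsc_bps / 1e6:.1f} Mbps")
print(f"raw video rate / modem rate: {ntsc_bps / modem_bps:.0f}")
```

The second printed figure is the factor by which the data would have to shrink to fit through the modem in real time, which is why compression is unavoidable for this kind of application.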

1.1.2 Digital Data Compression and its Quality Criteria

Digital image data is just like a sponge, which is massive and full of spatial and temporal

redundancies. In order to reduce this massive volume of data into a size that allows

efficient data transmission, this porous sponge needs to be compressed before it is sent

down through the designated data line.

Digital data compression provides a significant reduction in data size, which substantially

benefits the data transmission and storage media industries. The International

Organization for Standardization (ISO) and the International Electrotechnical

Commission (IEC)³ developed an international standard for digital data compression

[GW92]. Some of the well-known data compression standards are JPEG (Joint

Photographic Experts Group) for digital still image compression and MPEG (Moving

Picture Experts Group) for digital video compression.

Digital data compression can be classified into two types: lossless and lossy. Lossless

compression allows reversible and error-free compression (i.e., the reconstructed image is

exactly the same as the original), whereas lossy compression is an irreversible process

in which non-essential details of the image are truncated. However, as compensation for

the loss in image quality, lossy compression gives a much higher compression ratio.

Lossy compression of a single-frame image is usually capable of achieving a compression

² Under the NTSC system, the image is refreshed at a rate of 29.97 frames per second with a video resolution of 480 pels by 480 lines [HPN97].

³ The Consulting Committee for International Telegraphs and Telephones (CCITT), the third major international standards organization, also participated in the development of the MPEG-2 and MPEG-4 standards. The CCITT was recently renamed the International Telecommunications Union-Telecommunication Sector (ITU-T) [HPFL96].

ratio⁴ of 25:1 or more with still acceptable quality, whereas lossless compression can

seldom achieve a compression ratio of more than 2.6:1⁵.
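
The distinction between reversible and irreversible compression can be illustrated with a toy example. The sketch below is not the JPEG pipeline: it assumes run-length encoding as the lossless step and plain uniform quantization (a stand-in for the DCT coefficient quantization used by JPEG) as the lossy step, and all names and sample values are hypothetical.

```python
def rle_encode(values):
    """Run-length encode a sequence into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def rle_decode(runs):
    """Expand [value, run_length] pairs back into the full sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out

def quantize(values, step):
    """Lossy step: snap each value to the nearest multiple of `step`."""
    return [step * round(v / step) for v in values]

# One row of made-up grey levels.
row = [12, 12, 13, 13, 13, 14, 200, 201, 201, 202]

# Lossless path: decoding returns exactly what was encoded.
lossless = rle_decode(rle_encode(row))

# Lossy path: quantizing first discards small differences, so fewer
# runs are needed, but the original values cannot be recovered.
coarse = quantize(row, 8)
lossy = rle_decode(rle_encode(coarse))

print(len(rle_encode(row)), "runs without quantization")
print(len(rle_encode(coarse)), "runs after quantization")
print("lossless round trip exact:", lossless == row)
print("lossy round trip exact:", lossy == row)
```

The trade-off described in the text shows up directly: the quantized row needs fewer runs to store, at the cost of no longer matching the original.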

In the existing video compression algorithms, the criteria used to evaluate the video

quality are mainly mean square error (MSE) and mean absolute difference (MAD). MSE

and MAD are very good criteria for showing the difference between two images from a numerical

point of view. But they fail to show the differences between the two images from a

human visual point of view. (An example showing that MSE is not a good image quality

indicator is provided in Chapter 3.)

1.1.3 Psychovisually-based Objective Image Quality Evaluator

(POIQE)
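
For reference, both criteria are simple per-pixel averages. The sketch below uses the standard textbook definitions on small nested lists; the 4 x 4 test images are made up for illustration and are not the example given in Chapter 3. It shows two very different error patterns, a uniform shift of every pixel and a single large local error, that produce the same MSE.

```python
def mse(a, b):
    """Mean square error between two equally sized greyscale images."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def mad(a, b):
    """Mean absolute difference between two equally sized images."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

# Hypothetical flat 4 x 4 image.
original = [[100] * 4 for _ in range(4)]

# Distortion A: every pixel is brightened by 2 grey levels.
shifted = [[p + 2 for p in row] for row in original]

# Distortion B: one pixel is off by 8 grey levels, the rest are exact.
spiked = [row[:] for row in original]
spiked[0][0] += 8

print("MSE shifted:", mse(original, shifted), "MAD shifted:", mad(original, shifted))
print("MSE spiked: ", mse(original, spiked), "MAD spiked: ", mad(original, spiked))
```

Because the measures only aggregate pixel differences, they carry no information about where the error sits or whether it forms a visible pattern, which is the limitation the text points to.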

Since the end-user of the compressed image data is a human being, it is important to

evaluate the quality based on human perception. The evaluation of the image quality can

be performed subjectively or objectively. (More details on the subjective and objective

fidelity criteria will be discussed in Chapter 3.)

Evaluating image quality by subjective means is infeasible in practice. The process is time

consuming, inconsistent and expensive. In addition, subjective measurement cannot be

implemented into a compression algorithm. In contrast, objective quality measurement

offers a much faster and more consistent evaluation method.

By developing a mathematical model to resemble the subjective judgement of a human,

we propose an objective evaluator, called the Psychovisually-based Objective Image Quality

Evaluator (POIQE), to weigh the importance of specific image details based on what is

perceivable to the human vision system. This objective evaluator can be implemented in

the compression algorithm as a quality indicator to facilitate discarding of image details

⁴ The compression ratio of the image is defined as the ratio of the compressed data size and the uncompressed data size.

⁵ Please refer to the analysis results in Appendix A, Table A-3.

that cannot be perceived by the human eyes. In other words, the quality of the image

will be maintained, whereas the compression ratio of the data will remain small.

1.1.4 Applications

The proposed image quality evaluator can be implemented into a compression algorithm

to enhance the compression process, or stand alone as an image quality index. It has a

wide variety of applications, which are required by data transmission (or storage) media

having limited bitrate (or storage space). Typical examples of such applications are:

Multimedia storage

Internet or teleconferencing

Quality evaluation for digital displaying device

Security surveillance system

Tele-robotics

Media broadcasting

1.2 Objective and Scope

1.2.1 Hypothesis

If the MSE is not a good indicator of image quality, what properties does the human

observer use to rate the image quality? When a human observer looks at an image, he

will sub-consciously look for some details that he has seen before, or some artifacts that

do not belong to the image. Then, he will identify these recognizable details and make

appropriate judgments based on their occurrences. In this thesis, it is hypothesized that

the human subjective rating is mainly based on the following:

patterned artifacts generated as a side effect of the compression algorithm

recognizable level of the details in the image

1.2.2 Objective

The objective of this thesis work is to design and validate a Psychovisually-based

mathematical algorithm for objective image quality measurement.

1.2.3 Scope

The thesis work consists of:

Presentation of background theories and related research on human visual perception

Design of experiments to measure the human evaluation index that will be correlated

with the objective evaluation criteria used in the mathematical model

Conducting these human subjective evaluation experiments with six different sets of

monochrome images generated by the JPEG compression software to obtain model

data.

Designing a Psychovisually-based Objective Image Quality Evaluator

Implementation of the mathematical model in Visual C/C++ using the data obtained

from the human subjective evaluation experiment

Validation of the proposed POIQE model

This thesis focuses on the demonstration that the proposed POIQE can resemble the

human subjective judging perception. This work does not include the implementation of

the evaluator into any image/video compression algorithm.

1.3 Contribution of this Thesis

The main contributions of this thesis are:

A mathematical model of digital image quality that is based on human psychovisual

properties.

An algorithm (POIQE) based on this model that simulates human subjective

evaluation of digital image quality

Experimental results that give psychovisual parameters used in the model

Coding of the JPEG compression software in Visual C/C++ for generating the test

images used in the experiment of human subjective evaluation

Experimental validation of the algorithm

An example demonstrating the ineffectiveness of using MSE as a quality indicator for

DCT-based compressed images

Experimental evidence showing that the human quality assessment is very subjective

and inconsistent. Also, the assessment result shows that the evaluation is more

consistent when the image quality is extremely good or extremely bad.

1.4 Related Work

Previous research related to this thesis work can be categorized into two types of studies:

Psychological characteristics of human visual perception

Quality measurement tools

The body of related work on objective image quality measurement tools is vast. But

most of the time, the quality measurement studies and the psychological studies are

closely tied together. They all have the same characteristic of investigating the symbolic

relationships between visual perception and mathematical models.

1.4.1 Related Research on Psychological Studies of the Human

Visual Perception

The biological studies on the human vision system can be traced back to Leonardo da

Vinci's Trattato della Pittura [Ste58] on stereo vision in the fifteenth century. This

biological research provided the backbone for the later research on the psychological

studies of vision.

Most of the psychological studies focused on the investigation of areas such as luminance,

spatial contrast sensitivity, temporal contrast sensitivity and spatio-temporal contrast

sensitivity. Also, many of these psychological studies are built on a biological

approach peL58] [CG66] [Owe72]. In addition, the work on spatio-temporal sensitivity

provided the background knowledge that led to the investigation of the motion analysis

of video [WA85] [Gir88].

1.4.1.1 Perception of Luminance and Brightness

Arend and Goldstein [AG87] and Adelson [Ade93] provided detailed investigations and

demonstrations of the characteristics of human perceptual organization and judgement of

brightness/lightness constancy. These studies focused on the perceived

brightness of a patch relative to the surrounding pattern's luminance. In many other

studies, the relativity of brightness is also investigated in terms of the contrast sensitivity

threshold. Both Weber's Law and the Commission International de l'Eclairage (CIE)

provide computational methods for calculating the brightness level that is just detectable,

known as the "contrast sensitivity threshold".
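
Weber's Law states that the just detectable luminance increment is approximately a constant fraction of the background luminance. A minimal sketch of that relation follows; the Weber fraction of 0.02 is an illustrative assumption rather than a value taken from this thesis, the function name is hypothetical, and the CIE method mentioned above is not covered.

```python
def weber_threshold(background_luminance, weber_fraction=0.02):
    """Just detectable luminance increment predicted by Weber's Law.

    Weber's Law: delta_I / I = k, so delta_I = k * I.
    The default k = 0.02 is an assumed, illustrative value.
    """
    return weber_fraction * background_luminance

# The predicted threshold grows in proportion to the background, so the
# same absolute change is harder to see against a brighter background.
for background in (10.0, 100.0, 1000.0):
    increment = weber_threshold(background)
    print(f"background {background:7.1f} -> threshold increment {increment:.2f}")
```

This proportionality is why brightness is described above as relative: detectability depends on the ratio of the change to the surrounding luminance, not on the absolute change.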

1.4.1.2 Spatial Contrast Sensitivity

The studies of contrast sensitivity are further extended into the investigation on how the

human vision system responds to changes in the spatial frequency of the stimulus. Van Nes

and Bouman's w 6 7 ] and Mullen's m 8 5 ] research provided detailed investigations of

the contrast sensitivity of sinusoidal monochrome and chromatic gratings at various

spatial frequencies and luminance levels.

In 1973, Vassilev m 7 3 ] conducted a study on contrast sensitivity of near borders (i.e.

edges).⁶ Although the research did not cover experiments with an extensive range of

temporal variation, the results showed that the duration of the stimulus is also a very

important factor to the contrast sensitivity.

1.4.1.3 Temporal Contrast Sensitivity

Well before Vassilev's work, the temporal effect on contrast sensitivity was investigated.

Studies of sinusoidal time-varying stimuli began as early as 1922 with H. E. Ives, but

attention to the temporal effect on contrast sensitivity only began in the 1950s with

de Lange [Kel61a]. In most experiments on temporal contrast sensitivity [Kel61a]

[Kel61b], the subject was presented with a flashing modulated light at a constant

temporal frequency. The luminance of the source was gradually increased until the subject

began to detect the contrast sensitivity threshold. Then, the experiment was repeated

with a different temporal frequency.

1.4.1.4 Spatio-Temporal Contrast Sensitivity

It is well known that spatial and temporal effects are inseparable properties of the

contrast sensitivity function. In Owen's research [Owe72], a series of experiments was

conducted to validate the interdependence of luminance (just detectable contrast), area

(spatial) and duration (temporal) of a visual stimulus. Also, many research efforts [Bar58]

[Rob66] pK98] focused on building a mathematical model to quantify the

spatio-temporal contrast sensitivity relation.

In 1956, Schade [Sch56] proposed the spatial contrast-sensitivity function to

mathematically model the viewer's judgement of the 'just detectable' threshold contrast. In

⁶ By demonstrating the foveal edge effect on stimuli of various sizes and shapes, Vassilev explained the contradictory results in the related literature.

1966, Robson [Rob66] modified Schade's function by adding the temporal effect on the

human contrast sensitivity, and proposed the spatial-temporal contrast sensitivity

function.

In 1998, van den Branden Lambrecht and Kunt mK98] proposed a mathematical

representation to model the human spatio-temporal contrast sensitivity characteristic.

Their work focused on characterizing the human visual perception of the coding artifact

(a "band-pass" filtered white noise) by perceptual channels.

1.4.1.5 Motion Perception

Some spatio-temporal investigations have led to the studies of motion or velocity

perception of the HVS [AB85] PA851. In 1988, Girod [Gir88] explored the motion

perception by tracking of the eye movements of the human observer. He extended his

researched into the relevance of using human eye movement in video sequence encoding

for better data compression.

Drawing on this body of research, many books [Cor70] pD88]

w 9 1 ] [Eva48] related to vision perception studies have been published. Wandell's

book, titled Foundations of Vision [Wan95], is a comprehensive resource on the human

vision system. The book explains many human visual characteristics and their

psychological effects from a biological and anatomical point of view. It also covers

spatial and temporal sensitivity and motion perception in substantial depth.

1.4.2 Related Research on Quality Measurement Tools

The psychological study of human visual perception began as early as the

mid-nineteenth century. However, research on the development of

psychovisually-based video/image quality measurement tools is a fairly new topic. The rise of interest in

this area is primarily due to the dramatic growth of digital video applications in the past

ten years. All of this research has focused on the development of objective image or

video evaluation tools whose results resemble subjective human quality assessment

results. In general, the tools can be classified into image quality measurement tools for still

images and for image sequences (i.e. video).

1.4.2.1 Still Image

For still image quality evaluation tools, the evaluation criteria used are typically based on

mean square differences of pixel intensity and edge information between reconstructed

and original images. Also, the techniques used mainly focus on image properties

such as contrast, luminance and spatial frequency.

The work of Heeger and Teo is one of the few research efforts that focused on the development of a

perceptual image fidelity model for still images [TH94] m95]. The model is based on

the measurement of the difference in contrast and luminance sensitivity between the

original and the reconstructed image.

A methodology for determining objective quality metrics used in image coding is

presented by Miyahara, Kotani and Algazi [MKA98] and Horita, Katayama, Murai and

Miyahara [HKMM96]. The method was used to obtain a picture quality scale (PQS) for

coding achromatic images over the range of image quality defined by the subjective mean

opinion score (MOS). This PQS considers the properties of visual perception for global

features and localized disturbances. It was found to closely approximate the MOS,

except at the low end of the image quality range.

1.4.2.2 Image Sequence (Video)

For image sequence quality evaluation tools, the evaluation criteria used are usually

divided into two parts: intra-frame evaluation and inter-frame evaluation. Intra-frame

evaluation for video is the same as the evaluation used for still images. This

evaluation technique is based mainly on the spatial properties of the intra-frame, whereas

the inter-frame evaluation focuses on the human assessment of quality deterioration in

the presence of motion (i.e. the temporal properties of the inter-frame) [Bra96] [NB97]

[BFLSV97]. Many of these research efforts started as image quality evaluation tools

[Lub95], and eventually added a temporal component to the algorithm for

video quality evaluation [Lub97].

Focusing on the masking effect and the spatial-temporal frequency characteristics of the

HVS, Okamoto, Hangai and Miyauchi [OHM96] proposed an objective video quality

evaluation index called the 3-D SNR (three-dimensional signal-to-noise ratio). Their

experiment shows that the 3-D SNR results provide a much smoother and closer match to

the human subjective ratings than the SNR results.

Tan, Ghanbari, Gardiner and Pearson [GGPT97] [TGP97] [TGP98] proposed an

objective measurement tool for assessing MPEG video applications. The work included

two parts: distortion weighting and a cognitive emulator. The first part targeted the

quality evaluation of still frames. It measured the distortion weighted by the distance of

the pels from any nearby extreme contrast change (i.e. edge). The weighting of this part

is based on the human perceptual effect called 'activity masking'. (More details on activity

masking will be provided in Chapter 2.) The second part focused on the quality

evaluation of a sequence of continuous frames (i.e. video) by incorporating the human

decision-making processes of smoothing, saturation, asymmetry and delay.

1.5 Terminology

Before proceeding any further, it is essential to clarify some of the terms commonly used

in this document. The definitions of these terms are summarized and put together based

on their conventional usage and definition in the literature.

As defined in Gonzalez and Woods [GW92], luminance, measured in lumens, measures

the amount of energy an observer perceives from a light source⁷. But the incident energy

is weighted according to the spectral sensitivity of the eyes [MPFL96]. In other words,

the amount of energy perceived by the human eyes varies as a function of the sue of the

source. In contrast, chrominance measures the hue and saturation of color. Hue is

defined as an attribute associated with the dominant wavelength of the color, and

saturation is defined as the degree to which the hue is diluted by white light [GW92].

For example, if the color of the detail is pastel purple, the hue of the detail will be "purple"

and the saturation is the amount of white added to that hue (i.e. the pastel tone of the

detail).

In this document, the word 'quanta' is used to quantitatively represent luminance. The

word 'intensity'⁸ implies the luminance of the object. It indicates the physical luminance

only and has nothing to do with human interpretation, unless it is used together

with adjectives such as 'apparent' or 'perceived', which indicate that the quantity is

information resulting from human interpretation.

According to Evans [Eva48], brightness is the apparent luminance of a patch in an

image, without reference to its surroundings. Brightness perception is how this

luminance is mentally viewed and understood by the viewer. Another very similar term

is lightness. It is defined as the apparent reflectance of a perceived surface relative to

other patches in the same scene [Eva48]. In many books, the terms brightness

and lightness refer to different things: the latter is affected by the surroundings and

the former is not.

Also for image compression, a reconstructed image is one that has been modified or

processed. In this document, the compressed image is sometimes referred to as the

reconstructed image. The term blockiness⁹ refers to the patterned square artifact created

by lossy DCT-based compression such as JPEG and MPEG. A more detailed

explanation and example for blockiness will be provided in Chapter 4.

7 In this case, light source is referred to as the object or detail in the image.

8 In some literature, it is also known as "shades". In some cases, shade is also used to describe color information (e.g. shades of color).

9 In some literature, "blockiness" is also known as blocking artifact.

1.6 Organization of this Thesis

In order to design an evaluator that can simulate the psychovisual behavior of the human

visual system (HVS), it is important to understand some fundamentals of the HVS. In

Chapter 2, some fundamental background research on the HVS and human visual

perception is provided. Not all information provided in this chapter is explicitly related

to the focus of this thesis research. However, the materials in this chapter provide

background that leads to an understanding of the HVS and of the rationale behind

the proposed thesis. In the later part of the chapter, a brief introduction to different image

formats and compression techniques is given. Readers who are familiar with the

psychovisual behavior of the HVS and image compression techniques can skip this

chapter.

Chapter 3 provides detailed descriptions and definitions of subjective and objective

fidelity assessments and criteria. This chapter also gives examples on the effects of

structured artifacts and the failure of using MSE as an objective fidelity criterion.

In Chapter 4, the proposed mathematical model of the POIQE is explained. Chapter 5

provides a detailed description on the method and procedure of the model tuning and

validation experiments. The model tuning experiment sets the model parameters, and

the validation experiment further justifies the robustness of the model in evaluating the

quality of various image sets. Finally, the thesis is summarized and concluded in

Chapter 6.

This document does not provide descriptions of the JPEG and MPEG compression

algorithms. Readers who would like more information in those areas may refer to

[GW92] [MPFL96] [HPN97] for more details.

Chapter 2 Background

For image compression, it is inefficient to store and transmit information that cannot be

sensed by the human eye. Therefore, it is important to understand the fundamental

characteristics of the human visual system, in order to identify the non-perceivable details

and remove the perceptual redundancy. In this chapter, some fundamental details of

optics and the human visual system are described.

2.1 The Physics and Fundamentals of Vision

2.1.1 The Nature and Physics of Optics

The human body is surrounded by an environment that is filled with electromagnetic

radiation¹⁰. According to its wavelength or frequency, electromagnetic radiation

is recognized in various forms such as light, radio waves, ultraviolet, infrared, etc.

Among all these forms, light, with wavelengths ranging from 400 to 700 nm¹¹ [SZY87], is

the only form of electromagnetic radiation that can be sensed by humans. That is

why it is also known as 'visible radiation'.

A light ray can be understood as a set of electromagnetic radiation of various

wavelengths. It travels in a straight path and strikes any object in that path. According

to the color and reflectivity of its surface, the object reflects the light

of the wavelengths that match its shades of color and absorbs the rest of the light ray.

10 Electromagnetic radiation consists of time-varying electric and magnetic fields and travels or propagates through space, without an artificial guide or medium, at a definite speed (c = 3.00 x 10^8 m/s). [SZY87]

The reflections of the objects create a scene of different colors and intensity levels in the

3-dimensional space. As the reflections of the objects change in time, time becomes the

fourth dimension of scene.

2.1.2 The Human Visual Perception

The study of human visual perception is a study of how this 4-dimensional scene is

transformed into psychological interpretation of space and time by humans. The human

visual perception is very complicated. In general, visual perception can be understood as

a psychological interpretation of an image acquired by the vision system.

In order to understand the psychovisual behavior of the human vision system, it is

important to understand the fundamentals of how it works.

2.1.2.1 Human Vision System (HVS) Fundamentals

The vision system is one of the most magnificent systems in the human body. This

system includes the left and right eyes, a series of neural pathways, and the brain. Its

sophisticated design allows it to identify image details at thousands of color shades and

intensity levels. The binocular visual field is roughly 200 degrees by 135 degrees with

respect to the visual axis [Wan95].

Figure 2-1 shows a cross-section of the human eye. To acquire images from a scene,

light is reflected by the object and enters the eye. The light is then focused by the

cornea (fixed-focus) and the lens (variable-focus)¹², and projected onto the retina, which

is lined with photoreceptors. These photoreceptors are light-sensitive cells that can

absorb light. Stimulated by the light, the photoreceptors create a pattern of signals. This

pattern of optical signals is then transmitted by a system of neural pathways¹³ running

from the eye to the brain at a rate of 1000 impulses per second.

11 For standardization purposes, the Commission Internationale de l'Eclairage (CIE) designated in 1931 the following specific wavelengths for visible light: blue = 435.8 nm, green = 546.1 nm and red = 700 nm. [GW92]

12 The combined optical power, measured as the reciprocal of the focal length, of the cornea and the lens is 58.8 diopters. To focus on a nearby object, the muscle connected to the eye changes the shape of the lens to increase its optical power. This process is known as accommodation. [Wan95]

Figure 2-1 Anatomy of the human eye¹⁴

The retina of the human eye contains two types of photoreceptors: rods and cones. There are

6 to 7 million cone receptors and 75 to 150 million rod receptors in each eye

[GW92].

Located densely at the fovea¹⁵, the cones are responsible for photopic vision, which means

that the cones are stimulated only by scenes with high illumination. There are three types

of cone photoreceptors: red cone (long wavelength), green cone (medium wavelength),

and blue cone (short wavelength)¹⁶. As described by Rogowitz [Rog92], these cones can

be thought of as broadband filters for three ranges of wavelength. According to the

trichromatic theory¹⁷, mixing the color signals detected by these three cones allows

the human eye to discern different colors and shades.

The rod receptors are distributed radially symmetrically about the fovea over the retina,

except at the blind spot¹⁸. This region is also known as the periphery region. Rods are

mainly responsible for scotopic vision, which is sensitive to low-illumination conditions.

In comparison with the cones, the rods do not respond to color details, owing to their scotopic

characteristic.

13 The neural pathway consists of several layers of retinal neurons, and the output fibers make up the optic nerve that leads to the primary visual cortex, also known as area V1, of the brain. [Wan95]

14 The graphic of this figure was taken from a website (artist unknown).

15 Fovea, also known as macula, is a spot where the visual axis intersects with the retina.

16 The red, green and blue cones are also known as L- (for long wavelength), M- (for middle wavelength) and S- (for short wavelength) cones respectively in some biological literature.

17 The trichromatic theory states that any color can be regenerated by combining red, green and blue [Cor70].

18 Blind spot, also known as optic disk, is where the optic nerves are gathered and connected with the eye.

Figure 2-2 Rod and cone spatial patterns

Figure 2-2 provides a pattern¹⁹ of the rod and cone distribution on the retina. Each cone

receptor is connected to a single nerve as one unit. These cone units create a 4 x 6 array

of photosensitive cells. In comparison, a number of rod receptors are gathered and

connected to one nerve as one unit. Figure 2-2 shows four rod units (labeled as Rod #1,

Rod #2, Rod #3 and Rod #4), which create a 2 x 2 array of photosensitive cells. Since the

cone array is much finer than the rod array, the cone is considered the receptor that is

capable of resolving finer details.

Since each rod unit consists of a number of rod receptors, the rod unit is capable of

capturing more quanta. In other words, the rod unit is more sensitive to light than the

cone unit. This explains the scotopic characteristic of the rod receptors.

Moreover, the periphery region is more temporally sensitive [Pir67]. In other words,

rods are more sensitive to stimuli that vary in time (e.g. a flashing source).

This phenomenon can be understood if we regard a flashing source as a source with

sufficient luminance but low quanta or energy due to its flashing behavior. Since the rods

in the periphery region are able to capture more quanta, they are more capable of

detecting a flashing source.

19 Please note that the actual distribution pattern is more irregular than the pattern shown in Figure 2-2. The cone population in the fovea region is much denser than that in the periphery region, and vice versa for the rod population. (Please refer to [Wan95 Fig 3.4] for the actual spatial mosaic.)

2.1.2.2 Visual Angle

Visual angle θ is defined as a one-dimensional angular measurement of the dimension of

a detail based on the horizontal distance between the object and the eye. In Figure 2-3,

the visual angle can be represented as

$$\theta = \tan^{-1}\left(\frac{h_1}{x}\right) + \tan^{-1}\left(\frac{h_2}{x}\right) \qquad (2\text{-}1)$$

where θ = Visual Angle

h1 and h2 = Height of the object above and below the visual axis²⁰

respectively

x = Horizontal distance between the object and the eye

Figure 2-3 Visual Angle

But for simplicity, equation (2-1) is often represented as

where h = Total height of the object = h1 + h2

20 The term "visual axis" used in this section is different from the "visual axis" described in Figure 2-1 Anatomy of the human eye.
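The visual angle definition above can be illustrated numerically. The following is a minimal sketch in Python, assuming the angle is the sum of the two arctangent terms implied by the definitions of h1, h2 and x; the function name `visual_angle_deg` and the example dimensions are illustrative only and do not come from the thesis.

```python
import math

def visual_angle_deg(h1, h2, x):
    """Visual angle (in degrees) of an object that extends h1 above and
    h2 below the visual axis, viewed from horizontal distance x.
    Assumed form: theta = atan(h1 / x) + atan(h2 / x)."""
    if x <= 0:
        raise ValueError("viewing distance x must be positive")
    return math.degrees(math.atan(h1 / x) + math.atan(h2 / x))

# Hypothetical example: a 0.3 m tall detail centred on the visual axis,
# viewed from a distance of 2 m.
angle = visual_angle_deg(0.15, 0.15, 2.0)
print(f"visual angle = {angle:.2f} degrees")
```

For an object centred on the visual axis (h1 = h2 = h/2), the same value is obtained from 2 atan(h / 2x), which is one common way of expressing the angle in terms of the total height h.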

2.1.3 HVS as a Shift-Invariant Linear System

The image formation of the HVS includes a series of optical and neural transformations.

The input is the object image and the output is the retinal image formed. This image

formation process is represented as a linear²¹ shift-invariant transformation [Wan95].

Two important properties of a linear transformation are:

Homogeneity: if $r = T(i)$,

then $T(a i) = a T(i) = a r$

Superposition: if $r_1 = T(i_1)$ and $r_2 = T(i_2)$,

then $r_1 + r_2 = T(i_1 + i_2)$

where i is the input, r is the output, T is the transformation function and a is

an arbitrary constant

Also, the system's shift-invariant characteristic indicates that it is spatially homogeneous.

When a system is spatially homogeneous, it means that the transformation result is true for

all locations and directions (isotropy) in space (i.e. r = T(i) for all locations and directions

in space). Two main properties of a shift-invariant system are:

1) Due to its spatially homogeneous properties, the system transformation matrix

can be defined from one single stimulus.

2) The response to a harmonic function (such as the sinusoids and cosinusoids from

discrete cosine and discrete Fourier transformations) at frequency f is also a

harmonic function of the same frequency.

21 Although the actual image formation process of the vision system is a non-linear transformation, many analytical approaches assume that it is linear. This is mainly due to the simplicity of linear analysis, such as the Fourier techniques. This linear analysis still gives correct results when it is applied to the linear portion of the non-linear system.

Figure 2-4 Image Transformations (the inputs $i_1$, $i_2$ and $i_1 + i_2$ pass through the optical and neural transformations to give the outputs $r_1 = T(i_1)$, $r_2 = T(i_2)$ and $r_1 + r_2 = T(i_1 + i_2)$)
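The homogeneity, superposition and shift-invariance properties described in this section can be checked numerically. The following is a minimal sketch in Python that uses a one-dimensional circular convolution as a stand-in for the transformation T. The kernel, the test signals and the function names are hypothetical and are not a model of the actual optics of the eye.

```python
def transform(signal, kernel):
    """Circular convolution: a simple linear, shift-invariant transformation T."""
    n = len(signal)
    return [sum(kernel[k] * signal[(j - k) % n] for k in range(len(kernel)))
            for j in range(n)]

def shift(signal, s):
    """Circularly shift a signal by s samples."""
    n = len(signal)
    return [signal[(j - s) % n] for j in range(n)]

kernel = [1, 2, 1]           # hypothetical blur kernel
i1 = [0, 0, 4, 0, 0, 0]      # input 1 (a single point stimulus)
i2 = [1, 3, 0, 0, 2, 0]      # input 2
a = 5                        # arbitrary constant

r1 = transform(i1, kernel)
r2 = transform(i2, kernel)

# Homogeneity: T(a * i) = a * T(i)
homogeneity = transform([a * v for v in i1], kernel) == [a * v for v in r1]

# Superposition: T(i1 + i2) = T(i1) + T(i2)
summed_input = [u + v for u, v in zip(i1, i2)]
superposition = transform(summed_input, kernel) == [u + v for u, v in zip(r1, r2)]

# Shift invariance: shifting the input shifts the output by the same amount
shift_invariance = transform(shift(i1, 2), kernel) == shift(r1, 2)

print(homogeneity, superposition, shift_invariance)
```

Integer values are used so that the equality checks are exact. The response r1 to the single point stimulus i1 is a scaled copy of the kernel, which illustrates why such a system can be characterized from one single stimulus.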

2.1.4 Brightness Perception

As mentioned, light reflected by the object in a 3-D scene enters into the eye and

stimulates the receptors to create an optical pattern. Each eye perceives the optical

pattern as a 2-D image consisting of patches with different brightness. According to

Weber's Law [MPFL96], the contrast between two adjacent luminances Y and Y + ΔY is

'just detectable' when

Similarly, a 'just detectable' contrast calculation is also suggested by the Commission

Internationale de l'Eclairage (CIE). The perception of brightness, also known as lightness

(L*), for a specific luminance Y is shown in equation (2-6). The just detectable lightness

difference occurs when ΔL* = 1.

$$L^* = 116\left(\frac{Y}{Y_n}\right)^{1/3} - 16 \qquad (2\text{-}6)$$

where $Y_n$ is the luminance of white

Both just detectable contrast calculations by Weber's Law and the CIE are very similar.

Figure 2-5 shows a plot of ΔY, also known as the "contrast threshold", versus Y for

both calculations.

Figure 2-5 Just Detectable Contrast Threshold
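The two threshold calculations can be compared with a short numerical sketch. The Python code below assumes the standard CIE 1976 cube-root form of L* (valid for Y/Yn above about 0.008856) and estimates the luminance step that changes L* by one unit from the slope of that curve. The Weber fraction of 0.02 is an assumed illustrative value, not a figure taken from the thesis, and all function names are hypothetical.

```python
def cie_lightness(Y, Yn):
    """CIE 1976 lightness L* (cube-root branch, valid for Y/Yn > 0.008856)."""
    return 116.0 * (Y / Yn) ** (1.0 / 3.0) - 16.0

def cie_threshold(Y, Yn):
    """First-order estimate of the luminance step dY that changes L* by 1."""
    slope = (116.0 / 3.0) * (Y / Yn) ** (-2.0 / 3.0) / Yn   # dL*/dY
    return 1.0 / slope

def weber_threshold(Y, weber_fraction=0.02):
    """Weber's Law threshold dY = k * Y; k = 0.02 is an assumed value."""
    return weber_fraction * Y

Yn = 100.0  # luminance of white, in arbitrary units
for Y in (5.0, 20.0, 50.0, 100.0):
    print(f"Y = {Y:6.1f}  CIE dY = {cie_threshold(Y, Yn):.3f}  "
          f"Weber dY = {weber_threshold(Y):.3f}")
```

Under these assumptions both thresholds grow with Y, but the CIE threshold grows as the two-thirds power of Y while the Weber threshold grows linearly, so the two curves are close only over a limited luminance range.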

2.1.4.1 Contrast Sensitivity

Besides the calculations provided by Weber's Law and the CIE, there are also other

studies [DeL58] [Bar58] [Sch56] [Kel61a] [Kel61b] [Rob66] w 6 7 ] [Owe72]

[Mul85] [AB85] [BK98] on contrast sensitivity of the HVS based on the effect of spatial

and temporal variation. In most studies of the modulation transfer function (MTF) of the

vision system, a sine-wave grating is used as the stimulus. The sine-wave grating is a

vertical stripe pattern with the intensity distribution shown in Figure 2-6, with the

definitions and terminology as follows.

Figure 2-6 Sine-Wave Grating (intensity profile with the modulation amplitude marked)

Contrast, also known as the "percentage of modulation" or just the "modulation", is

defined w 6 7 ] as the modulation amplitude of a sinusoid variation at the just detectable

threshold, divided by average luminance.

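$$\text{Contrast or \% of Modulation} = \frac{\text{Modulation Amplitude}}{\text{Average Luminance}} = \frac{\frac{1}{2}(Y_{\max} - Y_{\min})}{\frac{1}{2}(Y_{\max} + Y_{\min})} = \frac{Y_{\max} - Y_{\min}}{Y_{\max} + Y_{\min}} \qquad (2\text{-}7)$$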

Contrast sensitivity is defined w 8 5 ] as the inverse of the modulation amplitude at just

detectable threshold.

$$\text{Contrast Sensitivity} = \frac{1}{\text{Modulation Amplitude}} = \frac{1}{\frac{1}{2}(Y_{\max} - Y_{\min})} \qquad (2\text{-}8)$$

Relative contrast sensitivity is defined [Mul85] as the inverse of the percentage of

modulation at the just detectable threshold.

$$\text{Relative Contrast Sensitivity} = \frac{1}{\text{\% of Modulation}} = \frac{Y_{\max} + Y_{\min}}{Y_{\max} - Y_{\min}} \qquad (2\text{-}9)$$

These definitions of contrast are extensively used in the studies of human visual

sensitivity to brightness.
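The three definitions above can be computed directly from the maximum and minimum luminance of a grating at the just detectable threshold. The following is a minimal sketch in Python; the function names and the example luminance values are illustrative only.

```python
def modulation_amplitude(y_max, y_min):
    """Half of the peak-to-trough luminance difference of the grating."""
    return 0.5 * (y_max - y_min)

def contrast(y_max, y_min):
    """Contrast (modulation): modulation amplitude over average luminance."""
    return (y_max - y_min) / (y_max + y_min)

def contrast_sensitivity(y_max, y_min):
    """Inverse of the modulation amplitude at the detection threshold."""
    return 1.0 / modulation_amplitude(y_max, y_min)

def relative_contrast_sensitivity(y_max, y_min):
    """Inverse of the contrast (modulation) at the detection threshold."""
    return 1.0 / contrast(y_max, y_min)

# Hypothetical grating with luminance between 50 and 150 (arbitrary units).
y_max, y_min = 150.0, 50.0
print(contrast(y_max, y_min))                       # 0.5
print(contrast_sensitivity(y_max, y_min))           # 0.02
print(relative_contrast_sensitivity(y_max, y_min))  # 2.0
```

Note that contrast sensitivity depends on the luminance units, whereas relative contrast sensitivity is dimensionless because the average luminance cancels out.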

2.1.4.2 Simultaneous Contrast

The lightness of a particular region does not simply depend on the luminance of that

region. It is often affected by the luminance of the surroundings. The HVS encodes

information on a relative basis. This phenomenon is known as "simultaneous contrast"

[Cor70]²². Adelson [Ade93] provided a very good illustration, shown in Figure 2-7, to

demonstrate this effect. The diamonds in the illustration have the same physical

reflectance, but it is experimentally proven that they have a brightness of approximately

3 5% perceived difference.

22 Cornsweet's book ([Cor70] p. 272-7) provides a substantial number of examples illustrating simultaneous contrast.

Figure 2-7 Adelson's Diamond Shaped Patterns

2.1.4.3 Activity Masking

Besides simultaneous contrast, another very common phenomenon of human

perception is activity masking²³. Vassilev [Vas73] provides a very brief

literature review on activity masking. Activity masking shows that the contrast threshold in

a region that is closer to edge²⁴ detail is substantially higher. In other words,

the HVS is less sensitive to small variations of intensity in areas near edge details.

23 This phenomenon is also known as "edge effect".

24 Edge is defined as detail with a rapid intensity gradient.

2.1.5 Color Perception²⁵

2.1.5.1 Contrast Sensitivity for Luminance and Chrominance Variations

The HVS is more sensitive to variation of luminance information than to that of

chrominance information. Mitchell, Pennebaker, Fogg and LeGall's book [MPFL96]

provides further evidence of this phenomenon by combining Ness and Bouman's [NB67]

results on contrast sensitivity for gray luminance stimuli with Mullen's [Mul85] findings

on contrast sensitivity for chrominance stimuli, as shown in Figure 2-8. The plot indicates

that, for the same spatial frequency, the HVS has much higher contrast sensitivity for

luminance than for chrominance. In other words, the HVS can detect finer spatial

variations in luminance than in color.

To demonstrate this phenomenon, Rogowitz [Rog92] used yellow text on a white

background and dark blue text on a black background as examples. It is very difficult

to see the text in those two examples, but we can still see some yellowish and dark blue

blurs. This is because the luminance variations are too small, even though there are

significant chrominance variations.

25 This thesis work covers evaluation based on monochrome images only. This section is written to make the documentation of the human vision system more complete.

Figure 2-8 Contrast Sensitivities for Luminance and Chrominance [MPFL96]

Therefore, MPEG compression employs the CCIR-601 digital video format, which

allows chrominance sampling at much lower spatial frequencies than the luminance

sampling. The CCIR-601 format represents the color space with one luminance signal Y

and two chrominance signals Cr and Cb. (MPEG is also adaptable to other color space sets.

Appendix B provides more detailed descriptions and conversions between each color space

and RGB.) The available video formats are 4:2:0, 4:2:2 and 4:4:4, as shown in Figure C-1.
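The saving from lower chrominance sampling can be made concrete by counting samples per frame. The following Python sketch assumes the conventional meaning of the three formats: in 4:4:4 the chrominance signals are sampled at full resolution, in 4:2:2 they are halved horizontally, and in 4:2:0 they are halved both horizontally and vertically. The frame size and the names used are illustrative only.

```python
# (horizontal divisor, vertical divisor) applied to each chrominance signal
CHROMA_DIVISORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def samples_per_frame(width, height, fmt):
    """Total number of Y, Cr and Cb samples in one frame."""
    h_div, v_div = CHROMA_DIVISORS[fmt]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)   # Cr plus Cb
    return luma + chroma

# Hypothetical 720 x 480 frame.
for fmt in ("4:4:4", "4:2:2", "4:2:0"):
    print(fmt, samples_per_frame(720, 480, fmt))
```

Under these assumptions a 4:2:0 frame carries half as many samples as a 4:4:4 frame, before any further compression is applied.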

2.1.5.2 Sensitivity for Red, Blue and Green Cones

As mentioned earlier, there are three types of cone receptors for photopic vision: red for

long wavelength, green for medium wavelength and blue for short wavelength. The peak

sensitivity varies among these three cone types. The green-absorbing cones are

approximately 5% more sensitive than the red-absorbing cones, and both the

green-absorbing and red-absorbing cones are about 2900% more sensitive than the

blue-absorbing cones [MPFL96].

2.1.6 Dark Adaptation and Motion Perception

As mentioned, the HVS is sensitive to thousands of color shades and intensities.

However, at any one instant, the HVS can only sense a small range of intensities, and

the neural system gradually adjusts this dynamic range to match the ambient light.

An example given by Rogowitz [Rog92] demonstrates this process as follows.

When you enter a movie theater, at first everything is dark with minimal fine details.

Then, the HVS gradually re-adjusts the dynamic range of intensity so that the image

becomes more apparent. This phenomenon is known as 'dark adaptation' [Cor70]. The

dark adaptation process indicates that the HVS is very slow in reacting to dramatic

intensity change.²⁶

This phenomenon explains a motion perception effect known as forward masking²⁷. According

to Mitchell, Pennebaker, Fogg and LeGall [MPFL96], forward masking occurs when

there is a sudden scene change. This scene change can occur globally for the entire

image or locally for just a region of the image. At the instant of the scene change, the

eye cannot immediately re-adjust itself to all of the changes in intensity. As the scene is

perceived by the eye for a period of time, the HVS starts to pick up the details of

the scene. In other words, very fine details will not be visible immediately after a

scene change. If the details are available to the viewer for a certain duration of time, the

viewer will be able to slowly perceive all the details. However, if the duration between

successive scene changes is too short, the fine details in the scene will not be perceivable.

In other words, the fine detail is masked out. However, the HVS can still

perceive information such as global intensity, overall contrast, and motion information.

26 Compared with the standard 29.97 frames per second refresh rate for display, the dark adaptation process is very slow. (In North America and Japan, 29.97 frames/s is the NTSC TV broadcasting standard frame rate. It is also called the "real-time" video transmission rate.)

27 It is also known as "temporal masking".

2.1.7 Sequential Perception

The human visual system has a continuous response, due to its ability to persist an image

for a short duration [HPN97] m 5 ]. As a result, the video display media only need to

regenerate the image at a moderate frequency in order to produce a continuous effect. If

the display is refreshed at a lower rate, display flicker will be produced and the image

will appear shifted and discontinuous.

Studies [DeL58] [Kel61b] have been carried out on the perception of

flicker at various luminances, which led to the term critical flicker frequency (CFF).

Critical flicker frequency is the minimum frequency at which the display stimuli must be

regenerated in order to create a 'continuous' effect for the HVS.

Rogowitz's experimental observation [Rog83] in Figure 2-9 shows that the CFF increases

as the luminance of the stimuli increases. In other words, the duration of vision

persistence decreases as the intensity of the scene increases. A scene with brighter details

will need to be refreshed more frequently than one with darker details in order to

maintain the same continuous effect. Also, Rogowitz's result shows that the CFF increases

as the size of the stimuli increases.²⁸

The standard analog color TV system used in North America is NTSC. Under the

NTSC system, the image is refreshed at a rate of 29.97 frames per second with a video

resolution of 480 pels by 480 lines. In commercial video display, 30 frames per second is

considered 'real-time' performance.

28 In the plot, the size of the object is measured in units of visual angle.

Figure 2-9 Critical Flicker Frequencies [Rog83]

2.2 Digital Image Compression

2.2.1 Data Redundancy

The purpose of image compression is to remove redundancies that take up valuable

storage space and transmission time. These data redundancies make no contribution to the

quality of the image and carry no new information. In general, they can be classified as

spatial, coding, psychovisual and temporal redundancies, as shown in Table 2-1.

Table 2-1 Digital Data Redundancies

Redundancies    Descriptions

Spatial²⁹       Information redundancy (or repetition) between pixels within the same image frame.

Coding          Information redundancy (or repetition) within the series of code that represents the image.

Psychovisual    Information that represents detail which is not perceivable by the HVS.

Temporal        Information redundancy (or repetition) of the same pixel between successive frames.

29 Spatial redundancy is also known as inter-pixel redundancy or geometric redundancy.

2.2.2 Fundamentals of Image Compression

In general, an image can be stored in a compressed or uncompressed format. For an

uncompressed image, the data size (or file size) is approximately equal to the product of

height, width and color depth, plus image header.

uncompressed data size = height x width x color depth + image header

where height = vertical resolution (also known as rows), in pixels

width = horizontal resolution (also known as columns), in pixels

color depth = data size used to represent each pixel, in units of bits

per pixel or bytes per pixel

image header = overhead data used to store information about the

image (e.g. resolution, color depth, remarks, etc.),

in units of bits or bytes

For digitally compressed format, an image can be classified into two broad categories:

lossless and lossy, as shown in Figure 2-10. Both categories involve compressing and

encoding processes, which focus on the removal of the spatial, coding, and psychovisual

redundancies.

Figure 2-10 Image Format

The compressed data size is the total data size after compression and encoding, plus

image header. The compression ratio of the image is defined as the ratio of the

uncompressed data size to the compressed data size.

$$\text{compression ratio} = \frac{\text{uncompressed data size}}{\text{compressed data size}}$$
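The data size and compression ratio definitions above can be sketched as follows. This Python example works in bits and uses hypothetical image dimensions and a hypothetical compressed size; the function names are illustrative only.

```python
def uncompressed_size_bits(height, width, color_depth_bits, header_bits=0):
    """height x width x color depth + image header, all in bits."""
    return height * width * color_depth_bits + header_bits

def compression_ratio(uncompressed_size, compressed_size):
    """Ratio of the uncompressed data size to the compressed data size."""
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return uncompressed_size / compressed_size

# Hypothetical 512 x 512 monochrome image at 8 bits per pixel, header ignored.
raw_bits = uncompressed_size_bits(512, 512, 8)
print(raw_bits)                               # 2097152
print(compression_ratio(raw_bits, 262144))    # 8.0
```

A ratio greater than 1 means that the compressed representation is smaller than the uncompressed one; a ratio of 8.0 is usually written as 8:1.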

2.2.3 Lossless Compression/Encoding

Lossless compression allows reversible and error-free compression. There is no loss of

image quality and all image data are preserved. The process does not include any

quantization. Therefore, the reconstructed image is exactly the same as the original

image. In terms of removal of redundancy, lossless compression/encoding techniques

mainly target the reduction of spatial and coding redundancies. The reduction of data size is

achieved only by the compression/encoding technique itself. Typical examples of

compression/encoding techniques³⁰ include:

Lempel Ziv Welch compression, LZW

Variable Length Coding, VLC (e.g. modified Huffman encoding, Binary Shift code,

arithmetic coding, etc.)

Bit-Plane Coding

Run Length Encoding, RLE

Predictive Coding
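As an illustration of the reversibility described above, the following minimal Python sketch implements a simple form of run-length encoding, one of the listed techniques. It is a simplified illustration rather than the exact scheme used by any particular file format, and the function names are hypothetical.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into (value, run length) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]

def rle_decode(runs):
    """Expand (value, run length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]

row = [0, 0, 0, 255, 255, 0]          # hypothetical row of pixel intensities
encoded = rle_encode(row)
print(encoded)                        # [(0, 3), (255, 2), (0, 1)]
print(rle_decode(encoded) == row)     # True
```

Decoding recovers the input exactly, so no image quality is lost. The achievable size reduction depends entirely on how many long runs the image contains, which is why the compression ratio of a lossless technique is a property of the image rather than an adjustable setting.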

These compression/encoding techniques are employed in file formats such as BMP,

TIFF, GIF, PCX, TGA, etc. These file formats can also store images as uncompressed.

The compression ratio of lossless techniques is based on the characteristics of the image

and is not adjustable³¹. Appendix A provides a feature comparison and statistics of

compression ratio for various image formats. Note that the compression ratios of these

lossless compressions can seldom exceed 2.6:1. (See Table A-3 in Appendix A)

2.2.4 Lossy Compression/Encoding

In contrast, lossy compression is an irreversible process in which non-essential details of the image are truncated. Since the process includes quantization, the reconstructed image deviates from the original image. In terms of removal of redundancy, lossy compression/encoding techniques mainly target the reduction of spatial, coding, and psychovisual redundancies. As compensation for the loss in image quality, lossy compression gives a much higher compression ratio. Lossy compression for still images is usually capable of achieving a compression ratio of 25:1 or more with still acceptable quality. Typical examples of lossy compression32 are:

Joint Photographic Experts Group, JPEG33

Lossy predictive coding

30 For more information on the listed compression and encoding techniques, please refer to Digital Image Processing by Gonzalez and Woods [GW92].

31 In comparison, the compression ratio of lossy JPEG compression can be adjusted through the quantization factor input by the user, which indicates the quality level of the reconstructed image.

2.2.5 Lossy DCT-based Compression

Since this thesis focuses only on image quality assessment for lossy DCT-based compressed data, a description of the DCT-based compression algorithms for still images and video is provided below.

2.2.5.1 JPEG Compression/Encoding (for Still Image)

Developed through the collaboration of the International Telegraph and Telephone Consultative Committee (CCITT) and the International Organization for Standardization (ISO), JPEG is a very popular continuous tone (monochrome and color), still-frame compression standard [GW92]. Its popularity is mainly due to its ability to maintain a significant compression rate at an acceptable image quality, in comparison with many lossless compression techniques. The JPEG algorithm focuses on the removal of non-essential details, which are psychovisually not perceivable to the human vision system.

Figure 2-11 shows the block diagram of the JPEG compression procedures. The JPEG compression algorithm involves a compressor and an encoder. The compressor consists of three sequential steps: $-2^{n-1}$ level shifting (where $2^n$ is the number of gray levels), discrete cosine transformation and quantization. After compression, the data are encoded by variable length coding. Please refer to Gonzalez and Woods' book [GW92] for complete explanations of each of these steps.

32 For more information on the listed compression and encoding technique, please refer to Digital Image Processing by Gonzalez and Woods [GW92].

33 JPEG compression also has a new lossless version known as "JPEG-LS".

Figure 2-11 Block Diagram of JPEG Compression Procedures

[Block diagram: Image, Compressor (level shift, DCT, quantization), Encoder (variable length coding), Compressed image]

During compression, the image is first divided into 8x8 subimage blocks. Then, each of these subimage blocks undergoes the compression algorithm, being level shifted, transformed and quantized individually. The quantization process causes details to be truncated according to the pre-set quantization factor. Consequently, a series of 8x8 patterned artifacts is created, which is defined as 'blockiness' in this document. This blockiness is recognized as deteriorated detail. For a higher compression ratio, the image will have a more noticeable blockiness effect. As a result, the image quality is subjectively rated down.
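The three compressor steps described above can be illustrated for a single 8x8 block with the following minimal sketch. It is not the JPEG reference implementation: it assumes 8-bit pixels, an orthonormal 2-D DCT, and a single uniform quantization step (`step`, a hypothetical stand-in for the quantization table scaled by the quantization factor). All function names are illustrative only.

```python
import math

N = 8  # subimage block size


def alpha(u):
    """Normalization factor of the (assumed) orthonormal DCT."""
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)


# BASIS[u][x] = alpha(u) * cos((2x + 1) * u * pi / (2N))
BASIS = [[alpha(u) * math.cos((2 * x + 1) * u * math.pi / (2 * N))
          for x in range(N)] for u in range(N)]


def dct2(block):
    """Forward 2-D DCT of an N x N block."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * block[x][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 2-D DCT of an N x N coefficient block."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * coeffs[u][v]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def compress_block(block, step, bits=8):
    """Level shift by 2^(bits-1), transform, then quantize with a uniform step."""
    shift = 2 ** (bits - 1)
    shifted = [[p - shift for p in row] for row in block]
    return [[round(c / step) for c in row] for row in dct2(shifted)]


def reconstruct_block(quantized, step, bits=8):
    """Dequantize, inverse transform, undo the level shift and clip to the pixel range."""
    shift = 2 ** (bits - 1)
    pixels = idct2([[q * step for q in row] for row in quantized])
    top = 2 ** bits - 1
    return [[min(top, max(0, round(p + shift))) for p in row] for row in pixels]


flat = [[100] * N for _ in range(N)]
print(reconstruct_block(compress_block(flat, 16), 16) == flat)
```

With a very coarse step every coefficient quantizes to zero, so the whole block collapses to a single gray level. When neighboring blocks collapse to different levels, the block boundaries become visible, which is the blockiness described above.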

2.2.5.2 MPEG Compression/Encoding (for Video)

MPEG is a series of digital audio and visual data compression standards established by the Moving Picture Experts Group, which was formed under the auspices of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It is an international compression standard that is widely employed by the digital video broadcasting, telecommunication and digital storage media industries, among others. Table 2-2 shows a series of MPEG standards. Each version of the MPEG standards is designed for applications with a specific data transmission rate.

34 MPEG-1 is intended for video coding at bitrates of about 1.2 Mbps, plus stereo audio coding at bitrates of about 250 kbps.

35 A typical example of HDTV is 1920 x 1080 at 30 Hz with an uncompressed bit rate of approx. 1.5 Gbps.

36 Although MPEG-2 was finalized in 1994, the standard is still constantly modified.

37 The initial design for MPEG-3 was for high bit rate and high-resolution applications. But as the MPEG-2 development moved along, it was realized that this could be achieved by a minor extension of MPEG-2. Therefore, MPEG-3 has been merged into MPEG-2.

Table 2-2 MPEG Version List

MPEG-1 (ISO/IEC 11172, in 5 parts)
Status: Finalized in Nov. 1992
Optimal transmission rate: Intermediate data rate (1.5 Mbps)34
Applications: For storage and retrieval of moving pictures and audio on storage media (progressive or non-interlaced frames only) - CD-ROM and high quality compact disk; e.g., 352 x 240 at 30 fps or 352 x 288 at 25 fps w/ VHS quality

MPEG-2 (ISO/IEC 13818, in parts)
Status: Finalized in Nov. 1994 (see footnote 36)
Optimal transmission rate: High bit rate and high resolution definition (10 Mbps or more)
Applications: For digital television (w/ field-interlaced frames) - e.g. telecommunications, digital TV broadcasting, interactive television and 3D stereoscopic television, HDTV35; e.g., 720 x 485 studio quality CCIR-601 images at up to 15 Mbit/sec

MPEG-3
Status: Merged into MPEG-2 (see footnote 37)
Optimal transmission rate: -
Applications: HDTV

MPEG-4 (ISO/IEC 14496, in 6 parts)
Status: Version 1 finalized in Oct. 1998; Version 2 targeted Dec. 1999
Optimal transmission rate: Low bit rate, but content-based interactive (64 Kbps or less)
Applications: For interactive multimedia applications - telecommunications, error-prone wireless networks. Version 1: interactive video on CD-ROM and Digital Television. Version 2: extension of version 1, plus full backward compatibility

MPEG-7
Status: Targeted Fall 2001
Optimal transmission rate: -
Applications: For multimedia content description interface - e.g., digital libraries (image catalog, musical dictionary, ...), multimedia directory services (e.g. yellow pages), broadcast media selection (radio channel, TV channel, ...) and multimedia editing (personalized news service, ...)

The MPEG algorithm can be divided into 3 main procedures: intra-frame compression, inter-frame compression, and encoding, as shown in Figure 2-12. The MPEG compression algorithm is relatively more complicated than the JPEG compression algorithm. In MPEG compression, the video stream is divided into a series of groups of pictures (GOP) as shown in Figure 2-13. Each group of pictures consists of three types of frames: intra-frame (I-frame), forward predicted frame (P-frame), and bi-directional predicted frame (B-frame).

Each frame is further divided into 16 x 16 macroblocks and 8 x 8 subimage blocks as shown in Figure 2-13. If the video is monochrome, each macroblock consists of four 8 x 8 luminance subimage blocks. Otherwise, if the video is in color, each macroblock consists of four 8 x 8 luminance subimage blocks, one 8 x 8 Cr chrominance subimage block and one 8 x 8 Cb chrominance subimage block, as shown in the figure.
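To make the frame partitioning concrete, the sketch below counts the macroblocks and 8 x 8 subimage blocks in one frame, using the block counts per macroblock stated above. The function name is hypothetical, and rounding the frame size up to a whole number of macroblocks is an assumption for sizes that are not multiples of 16.

```python
def block_counts(width, height, color=True):
    """Return (macroblocks, 8x8 subimage blocks) for one frame."""
    mb_cols = -(-width // 16)    # ceiling division (assumed padding rule)
    mb_rows = -(-height // 16)
    macroblocks = mb_cols * mb_rows
    # Four luminance blocks, plus one Cr and one Cb block for color video.
    blocks_per_macroblock = 6 if color else 4
    return macroblocks, macroblocks * blocks_per_macroblock


# Frame sizes taken from the MPEG-1 row of Table 2-2.
print(block_counts(352, 240))
print(block_counts(352, 288, color=False))
```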

Figure 2-12 Block Diagram of MPEG Compression Procedures

[Block diagram: Intra-Frame Compressor (I-frame, intra quantization), Inter-Frame Compressor (P-frame, quantization), Encoder (variable length coding, frame packing), Compressed data]

Figure 2-13 Group of Pictures, Slice, Macroblocks & Subimage Blocks

[Diagram: video as a sequence of pictures, divided into groups of pictures (GOP); 16 x 16 macroblocks; each macroblock made up of four 8 x 8 luminance blocks, one 8 x 8 chrominance Cr block and one 8 x 8 chrominance Cb block]

Similar to JPEG, MPEG compression focuses on the removal of spatial, coding and psychovisual redundancies. In addition, it also targets the removal of temporal redundancy.


In intra-frame compression, the I-frame is processed as a still image. Each subimage block is individually processed in a way very similar to the JPEG algorithm. Inter-frame compression focuses on the removal of temporal redundancy between frames. In a sequence of video frames, there is very little change in details between frames. The change in details is often due to a shift in the position of details in the image. Therefore, MPEG employs a process called motion estimation for its inter-frame compression. This process determines the motion displacement vector for each 16 x 16 macroblock by matching the macroblock of the current frame with that of its reference frames. These motion vectors are then transformed, quantized and encoded. Since the MPEG algorithm is also processed in blocks, the artifacts of blockiness still persist.
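The macroblock matching described above can be sketched as an exhaustive search over a small window. This is only an illustration: the matching cost (sum of absolute differences), the search range and the function names are assumptions, since the text does not specify how MPEG encoders perform the match.

```python
import random


def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))


def motion_vector(cur, ref, top, left, size=16, search=4):
    """Return ((dy, dx), cost) of the best match for the macroblock at (top, left)."""
    rows, cols = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that fall outside the reference frame.
            if top + dy < 0 or left + dx < 0:
                continue
            if top + dy + size > rows or left + dx + size > cols:
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost


# Hypothetical test frames: the current frame is the reference frame with its
# content shifted by 2 rows and -3 columns (wrapping at the borders).
rng = random.Random(0)
ref_frame = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
cur_frame = [[ref_frame[(r + 2) % 32][(c - 3) % 32] for c in range(32)]
             for r in range(32)]
print(motion_vector(cur_frame, ref_frame, 8, 8))
```

Only the displacement needs to be stored for a well-matched macroblock, which is how the temporal redundancy between frames is removed.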

Chapter 3 Fidelity Assessment and Criteria

In many dictionaries, the word fidelity has the meaning of faithfulness, loyalty, accuracy and exactness. In image quality measurement, the fidelity of a reconstructed image is defined as how similar the reconstructed image is to the original image. Consequently, a fidelity criterion is a standard or tool used to measure this similarity, and fidelity assessment is the process of evaluating using the criterion. The definition of a fidelity criterion is extremely important because it determines the success of any experiment and algorithm.

This chapter will cover two general classes of fidelity criteria: objective and subjective.

Also, some subjective and objective fidelity criteria commonly used for image quality

assessment will be provided.

3.1 Subjective Fidelity Criteria

A subjective fidelity criterion is the standard for assessing the "goodness" of the test

object based on the subjective judgement of a human observer. A typical example of

subjective assessment can be demonstrated by the orange sorting process in the grocery

store as follows.

The worker opens up boxes of oranges of various sizes and freshness, and has to sort these oranges into two grades. The "good" quality oranges will be rated as Grade #1 and sold for $1.49/lb, whereas the "not-as-good" quality oranges will be rated as Grade #2 and sold for $0.99/lb. The criteria of the sorting process are the properties of the test object (in this case, the orange), such as size and freshness. At the end of the sorting process, each orange will be rated as Grade #1 or Grade #2.

Even when the assessment process is carried out under constant external conditions (such as lighting, disturbance, etc.), the assessment result is still often highly variable, depending on the background of the evaluator, such as age, sex, vision, education, experience, career, etc. For example, the storeowner will have a much higher tolerance for imperfections in the orange than a store worker. The assessment result is not just inconsistent from person to person. Sometimes the results generated by the same person can be inconsistent from time to time. The rating of a test object is often affected by the qualities of the preceding test objects. This phenomenon is known as adaptation [AC72]. For example, assume orange A and orange B have identical freshness and weight. If orange A is assessed after a sequence of extremely high quality oranges and orange B is assessed after a sequence of close to rotten oranges, it is very likely that A will be rated as Grade #2 and B will be rated as Grade #1.

3.1.1 Subjective Evaluation

Subjective evaluation can be either on a "Go-or-No Go" basis or on a scaled basis. A "Go-or-No Go" evaluation is just like the evaluation in the orange sorting process. The test object is either rated as "good" quality (Grade #1) or "not-as-good" quality (Grade #2). In contrast, for scaled evaluation, the rating can be made either on an absolute scale or by means of side-by-side comparisons, as shown in Table 3-1.

Table 3-1 Scaled Evaluations

a) Absolute Scale w60)

Value  Rating     Description
1      Excellent  An image of extremely high quality, as good as you could desire.
2      Fine       An image of high quality, providing enjoyable viewing. Interference38 is not objectionable.
3      Passable   An image of acceptable quality. Interference is not objectionable.
4      Marginal   An image of poor quality; you wish you could improve it. Interference is somewhat objectionable.
5      Inferior   A very poor image, but you could watch it. Objectionable interference is definitely present.
6      Unusable   An image so bad that you could not watch it.39

b) Side-By-Side Comparisons [GW92]

{much worse, worse, slightly worse, the same, slightly better, better, much better}

38 The word 'interference' refers to noise, distortion and artifacts.

39 Image quality of this rating is totally unacceptable. Viewers will start to feel annoyed when watching images of this quality.

3.1.2 Subjective Assessment Methodology

As suggested by K. T. Tan et al. [TGP98], subjective assessment can be divided into three classes: single stimulus method, comparison method, and double stimulus method, as depicted in Figure 3-1.

Figure 3-1 Subjective Assessment Methodology

[Diagram with three panels: a) Single Stimulus Method; b) Comparison Method; c) Double Stimulus Method. Each panel shows the order in which objects are presented and rated.]

* Note: A, B, C, D, ... and Ref. refer to the presentation of test objects A, B, C, D, ... and the reference object.

3.1.2.1 Single Stimulus Method

In the single stimulus method, the subject is presented with one test object at a time. At the end of each presentation, the subject is asked to give a rating for the test object. Then, the same procedure is repeated until all test objects have been presented. In the single stimulus method, the subject does not refer back to previous assessment results for reference.

This method is typically used in experiments in which it is difficult to assess more than

one stimulus at a time (e.g. audio and video assessments) or when the assessment time

permitted is limited. The phenomenon of adaptation will tend to have a significant effect

on the test results. A typical example of the single stimulus method is the orange sorting

method given in the previous section.

3.1.2.2 Comparison Method

In the comparison method, all test objects are presented to the subject at the same time.

During the presentation, the subject will have the opportunity to compare and sort the

qualities of all test objects. The subject will rate the objects after they have been sorted.

In this method, the effect of adaptation is least significant.

3.1.2.3 Double Stimulus Method

Similar to the single stimulus method, the test objects of the double stimulus method are

presented in a sequence. But in each presentation, a constant reference object is also

present at the same time. The subject is not informed about which is the reference object and is required to give a rating for both the reference and test objects.

This method is very popular in video assessment. It is relatively more time consuming, but the result yielded is less affected by adaptation and more reliable than that of the single stimulus method.

3.2 Objective Fidelity Criteria

An objective fidelity criterion is the standard for assessing quality based on quantitative measurements obtained from the test object. In most cases, if the assessment process involves more than one criterion, it is represented in the form of a mathematical model.

Using the same orange sorting process as an example, in this case, the decision-making

process is no longer carried out by the worker. The responsibility of the worker is limited

to the acquisition of the relevant criteria data such as weight (for size) and packaged date

(for freshness). He will enter these data into the mathematical model as shown in Figure

3-2. If the value generated by the model is above a certain threshold, the orange will be rated as Grade #1. Otherwise, the orange will be rated as Grade #2.
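The objective version of the orange sorting process can be sketched as a small scoring function followed by a threshold test. Everything numeric here is hypothetical: the weights, the normalizing constants and the threshold are placeholders chosen for illustration, since the text does not specify the model.

```python
def grade_orange(weight_g, days_since_packaging, threshold=0.5):
    """Return 1 (Grade #1) or 2 (Grade #2) from two measured criteria."""
    # Hypothetical criterion scores, each scaled to the range 0..1.
    size_score = min(weight_g / 300.0, 1.0)
    freshness_score = max(0.0, 1.0 - days_since_packaging / 14.0)
    # Hypothetical model: equally weighted sum of the two criteria.
    score = 0.5 * size_score + 0.5 * freshness_score
    return 1 if score > threshold else 2


print(grade_orange(300, 0))
print(grade_orange(100, 14))
```

Because the rating depends only on the measured data, the same orange always receives the same grade, regardless of who operates the scale or which oranges were graded before it.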

Figure 3-2 Example of Objective Assessment

[Block diagram: weight and packaged date are entered into the mathematical model; threshold testing of the model output yields Grade #1 or Grade #2]

In image quality assessment, the objective fidelity criteria commonly used are shown in Table 3-2. These criteria are typically used to measure numerically the difference between the original and reconstructed images. However, they do not necessarily represent the difference that the HVS perceives. A more complete demonstration is provided in the next section.


Table 3-2 Common Objective Fidelity Criteria

Error:

$$e(x, y) = \hat{f}(x, y) - f(x, y) \qquad (3\text{-}1)$$

Total Error:

$$e_{total} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right] \qquad (3\text{-}2)$$

Root-Mean-Square Error (root MSE):

$$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^2 \right]^{1/2} \qquad (3\text{-}3)$$

Root-Mean-Square signal to noise ratio (root SNR):

$$SNR_{rms} = \left[ \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x, y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^2} \right]^{1/2} \qquad (3\text{-}4)$$

where $f(x, y)$ is the pixel value of the original image and $\hat{f}(x, y)$ is the pixel value of the reconstructed image, for an image of M x N pixels.

3.3 Using An Error Based Criterion as an Image Quality Indicator

Previous research has shown that the pattern sensitivity of humans plays a very important role in human visual perception [BK98][Wan95]. Details with a specific pattern or structure will tend to be more perceivable than details that are distributed randomly.

This section uses two examples to demonstrate the effect of patterned artifacts and to show the ineffectiveness of using MSE as a quality indicator for DCT-based compressed images. The first example shows how noise becomes more noticeable as it becomes structured. A very similar example was also given in van den Branden Lambrecht and Kunt's research [BK98]. The second example provides evidence that MSE is ineffective when it is used to indicate the quality of a DCT-based compressed image.

3.3.1 Effect of Patterned/Structured Artifact

Figure 3-3 a) shows an image polluted with random 'speckle' noise, while b) and c) show the same image with horizontally structured 'speckle' noise of frequency 7.5 and 5.0 cpd40 (cycles per visual degree), respectively.

The horizontally structured artifact in b) and c) makes the noise more noticeable and annoying than that in a), even though all three images have the same root MSE of approximately 27 grayscale levels. As the spatial frequency of the horizontal artifact decreases in c), the annoyance and perceivability of the noise increase. This increasing sensitivity to patterns corresponds with the contrast sensitivity behavior of the HVS as mentioned in Chapter 2. As shown in Figure 2-8, the contrast sensitivity of the HVS decreases as the spatial frequency increases.
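The point that root MSE cannot distinguish random noise from structured noise can be reproduced numerically with a short sketch. It is a simplified stand-in for the figure, not the thesis's experiment: it assumes a flat gray test image and additive noise of fixed amplitude (plus or minus 27 gray levels) rather than true speckle noise, and the names are illustrative.

```python
import math
import random


def rmse(original, reconstructed):
    """Root-mean-square error between two equally sized images (lists of rows)."""
    rows, cols = len(original), len(original[0])
    total = sum((reconstructed[x][y] - original[x][y]) ** 2
                for x in range(rows) for y in range(cols))
    return math.sqrt(total / (rows * cols))


SIZE, AMPLITUDE, BAND = 64, 27, 4
original = [[128] * SIZE for _ in range(SIZE)]

# Random noise: the sign of the error is chosen independently for every pixel.
rng = random.Random(0)
random_image = [[128 + rng.choice((-AMPLITUDE, AMPLITUDE)) for _ in range(SIZE)]
                for _ in range(SIZE)]

# Structured noise: the same error magnitude arranged in horizontal bands.
striped_image = [[128 + (AMPLITUDE if (x // BAND) % 2 == 0 else -AMPLITUDE)
                  for _ in range(SIZE)] for x in range(SIZE)]

print(rmse(original, random_image), rmse(original, striped_image))
```

Both images differ from the original by the same magnitude at every pixel, so the root MSE is identical, yet the banded version forms a regular pattern that a viewer would notice far more readily.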

Figure 3-3 Structured Artifact

a) Random noise b) Structured noise at 7.5 cpd c) Structured noise at 5.0 cpd

40 The visual angle calculation is based on a distance of 18 inches between viewer and image. At an 18-inch viewing distance, 1 inch of image dimension corresponds to 3.18 degrees of visual angle. For a more detailed explanation of the calculation of visual angle, please refer to Section 2.1.

3.3.2 Using MSE as a Quality Indicator for Lossy DCT-based Compressed Image

MSE is widely used in evaluating image quality mainly because of its simplicity. It is a good quality indicator for images with random noise, but (as shown by the previous example) it is not a good indicator for images with structured artifacts. In lossy DCT-based compression, a structured artifact, namely blockiness, is created as a 'by-product' of the compression. This blockiness effect makes the quality deterioration of the image more noticeable and annoying to the HVS.

Also, MSE is not a good indicator of the deterioration in image quality caused by the quantization of details in lossy compression. Figure 3-4 a) and b) show images polluted with random 'Salt & Pepper' and 'Speckle' noise, respectively, while c) and d) show images compressed by the JPEG algorithm. They all have the same root MSE of approximately 15 grayscale levels. For Figure 3-4 a) and b), it is still acceptable to say that they have a similar quality level. However, comparing the qualities of a), b) and c), viewers will find that the image in c) is so highly distorted that its quality is much lower than that of a) and/or b).

Moreover, the image of a rose in d) shows a compressed result in which the compression causes a significant deterioration of quality. The detail in the image is totally unrecognizable. (The original uncompressed version of the 'Rose' is available in Appendix B.) It is unreasonable to say that c) and d) have the same quality, even though they have the same root MSE.

Therefore, the MSE is not a good criterion for image quality evaluations. In the next chapter, a mathematical model using more sophisticated criteria is proposed. It targets compressed images containing structured artifacts and takes into account the characteristics of the human assessment process.

Figure 3-4 Structured Artifact Caused by Lossy DCT-based Compression

a) Random "Salt & Pepper" noise b) Random "Speckle" noise

c) JPEG compressed image d) JPEG compressed image (rose)

Chapter 4 POIQE Model Design

As mentioned in Chapter 1, the proposed quality evaluator, namely the Psychovisually-Based Objective Image Quality Evaluator (POIQE), is a mathematical model that imitates the subjective human evaluation of image quality. This evaluator objectively yields an evaluation index as an output of the model. In simulating the subjective human evaluation, some of the major concerns are:

How does a human observer judge the image quality? What are the criteria

that the human observer used during the judging process?

Is the observer particularly more sensitive to any specific distortion in the

image?

When a human observer looks at an image transmitted through the Internet or a teleconference session, the first thing he will do is to identify or match the details shown in the image with the objects that he has seen before in his everyday life. The quality assessment of the transmitted image will be based on how recognizable these details are. In general, the details that the observer tries to match can be divided into two categories: deteriorated detail and genuine detail.

The deteriorated details are usually noises and artifacts that are embedded in the data as by-products of the transmission or compression process. Typical examples are noise and blockiness. The deteriorated details only occur in the reconstructed image, not in the original image. It is very common for a human observer's attention to be drawn towards these deteriorated details in the image41, especially towards artifacts with a specific pattern or structure. The judgement process is based on the occurrences and noticeable levels of the deteriorated details that are perceived. The higher the occurrence and noticeable level of the deteriorated detail, the lower the image quality.

41 It is well known that the human visual system is very sensitive to distortions and artifacts.

The genuine details are the original details that the original image is transmitting. The more details are recognized, the higher the image quality will be rated. For example, Ivy sends her sister Vicky a picture of herself, taken during a vacation, by electronic mail. She scanned in the image and stored it in JPEG format. When Vicky receives the picture, she does not have the original image to compare with. Instinctively, the first thing that Vicky does is to identify what is in the picture. To do that, she will search through her huge database of memory, and try to look for a match for each detail in the picture. Of course, the first detail that she recognizes is her sister Ivy. Then, finding a match for a sandy beach scene in her memory, she realizes that the picture was taken at a beautiful beach. She also realizes that Ivy is trying to show her a magazine in her hands. Then, she will try to recognize the characters and the cover girl on the magazine to find out what Ivy is trying to show. During the entire process, if it takes Vicky a long time to recognize the details, she will rate the quality of the picture down.42

Based on these characteristics of the assessment process, the POIQE is designed to

robustly weigh the deteriorated and genuine details, based on their occurrence and

perceivable level to the human vision system. In comparison to human subjective

evaluation, the advantages of this evaluator are its consistency, cost-efficiency and high-

speed processing capability. Most importantly, this evaluator can be integrated into the

digital data compression algorithm. In this chapter, a mathematical model is proposed to

resemble the human quality evaluation based on the measurement of the deteriorated and

genuine details.

42 Of course, one might say that the quality assessment of an image is also based on its properties, such as contrast and color representation. But all these properties also define the clarity of the image, which determines the ease of recognizing the details of the image.

4.1 Mathematical Model

Based on the identification of deteriorated and genuine details in the reconstructed image, the mathematical model used for POIQE consists of three parts: the Blockiness Evaluator, the Similarity Identifier and the Merger, as shown in Figure 4-1. The objective fidelity criteria used in this model are blockiness and similarity. Using the original and reconstructed images as inputs, the Blockiness Evaluator and Similarity Identifier generate the Blockiness Index "B" and Similarity Index "S", respectively, for measuring those fidelity criteria. These index numbers are combined by the Merger to generate a POIQE Index "P".

Figure 4-1 Block Diagram of POIQE

[Block diagram of the mathematical model: the original image and the reconstructed image are the inputs; the output is the POIQE Index "P"]

The POIQE Index "P" is a number that resembles the human observer's subjective quality assessment results obtained through experiments43. The value of the index ranges from 0 to 100, where 0 indicates extremely poor image quality and 100 indicates perfect image quality (i.e., exactly the same as the original).

43 A more detailed description of the experiment procedures and conditions is given in Chapter 5.

4.1.1 Blockiness Evaluator

Designed for evaluation of DCT-based compressed images, the Blockiness Evaluator focuses on the measurement of the blockiness artifact44. Figure 4-2 shows an example of the blockiness artifact compared with the original image (on the left). The reconstructed image (on the right) has a compression ratio of 51.95:1. As mentioned in Chapter 1, blockiness is defined as the patterned square artifact created as a by-product during lossy DCT-based compression. This artifact is clearly shown in the compressed image on the right.

Figure 4-2 Blockiness Artifact

4.1.1.1 Definition of Blockiness

In the JPEG and MPEG algorithms, the discrete cosine transformation is carried out individually for each 8 x 8 subimage block using the forward and inverse 2-D DCT formulas provided in equations (4-1) and (4-2).

44 A deteriorated detail generated as a side-product of the DCT-based compression.

Assume the entire image has a size of X x Y pixels.

Let f(x, y) = image intensity represented in the space domain, for x = 0, 1, 2, ..., X and y = 0, 1, 2, ..., Y

C(u, v) = image intensity represented in the spatial-frequency domain, for u = 0, 1, 2, ..., X and v = 0, 1, 2, ..., Y

Forward 2-D DCT for each subimage block:

Inverse 2-D DCT for each subimage block:

N = subimage size (for example, if the subimage size is 8 x 8 pixels, N is equal to 8)

In the spatial-frequency domain, if both u and v are equal to zero (the top left corner of the subimage block), equation (4-1) becomes:

where f_average = average of all elements within the 8 x 8 subimage block
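For reference, a commonly used form of the forward and inverse 2-D DCT, and the resulting expression for the u = v = 0 coefficient, is sketched below. The orthonormal normalization factor $\alpha$ is an assumption (it is the form given in standard texts such as [GW92]); a different scaling convention would change only the constant in front of the block average.

```latex
% Forward 2-D DCT of an N x N subimage block (assumed orthonormal form)
C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)
         \cos\!\left[\frac{(2x+1)u\pi}{2N}\right]
         \cos\!\left[\frac{(2y+1)v\pi}{2N}\right]

% Inverse 2-D DCT
f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, C(u,v)
         \cos\!\left[\frac{(2x+1)u\pi}{2N}\right]
         \cos\!\left[\frac{(2y+1)v\pi}{2N}\right]

% Normalization factor
\alpha(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u = 1, 2, \ldots, N-1 \end{cases}

% Setting u = v = 0 makes both cosine terms equal to 1
C(0,0) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) = N \, f_{average}
```

Under this normalization and with N = 8, C(0,0) is simply 8 times the average intensity of the subimage block, so it carries no information other than the block average.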

C(0,0) is also known as the DC coefficient, whereas the rest of the coefficients are known as AC coefficients45. As illustrated in Chapter 2, the perceivable level of details in a pattern is based on its contrast relative to its neighbors. Since the DC coefficient reflects the average intensity of all the pixels in the subimage block, the blockiness effect can be extracted in terms of the difference in luminance of the current subimage relative to that of its neighbors.

45 In the electrical engineering literature, AC and DC are the acronyms for alternating current and direct current, respectively. The term DC also has the meanings of constant and average in the description of a signal. In the discrete cosine transformation, C(0,0) is known as the DC coefficient due to its averaging characteristics.

4.1.1.2 Methodology in Computing Blockiness

The computation of the Blockiness Index "B" can be explained graphically, as shown in Figure 4-3 and Figure 4-4, by letting:

$f_{org}(x, y)$ & $f_{rec}(x, y)$ = original and reconstructed images represented in the space domain, respectively; for x = 0, 1, 2, ..., X and y = 0, 1, 2, ..., Y

$C_{org}(u, v)$ & $C_{rec}(u, v)$ = original and reconstructed images represented in the spatial-frequency domain, respectively; for u = 0, 1, 2, ..., X and v = 0, 1, 2, ..., Y

$DC_{org}(m, n)$ & $DC_{rec}(m, n)$ = DC coefficients of the original and reconstructed images, respectively; for m = 0, 1, 2, ..., X/8 and n = 0, 1, 2, ..., Y/8

where X = number of rows (height)

Y = number of columns (width)

After the level shifting and discrete cosine transformation process, both the original $C_{org}(u, v)$ and reconstructed $C_{rec}(u, v)$ images are represented in the spatial-frequency domain with a size of X x Y pixels.

The Blockiness Index B(m, n) is defined as the average absolute difference between the current DC coefficient and the DC coefficients of its eight neighbors in the reconstructed image, relative to that of the original image. As shown in Figure 4-3 a), the image is divided into (X x Y)/64 units of 8 x 8 subimage blocks. In each of these subimage blocks, the top left coefficient is the DC coefficient (as shown by a shaded pixel in Figure 4-3 a). The DC coefficient of each subimage block is then extracted and represented as DC(m, n), as shown in Figure 4-3 b). Then, each of the eight neighbors is subtracted from DC(m, n).

Figure 4-3 Computation of Blockiness Index

a) Image C(u, v) represented in spatial-frequency domain. b) Average absolute difference of current DC coefficient DC(m, n) with its eight neighbors.

The absolute differences between DC(m, n) and its eight neighbors are averaged to obtain E(m, n). Finally, the Blockiness Index is computed as the absolute difference between the original $E_{org}(m, n)$ and the reconstructed $E_{rec}(m, n)$, as shown in equation (4-4).46

$$B(m, n) = \left| E_{org}(m, n) - E_{rec}(m, n) \right| \qquad (4\text{-}4)$$

$$\text{where} \quad E_k(m, n) = \frac{1}{8} \sum_{i=-1}^{1} \sum_{j=-1}^{1} \left| DC_k(m, n) - DC_k(m+i, n+j) \right|, \quad k \in \{org, rec\}$$

Figure 4-4 shows a block diagram for the computation of the Blockiness Index for each subimage block. The Blockiness Indexes B(m, n) of all subimage blocks are then added together and divided by the total number of subimage blocks to obtain the Blockiness Index "B" for the whole image.

46 Please note that the Blockiness Index is designed to measure the gradient difference between the current subimage block and its 8 neighbors in terms of 8-bit grayscale. Therefore, Equation (4-4) is not normalized by $E_{org}(m, n)$.
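The computation described in this section can be sketched as follows. This is a minimal illustration rather than the thesis's implementation, and it rests on three stated assumptions: the DC coefficient is taken as N times the block mean (the orthonormal DCT convention), the level shift is omitted because it cancels in every DC difference, and neighbors that fall outside the image are skipped, since border handling is not described in the text. Function names are illustrative.

```python
def dc_coefficients(image, n=8):
    """DC coefficient of every n x n block, assumed equal to n * (block mean)."""
    rows, cols = len(image) // n, len(image[0]) // n
    return [[sum(image[m * n + x][k * n + y]
                 for x in range(n) for y in range(n)) / n
             for k in range(cols)] for m in range(rows)]


def neighbor_gradient(dc, m, k):
    """E(m, n): summed absolute DC difference with the eight neighbors, divided by 8."""
    rows, cols = len(dc), len(dc[0])
    total = 0.0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            mi, kj = m + i, k + j
            # Assumption: neighbors outside the image contribute nothing.
            if 0 <= mi < rows and 0 <= kj < cols:
                total += abs(dc[m][k] - dc[mi][kj])
    return total / 8.0


def blockiness_index(original, reconstructed, n=8):
    """Whole-image Blockiness Index: mean of |E_org - E_rec| over all blocks."""
    dc_org = dc_coefficients(original, n)
    dc_rec = dc_coefficients(reconstructed, n)
    rows, cols = len(dc_org), len(dc_org[0])
    total = sum(abs(neighbor_gradient(dc_org, m, k) - neighbor_gradient(dc_rec, m, k))
                for m in range(rows) for k in range(cols))
    return total / (rows * cols)


# Hypothetical 24 x 24 example: a flat original, and a reconstruction in which
# the central 8 x 8 block is 10 gray levels brighter than its surroundings.
flat = [[100] * 24 for _ in range(24)]
blocky = [[110 if 8 <= x < 16 and 8 <= y < 16 else 100 for y in range(24)]
          for x in range(24)]
print(blockiness_index(flat, flat), blockiness_index(flat, blocky))
```

The index is zero when the reconstructed image has the same block-to-block luminance steps as the original, and it grows as the compression introduces steps between neighboring blocks that were not present before.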

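Figure 4-4 Block Diagram of Blockiness Evaluator

[Block diagram of the Blockiness Evaluator: the original and reconstructed images are each converted to DCT coefficients; the DC coefficients and their neighbors are compared to give the Blockiness Index B(m, n)]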

4.1.2 Similarity Identifier

Many evaluators try to define image quality based on the difference between the original image and the reconstructed image. Commonly used criteria are the error-based objective fidelity criteria shown in Table 3-2. However, during the assessment process, the observer is actually trying to identify details instead of comparing differences. This is even more obvious when the observer only has access to the compressed image.

4.1.2.1 Definition of Similarity

Targeting the "details recognizing behavior" of the observer, the Similarity Identifier

focuses on the measurement and identification of the genuine details remaining in the

reconstructed image numerically. In general, similarity can be defined as a measure of

the degree of resemblance when two objects are under comparison. As mentioned in the

example, the comparison takes place between the reconstructed image and the bits-and-

pieces in the observer's memory. The observer's memory is a huge database accumulated

through experience and time.

4.1.2.2 Methodology in Computing Similarity

Teaching a machine to reproduce this matching procedure numerically is almost impossible because no such database is available. This is one of the reasons why most evaluators use the original image as a datum in place of the observer's memory database. Compared with implementing a database that resembles the observer's memory, using the original image as a datum is a much simpler method. Figure 4-5 provides a block diagram indicating the steps in computing the similarity of the image: edge detection and matching of genuine details.

Figure 4-5 Block Diagram of Similarity Identifier

The first step of the Similarity Identifier is to perform edge detection for both the original

and reconstructed images. The purpose of this step is to outline the details in the images.

The principle behind edge detection is very simple. For each pixel in the image, a

gradient magnitude "G(x, y)" is computed using equation (4-6).

As shown in the equation, the gradient magnitude is a measure of the intensity difference between the current pixel and its diagonal neighbors. A larger gradient magnitude means a larger contrast between the current pixel and its neighbors (i.e., it is more likely that the current pixel contains an edge). After the computation of the gradient magnitude, a threshold "τ" is used to determine whether the gradient magnitude is large enough to classify the current pixel as an edge:

$$E(x, y) = \begin{cases} 1 \ (\text{edge}) & \text{if } G(x, y) \ge \tau \\ 0 \ (\text{non-edge}) & \text{if } G(x, y) < \tau \end{cases}$$

E(x, y) is an edge matrix containing only binary values of 0 and 1, where 1 represents an edge pixel and 0 represents a non-edge pixel.
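The edge detection step can be illustrated as follows. The exact form of equation (4-6) is not reproduced here, so this sketch assumes a Roberts-cross style operator (sum of the absolute differences along the two diagonals), which is one common operator matching the description of diagonal neighbors; the function names and the treatment of the last row and column (gradient set to 0) are likewise assumptions.

```python
def gradient_magnitude(image):
    """Diagonal-difference gradient G(x, y).

    Assumption: Roberts-cross style operator standing in for equation (4-6).
    The last row and column have no diagonal neighbor and are left at 0.
    """
    rows, cols = len(image), len(image[0])
    g = [[0.0] * cols for _ in range(rows)]
    for y in range(rows - 1):
        for x in range(cols - 1):
            d1 = image[y][x] - image[y + 1][x + 1]   # main diagonal difference
            d2 = image[y + 1][x] - image[y][x + 1]   # anti-diagonal difference
            g[y][x] = abs(d1) + abs(d2)
    return g


def edge_matrix(image, tau):
    """Binary edge matrix E(x, y): 1 where G(x, y) >= tau, otherwise 0."""
    return [[1 if value >= tau else 0 for value in row]
            for row in gradient_magnitude(image)]
```

The threshold value is left as a parameter because the value used in the thesis is not stated in this section.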

After edge detection is performed on both images, the next step is to identify the similarity between the two images. Before matching, the first procedure is to filter out the non-genuine details in the reconstructed image using the original edge matrix as a masking filter, as shown in equation (4-7).

If a genuine detail is detected in the original image (i.e., $E_{orig}(x, y) = 1$), then the edge pixel of the reconstructed image will not change (i.e., $E_{recon,masked}(x, y) = E_{recon}(x, y) \times 1$). Otherwise, the edge pixel of the reconstructed image will be masked out (i.e., $E_{recon,masked}(x, y) = E_{recon}(x, y) \times 0 = 0$). The purpose of this masking process is to remove all false edges caused by compression-related artifacts.

Then, the next step is to count the occurrence of edge pixels in the original and reconstructed images, $\sum_{x}^{X} \sum_{y}^{Y} E(x, y)$. The Similarity Index "S" is the ratio of the occurrence of the masked edge pixels in the reconstructed image to that of the original image. Since the edge pixels in the reconstructed image are masked, $\sum_{x}^{X} \sum_{y}^{Y} E_{orig}(x, y)$ is always greater than or equal to $\sum_{x}^{X} \sum_{y}^{Y} E_{recon,masked}(x, y)$. In other words, the Similarity Index always lies between 0 and 1.
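The masking and counting steps can be sketched as below, taking two binary edge matrices of the same size as input. The function name is hypothetical, and returning 1.0 when the original image contains no edge pixels is an assumption, since the text does not cover that case.

```python
def similarity_index(e_orig, e_recon):
    """Ratio of masked reconstructed edge pixels to original edge pixels."""
    # Masking: keep a reconstructed edge pixel only where the original has an edge.
    masked = [[r * o for r, o in zip(row_r, row_o)]
              for row_r, row_o in zip(e_recon, e_orig)]
    n_orig = sum(sum(row) for row in e_orig)
    n_masked = sum(sum(row) for row in masked)
    if n_orig == 0:
        return 1.0  # Assumption: no genuine details to lose, so nothing is missing.
    return n_masked / n_orig
```

Since the mask can only remove edge pixels, the masked count never exceeds the original count, which is why the result stays between 0 and 1.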

4.1.3 Merger

The purpose of the Merger is to combine the Blockiness Index and the Similarity Index to generate a POIQE Index that ranges between 0 and 100, resembling the subjective assessment result. The POIQE Index has a value close to 100 for an image almost the same as the original, and a value close to 0 for an image of extremely poor quality.

As shown in Figure 4-6, the inputs of the Merger include the Blockiness Index and the

Similarity Index. The Merger consists of 3 steps: Blockiness Index Modification,

Similarity Index Modification and Index Merging.

Since it is known that the human subjective evaluation results are often saturated at

extreme limits [GGPT97] [TGP97] [TGP98], the purposes of the Blockiness Index

Modification and the Similarity Index Modification are to shape the two indexes into

curves that gradually increase at a decreasing rate with respect to the quantization factor as

shown in Figure 4-7.

Figure 4-6 Block Diagram of Merger

[Block diagram: Blockiness Index and Similarity Index, Modified Blockiness Index and Modified Similarity Index, Index Merging, POIQE Index]

Figure 4-7 Saturation of the Human Subjective Evaluation Result

[Plot: Modified Blockiness Index (or Modified Similarity Index) versus quantization factor]

4.1.3.1 Blockiness Index Modification

The purpose of this step is to modify and clip the Blockiness Index "B" and to output a Modified Blockiness Index "$B_m$". The Blockiness Index is modified such that, as the quantization increases, the modified value increases gradually at a decreasing rate. To do so, the Blockiness Index is modified as shown in equation (4-9)⁴⁷. An increase in the Modified Blockiness Index indicates that the image has high blockiness and its quality decreases.

Let x = log(B + 0.5)

4.1.3.2 Similarity Index Modification

Similar to the Blockiness Index Modification, the Similarity Index Modification process modifies and clips the Similarity Index "S" and outputs a Modified Similarity Index "$S_m$" as shown in equation (4-10)⁴⁸. An increase in the Modified Similarity Index indicates that fewer genuine details are recognized and the image quality decreases.

Let $x = \log_2 (100 \times (1 - S))$

47 Please refer to Section 4.2 for the development of the equation.

48 Please refer to Section 4.2 for the development of the equation.

4.1.3.3 Merging

Once the Blockiness Index and the Similarity Index have been modified, the Merging process combines the two values and generates the POIQE Index "P" using equation (4-11). The POIQE Index is clipped to a value between 5 and 100.

Let $x = 100 - a \times B_m \times S_m$

where a = Tuning parameter
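The merging step amounts to a product term followed by clipping. A minimal sketch, assuming the modified indexes have already been computed (their defining equations (4-9) and (4-10) are not reproduced here) and using the value of 'a' reported in Chapter 5 as a default; the function name is hypothetical.

```python
def poiqe_index(b_mod, s_mod, a=53.346):
    """Merge the modified indexes into the POIQE Index, clipped to [5, 100].

    b_mod, s_mod: Modified Blockiness and Modified Similarity Indexes.
    a: tuning parameter (default taken from the tuning result in Chapter 5).
    """
    x = 100.0 - a * b_mod * s_mod
    return min(100.0, max(5.0, x))
```

With this form, either modified index being zero yields the maximum rating of 100, and large products saturate at the lower limit of 5.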

4.2 The Characteristic of the Model

Unlike other quality evaluators that focus on the intensity difference between the original

and compressed results, the model investigates the "Similarities" between the original and

compressed results, based on the percentage of details that remain in the compressed

result. This allows the model to compare two images without under-estimating the image

quality based on intensity difference that is not perceivable to the HVS. Also, the model

is tailored to evaluate DCT-based compressed images by evaluating the patterned

artifacts called "Blockiness" produced by DCT-based compression.

In addition, in the development of equations (4-9) and (4-10), the parameters were

implicitly determined by matching the human subjective data of the tuning experiments

using the method of "Trial and Error". In Chapter 5, the model is fine-tuned explicitly

using the tuning parameter 'a' of equation (4-11). The selection of this mathematical

approach in tuning the model was mainly driven by the simplicity and efficiency of the

method. In tuning the model, the author tried various different mathematical approaches,

including a model represented by a polynomial of degree 6. However, none of these

methods seemed able to provide a better model tuning than the "Trial and Error"

method.

Chapter 5 Experimental Tuning and

Validation of the POIQE model

In order to adjust the tuning parameters of the POIQE model proposed in Chapter 4, a

series of experiments on the human subjective assessment on image quality is conducted

in this chapter. The details of the experimental procedures and human evaluation results

(i.e., experimental results) are covered in Section 5.1. Then, the human evaluation result

is analyzed and used to tune the model parameters in Section 5.2.

In order to validate the capability of the model, a set of validation experiments is also

conducted. The validation results are obtained using image sets and subjects that are

completely different from those used to tune the model parameters. A detailed

description of the validation experiment is available in Section 5.3.

5.1 Model Tuning Experiments

5.1.1 Purpose

The purpose of the model tuning experiments on human subjective assessment is to

measure the human image evaluation data, which will be used for tuning the parameters

of the proposed POIQE model.

5.1.2 Method and Procedure

As mentioned, one of the advantages of using an objective evaluator is that it gives

consistent image quality evaluation. Therefore, it is very important that the model

parameters are tuned based on some consistent experimental results. However, it is well

known that human evaluation is very subjective and adaptive. Therefore, the

experimental method selected has to be able to maintain a certain degree of consistency

in the experimental results in order to build a consistent model.

As described in Section 3.1.2, the comparison method allows the subjects to constantly

refer back to previous evaluation results that they made. As a result, the comparison

method provides data that are relatively more consistent and less affected by the effect of

"adaptation". For this reason, the subjective assessment method chosen in the experiment

is the comparison method with absolute scaled evaluation⁴⁹.

5.1.2.1 Procedure

During the experiment, the image sets are presented to the subject for evaluation

consecutively. All evaluated images are available for comparison at all times. At the

final step of the experiment, the subject is requested to provide an evaluation for each

image based on the absolute scaled evaluation table shown in Table 5-1, which is a

modification of Table 3-1 PB60J based on the purpose of this research.

For the entire experiment, each subject has to provide evaluation of six sets of images,

each consisting of 27 images⁵⁰. Details of the test image sets will be described in Section

5.1.3.

49 Detailed descriptions of the comparison method and absolute scaled evaluation are provided in Section 3.1.2.

50 Each set is composed of 26 reconstructed images at different quantization levels and 1 original image.

Table 5-1 Absolute Scaled Evaluation Table

Quality Index Range    Average Value    Subjective Rating    Description

Original

Excellent

Good

Satisfactory

Acceptable

Average

Inferior

Bad

Worse

Worst

Unusable


An image of extremely high quality, as good as you could desire

An image of excellent quality. Details can be recognized instantly.

An image of acceptable quality. Interference is not objectionable.

An image of acceptable quality. Interference is not objectionable.

An image of poor quality; you wish you could improve it. Interference is somewhat objectionable.

Just barely recognizable

Details in the image are not recognizable at all

5.1.2.2 Experiment Instructions

At the beginning of the experiment, the subject is presented with Table 5-1 and

experiment instructions. The experiment instructions are as follows. (Note that the

experiment instructions described below are provided for the subjects in both oral and

written format.)

Step #1 Group the images of Set #1 into columns of similar quality, and

sort the columns in descending quality as shown in Figure 5-1.

Figure 5-1 Sorting and Grouping of Set #1 Images

[Figure: images grouped into columns ordered from high quality to low quality]

Step #2 Repeat the same sorting and grouping process as Step #1 for the images of Set #2.

Step #3 Match the quality of Set #1 with Set #2, column by column, as shown in Figure 5-2. (At this time, some columns of Set #1 might have to be broken up into two and a new column might have to be created with Set #2 as shown in the figure.)

Figure 5-2 Quality Matching of Set #1 and Set #2 Images

[Figure: Set #1 and Set #2 images matched column by column, ordered from high quality to low quality]

Step #4 Assign a quality index range value (first column of Table 5-1) for each column as shown in Figure 5-3.

Figure 5-3 Assigned Group Range

[Figure: columns of Set #1 and Set #2 images labeled with the ranges 100, 99-90, 89-80, 79-70, 69-60, 59-50, 49-40, 39-30, 29-20, 19-10 and 9-0]

Step #5 Repeat the same sorting and grouping process as Step #1 for the images of Set #3.

Step #6 Match the quality of Set #3 with the previous columns.

Step #7 Repeat Step #5 and Step #6 until all six sets are evaluated.

Step #8 After all six sets of images are sorted and evaluated, assign an evaluation number to each image based on the evaluation index range assigned to each column. Repeat this step for all columns except the last three columns (columns 0-9, 10-19 and 20-29). For the last three columns, assigning an evaluation number to each image is not necessary. Simply give all images in the column the average value⁵¹ (as given in the second column of Table 5-1). For example, give all images in column 10-19 an evaluation number of 15.

51 Since the average value is given to the group range '9-0', the lowest evaluation rating is '5' (not '0'). This explains why the lower limit of equation (4-11) is '5'.

5.1.3 Experimental Images

The test objects used in this experiment include six sets of images. All images are

monochrome. They are represented in 8-bit grayscale precision and displayed at a pixel

density of 96 pixels per inch (or 20 pixels per visual degree⁵²) for both horizontal and

vertical directions. These 6 sets of images are labeled as 'Clifford', 'Keys', 'Girl & Apple',

'Lens Cover', 'Rose' and 'Sunglasses'. Refer to Figure B-1 to Figure B-6 to view the

images.
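The stated 20 pixels per visual degree can be checked from the display density and the viewing distance. The sketch below assumes the viewing distance of approximately 12 inches given in the accompanying footnote and a one-degree field centred on the line of sight; the function name is illustrative.

```python
import math


def pixels_per_degree(ppi, viewing_distance_in):
    """Pixels subtended by one degree of visual angle.

    Assumption: the one-degree field is centred on the line of sight, so its
    width on the display is 2 * d * tan(0.5 degrees).
    """
    width_in = 2.0 * viewing_distance_in * math.tan(math.radians(0.5))
    return ppi * width_in


print(round(pixels_per_degree(96, 12), 1))  # prints 20.1
```

The result of about 20.1 agrees with the rounded figure of 20 ppd quoted for the 96 ppi display.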

In each set, there is one original uncompressed image and twenty-six compressed images.

The compressed images are developed from the original image using the JPEG

compression algorithm at twenty-six different quantization levels⁵⁴.

Table 5-2 Experiment Image Parameters

Pixel dimension (row x column)⁵³:   160 x 120
Image type:                         Monochrome
Grayscale precision:                8-bit (0 to 255)
Display pixel density:              96 ppi (or 20 ppd)
Subimage block size:                8 x 8

5.1.3.1 Compressed Data Size and Compression Ratio

Each of the twenty-six compressed images has a different compressed data size and compression ratio. Figure 5-4 and Figure 5-5 show the plots of the compressed data size⁵⁵ and the compression ratio for all six sets of test objects.

52 The pixel density can also be represented in pixels per visual degree (ppd), assuming the horizontal distance between the viewer and the object is approximately 12 inches.

53 All sets of images have a pixel dimension of 160 x 120, except for image set 'Rose'. The pixel dimension for 'Rose' is 184 x 96.

54 The quantization level is a parameter that defines how much detail will be truncated during the quantization process as described in Figure 2-11. The twenty-six quantization levels used are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22.5, 25, 27.5, 30, 32.5 and 35.

As the quantization factor increases, more information is being

truncated. As a result, the compression achieves a reduction in data size (as shown in

Figure 5-4), which leads to an increase in compression ratio (as shown in Figure 5-5).

As the trade-off for a better compression ratio, the truncation of details yields a deterioration of image quality, as shown in each set of images in Appendix B. In Figure 5-4, the reduction of data size starts to level off at a quantization factor of approximately 10. This indicates that further truncation of details will not provide any significant gain in data reduction. In this document, the range of quantization factors from 0 to 10, which shows a steep slope in the compressed size reduction, is considered the "effective range" of compression.

55 Unlike the measurement suggested in Section 2.2.2, the measurements of the compressed and uncompressed data size are obtained exclusive of the image header. Since the image size is not very large, the image header has a significant effect on the analysis of the data reduction. To allow clear analysis of the data reduction due to compression, the image header is excluded.

Figure 5-4 Plot of Compressed Data Size

Figure 5-5 Plot of Compression Ratio

5.1.4 Subjective Quality Evaluation Data

5.1.4.1 Experiment Subjects

This experiment involved five individual subjects. They each individually performed the

experiment according to the procedures as described in Section 5.1.2. The experiment on

average takes about 1½ hours to complete for each subject.

5.1.4.2 Evaluation Data

The subjective quality evaluation results are recorded as shown in Table B-1 to Table B-6

of Appendix B. The average values of the five subjects are plotted in Figure 5-6 and

summarized in Table 5-3. Each set of data is approximated with a polynomial function

that provides the best fit of the curve.

All six sets of evaluation results indicate a deterioration of image quality as the quantization factor increases. However, as shown in Figure 5-6, the rate of deterioration of quality for each image set is different. For example, the deterioration of 'Lens Cover' is at a much faster rate than that of 'Girl & Apple'.

Figure 5-6 Plot of Subjective Quality Evaluation

Table 5-3 Average Results of Subjective Quality Evaluation

Quantization factor   Clifford   Keys    Girl & Apple   Lens Cover   Rose    Sunglasses
1                     95.20      94.60   95.00          90.40        90.80   94.60
2                     92.40      90.20   83.80          86.60        89.00   81.60
3                     88.80      82.20   85.00          78.40        86.20   75.20
4                     84.60      81.20   82.60          66.80        84.00   65.80
5                     80.60      72.60   79.40          63.00        71.00   59.80
6                     69.00      61.00   74.40          52.60        64.40   50.60
7                     68.00      45.00   67.40          47.60        59.80   47.20
8                     62.80      41.00   58.60          42.20        55.00   44.00
9                     57.60      51.60   57.40          42.80        52.00   35.40
10                    46.20      39.00   54.00          32.80        45.60   38.20
11                    46.80      34.00   53.40          34.00        39.60   29.20
12                    46.20      33.40   48.80          31.60        32.80   34.40
13                    45.80      22.40   43.80          29.60        38.00   32.00
14                    40.60      26.20   41.20          20.00        35.80   24.20
15                    33.00      22.60   36.00          18.00        26.60   23.40
16                    32.80      16.80   34.20          14.00        22.20   18.20
17                    28.60      22.20   31.60          17.00        25.20   23.80
18                    26.80      15.00   32.20          13.00        18.00   15.00
19                    25.80      15.00   24.00          15.00        15.00   17.40
20                    26.00      7.00    28.80          12.00        12.00   7.00
22.5                  22.60      7.00    23.00          12.00        9.00    11.00
25                    16.80      8.00    15.00          4.00         9.00    13.00
27.5                  16.60      6.00    16.00          4.00         4.00    7.00
30                    12.00      6.00    7.00           5.00         4.00    9.00
32.5                  14.00      6.00    12.00          5.00         4.00    4.00
35                    11.00      6.00    8.00           8.00         4.00    4.00

5.1.4.3 Inconsistency of the Evaluation Data

As mentioned in Section 3.1, human evaluation results are very subjective and inconsistent. As shown by Table B-1 to Table B-6, the evaluation of the same image given by each subject can be quite different. The difference is represented by the standard deviation σ, as shown in Figure 5-7. The figure shows that the standard deviation tends to be highest at a quantization factor of approximately 10. Then, the deviation starts to decrease and levels off. This indicates that the human evaluation is least subjective when the image quality is extremely good or extremely bad.

Figure 5-7 Plot of Standard Deviation for Subjective Quality Evaluation

5.1.4.4 Confidence Interval

Since the variance of the HVS evaluation result for the entire population is unknown, the experimental data is statistically assumed to follow the t-distribution⁵⁶ m 8 9 ], instead of the normal distribution. The probability of the t-distribution with a 95% confidence is represented as:

where $\bar{x}$ = Sample mean,

$\mu$ = Population mean,

$t_{\alpha/2}$ = T coefficient of t-distribution,

$\sigma$ = Sample standard deviation, and

$n$ = Sample size.

Equation (5-1) shows that the confidence interval (C.I.) of the distribution is bounded within the range defined as:

$$C.I. = \bar{x} \pm \text{half-width}, \qquad (5\text{-}2)$$

where

$$\text{half-width} = t_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}. \qquad (5\text{-}3)$$

By re-arranging equation (5-3), the minimum sample size (n) required for the result data to fall within a certain confidence interval range is:

$$n = \left( \frac{t_{\alpha/2} \cdot \sigma}{\text{half-width}} \right)^{2} \qquad (5\text{-}4)$$

56 The t-distribution is also known as the "Student t-distribution".

At a 95% confidence level, the $t_{\alpha/2}$ coefficient is 2.776. From the experimental data, the average sample standard deviation is 11.29. Considering a confidence interval with a half-width of 15, the minimum sample size required is 4.36.

The experimental data is obtained and averaged from evaluations performed by five different subjects. Therefore, it is reasonable to conclude that the subjective human evaluation results obtained fall within the 95% confidence interval of the population mean μ.
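The sample-size figure quoted above can be reproduced by rearranging the half-width relation for n. This is a small sketch with hypothetical function and parameter names; the inputs are the values stated in the text (t coefficient 2.776, average sample standard deviation 11.29, half-width 15).

```python
import math


def min_sample_size(t_coeff, sample_std, half_width):
    """Solve half_width = t_coeff * sample_std / sqrt(n) for n."""
    return (t_coeff * sample_std / half_width) ** 2


n_required = min_sample_size(2.776, 11.29, 15.0)
subjects_needed = math.ceil(n_required)  # whole subjects needed to meet the bound
```

Here `n_required` evaluates to roughly 4.37, consistent with the 4.36 quoted above up to rounding, so `subjects_needed` is 5, matching the five subjects used in the tuning experiments.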

5.2 Analysis

5.2.1 Blockiness Index

Using the calculations suggested in Section 4.1.1, the Blockiness Index is presented in

Figure 5-8 for all six sets of images.

The initial portion of the plot increases fairly linearly. Referring to the image sets in

Appendix B (Figure B-1 to Figure B-6), observers will find that the blockiness of the

images with quantization factors 1 to 10 tends to increase fairly gradually. Also, the rate

of increase in blockiness of all six sets of images is very much the same.

Figure 5-8 Plot of Blockiness Index

As the quantization factor increases, the Blockiness Index of different image sets starts to diverge. In Figure 5-8, the directions in which image sets 'Lens Cover' and 'Sunglasses' diverge are quite different. The divergence can be understood by looking at the characteristics of the two sets of images. Comparing the two sets, observers will find that image set 'Sunglasses' is composed of a large area of smooth background. In contrast, the image set 'Lens Cover' has a relatively more complex background. As the quantization factor increases to levels beyond 20, the blockiness becomes less severe in the 'Sunglasses' image set (in comparison to the 'Lens Cover' image set) due to the slow gradient change in the background.

5.2.2 Similarity Index

The Similarity Index is measured using the method suggested in Section 4.1.2. Figure

5-9 shows a plot of the results for all the image sets. As shown in the plot, the Similarity

Index of each image set decreases gradually as the quantization factor increases. This

reduction in the Similarity Index indicates that the details in the image become less

recognizable, as more details are being truncated.

Figure 5-9 Plot of Similarity Index

In addition, Figure 5-9 shows that the Similarity Index of the 'Lens Cover' image set is

much less than that of the other five sets of image. The image set 'Lens Cover' has a

much smaller Similarity Index because the majority of the edge details used to define the

index is the wood grain in the image's background. These edge details are easily

truncated even at a very low quantization factor. Comparing the original image and the

compressed image with quantization factor of 1 in Figure B-4 of Appendix B, observers

will find that most of the fine wood grains in the compressed image are truncated. As a

result, a significant amount (by percentage of total) of edge details is lost. Therefore, the

Similarity Index is much lower in comparison to the other five sets.

5.2.3 POIQE Index

5.2.3.1 Modified Blockiness Index

With the Blockiness Index, the modified Blockiness Index can be calculated using equation (4-9). The results are presented in Figure 5-10.

Figure 5-10 Plot of Modified Blockiness Index

5.2.3.2 Modified Similarity Index

Similarly, the modified Similarity Index is calculated according to equation (4-10) using the Similarity Index. The results are presented in Figure 5-11.

Figure 5-11 Plot of Modified Similarity Index

5.2.3.3 Numerical Analysis for Model Parameter

Now that both the modified Blockiness Index and the modified Similarity Index are

obtained, the next step is to merge the two indexes together in order to generate a POIQE

Index. The POIQE Index generated is expected to closely match the subjective quality

evaluation results obtained from the experiments (as recorded in Figure 5-6 or Table 5-3).

In order to match the POIQE Index to the subjective quality evaluation results as closely as possible, the arbitrary parameter 'a' in equation (4-11) is determined by using the Method of Least Squares⁵⁷. In equation (4-11), the calculation of the POIQE Index is

where a = An arbitrary parameter, P = POIQE Index, $B_m$ = Modified Blockiness Index, and $S_m$ = Modified Similarity Index.

57 For a detailed explanation of the Method of Least Squares, please refer to Chapter 10 of Numerical Mathematics and Computing by Cheney and Kincaid. [CK85]

Let k be the Subjective Quality Evaluation result that the POIQE Index tries to match. Then, the mean square error $\phi$ of the Least Squares Approximation is

where n = Total number of Subjective Evaluation data (i.e., 26 images per set x 6 image sets = 156 images).

According to the Method of Least Squares, $\phi$ is minimized with respect to the parameter 'a' if $\frac{\partial \phi}{\partial a} = 0$. Using equation (5-5),

Then, re-arrange the above to determine 'a'.

Using the averaged Subjective Quality Evaluation ($k_i$), modified Blockiness Index ($B_{m,i}$) and modified Similarity Index ($S_{m,i}$), the numerator and denominator of equation (5-7) are obtained as follows.

Therefore,

By substituting 'a' = 53.346 into equation (4-11), the POIQE Index generated from the

mathematical model proposed in this thesis is obtained as shown in Figure 5-12.
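The least-squares fit of the tuning parameter can be sketched as follows. This is an assumed reading of the procedure rather than a reproduction of equations (5-5) to (5-7): it ignores the clipping of the POIQE Index, models each prediction as $100 - a \, B_{m,i} S_{m,i}$, and sets the derivative of the summed squared error to zero, which gives $a = \sum_i (100 - k_i) B_{m,i} S_{m,i} \, / \, \sum_i (B_{m,i} S_{m,i})^2$. The function name and the sample numbers in the usage lines are illustrative only.

```python
def fit_tuning_parameter(k, b_mod, s_mod):
    """Least-squares estimate of 'a' for the model P_i = 100 - a * B_i * S_i.

    k: subjective quality ratings; b_mod, s_mod: modified indexes per image.
    Assumption: clipping of P to [5, 100] is ignored during the fit.
    """
    products = [b * s for b, s in zip(b_mod, s_mod)]
    numerator = sum((100.0 - ki) * c for ki, c in zip(k, products))
    denominator = sum(c * c for c in products)
    return numerator / denominator


# Illustrative data generated exactly from a = 50, so the fit recovers about 50.
b_example = [0.5, 1.0, 1.5]
s_example = [1.0, 0.8, 0.6]
k_example = [75.0, 60.0, 55.0]
a_fit = fit_tuning_parameter(k_example, b_example, s_example)
```

Applied to the 156 averaged ratings and their modified indexes, a fit of this kind would be expected to land near the reported value of 53.346, although that has not been verified here.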

Figure 5-12 POIQE Index

Figure 5-13 to Figure 5-18 show the plots of the POIQE Index obtained for each image set in comparison to the Subjective Quality Evaluation results⁵⁸. In each plot, the data representations are as follows.

1) The Subjective Quality Evaluation results obtained from the experiments are represented by markers.

2) The thin line, indicated as "Poly (Human Evaluation)" in the legend, is the best-fit polynomial⁵⁹ of the Subjective Quality Evaluation results generated by Microsoft EXCEL using a sixth-order polynomial.

3) The POIQE Index generated from the mathematical model proposed in this thesis is represented as a thick line in the plot.

58 Please note that the plots in Figure 5-13 to Figure 5-18 are not smoothed because the model is designed so that it can also evaluate the quality of a standalone image, where the image quality ratings of the previous and following images are unknown. Also, these figures are not plotted with respect to their compressed data size for better data presentation. As shown in Figure 5-4, the compressed data size levels off beyond a quantization level of 10. If the figures were plotted with respect to their compressed data size, the data with low compressed data size (or quantization level of 10 or higher) would all collapse together.

59 The purpose of this best-fit polynomial is to smooth the experimental result with a continuous line, so that the reader can use this line to visually determine how closely the POIQE Index matches the experimental results.

Figure 5-15 POIQE Index for 'Girl & Apple'

Figure 5-16 POIQE Index for 'Lens Cover'

Figure 5-17 POIQE Index for 'Rose'

Figure 5-18 POIQE Index for 'Sunglasses'

5.2.4 Errors and Observations

5.2.4.1 Mean Square Error $\phi$ and Root Mean Square Error

With $\phi$ as shown in equation (5-5), the root mean square error (root MSE) $\sqrt{\phi}$ of the results generated by the proposed mathematical model is approximately 10.6 units.

The POIQE Index of all six sets of images provides a fairly close match to the Subjective Quality Evaluation results, except for image set "Sunglasses" (Figure 5-18). Table 5-4 summarizes the MSE and root MSE of all image sets, all image sets excluding "Sunglasses", and the "Sunglasses" image set alone. Excluding the image set "Sunglasses", the POIQE Index has a root MSE of 7.4 units.

Table 5-4 Mean Square Error & Root Mean Square Error

                                                                    MSE (units sq.)   root MSE (units)
All image sets                                                      111               10.6
All image sets (except "Sunglasses")                                55                7.4
Just the "Sunglasses" image set                                     394               19.8
All image sets of quantization factor 1-10 (except "Sunglasses")    37                6.1
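The error measures in Table 5-4 follow the usual definitions: the MSE is the mean of the squared differences between the POIQE Index and the subjective rating, and the root MSE is its square root. A minimal sketch with hypothetical function names:

```python
import math


def mean_square_error(predicted, observed):
    """Mean of squared differences between model output and subjective ratings."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum((p - k) ** 2 for p, k in zip(predicted, observed)) / len(observed)


def root_mean_square_error(predicted, observed):
    """Square root of the MSE, expressed in rating units."""
    return math.sqrt(mean_square_error(predicted, observed))
```

As a consistency check on the table, the square roots of the tabulated MSE values 55, 394 and 37 round to 7.4, 19.8 and 6.1 respectively.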

5.2.4.2 Observations

Based on the above analysis, two observations were made.

1) The POIQE Index is most accurate when the data compression is at its

"effective range".

As indicated by Figure 5-4, the compression data size of the images is most

significantly reduced at a quantization factor of 10 or below. In Section 5.1.3,

the range with quantization factor below 10 is defined as the "effective range".

Any further detail truncation with a higher quantization factor achieves no

significant data reduction, while the image quality

continues to deteriorate. For industrial applications, there is no reason to further

compress the image if there is no significant reduction in data size. Therefore,

the evaluation of the performance of the proposed POIQE should be focused at

the "effective range".

In Figure 5-13 to Figure 5-17, the POIQE Index provides a very close match to

the Subjective Quality Evaluation at quantization factor of 10 or below. The root

MSE of the images (except image set "Sunglasses"), from quantization factor 1

to 10 is approximately 6.1 units as shown in Table 5-4. As the quantization

factor increases beyond 10, the POIQE Index begins to deviate from the

Subjective Quality Evaluation.

Combining the above, it is reasonable to say that the proposed POIQE Index

provides the most accurate match with the Subjective Quality Evaluation, when

the data compression is at its "effective range".

2) The POIQE Index provides a good match to the Subjective Quality

Evaluation, except when the image contains only very simple contents.

The POIQE Index provides a reasonably close match to the Subjective Quality

Evaluation for image sets "Clifford", "Keys", "Girl & Apple", "Lens Cover" and

"Rose" (Figure 5-13 to Figure 5-17). However, the POIQE Index for image set

"Sunglasses" (Figure 5-18) is relatively higher than the Subjective Quality

Evaluation. In other words, the proposed POIQE rates the "Sunglasses" image set

at a higher quality rating than the subjective human evaluation. The reason for this is

explained as follows.

From the image set "Sunglasses", it is obvious that the image contains only very

simple details. The majority of the area in the image is occupied by a plain smooth

background, and the image contains details of very simple outlines and features.

Therefore, there is relatively less information for the viewer to use in recognizing the

details. As a result, a very small amount of information truncation will cause the

image to become unrecognizable. For the same reason, images with simple details

are more likely to be rated down by the human subjective evaluation, even if only a very

small percentage of details has been truncated.

5.3 Validation of Model

5.3.1 Purpose

The purpose of the validation experiment is to verify the capability of the proposed

mathematical model in matching the Subjective Quality Evaluation of image sets

other than the test image sets. Also, it allows further investigations of the two

observations obtained in Section 5.2.4.

5.3.2 Method and Procedure

The validation experiments are performed by four subjects. Using the same procedures

documented in Section 5.1.2, the subjects are requested to provide quality evaluation for

five sets of images as shown in Figure B-7 to Figure B-11. Note that the subjects and

image sets in the validation experiment are completely different from those in the model

tuning experiments.

5.3.3 Validation Result

5.3.3.1 Evaluation Data

The subjective quality evaluation obtained from the validation experiments is shown in

Table B-7 to Table B-11. Using the model, the calculated results for Blockiness Index,

Similarity Index, Modified Blockiness Index, Modified Similarity Index and POIQE

Index are summarized in Table B-12 to Table B-16 respectively.

Figure 5-19 to Figure 5-23 show the plots of the POIQE Index obtained for each image

set in comparison to the Subjective Quality Evaluation results. In each plot, the data

representations are the same as indicated in Section 5.2.3.

Figure 5-19 POIQE Index for 'Bus'

Figure 5-20 POIQE Index for 'Cars'

Figure 5-21 POIQE Index for 'Computer'

Figure 5-22 POIQE Index for 'Table'

Figure 5-23 POIQE Index for 'Mouse'

5.3.3.2 Observations

The validation shows a good match between the POIQE Index and the Subjective

Quality Evaluation. Also, the validation results further demonstrate findings

observed in Section 5.2.4.

Figure 5-19 to Figure 5-22 show that the POIQE Index is most accurate when the data

compression is within its "effective range". The POIQE Index provides a very close

match to the Subjective Quality Evaluation at a quantization factor of 10 or below.
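One way to put a number on the closeness of this match is the mean absolute difference between the POIQE Index and the average subjective rating over the effective range. The sketch below is only an illustration: the choice of metric and the function name are assumptions, and the comparison in this chapter is made graphically. The values are copied from Table B-16 and Table B-7 for the 'Bus' image set at quantization factors 1 to 10.

```python
# POIQE Index (Table B-16) and average subjective rating (Table B-7)
# for the 'Bus' image set, quantization factors 1 to 10.
poiqe_bus = [100, 97, 89, 83, 79, 74, 76, 65, 67, 63]
subjective_bus = [95.25, 93.25, 84.75, 83.25, 80.75, 65, 62.25, 60, 57.75, 38.25]


def mean_absolute_difference(predicted, observed):
    """Average of |predicted - observed| over paired ratings.

    This metric is an illustrative assumption, not the thesis's own measure.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


mad_bus = mean_absolute_difference(poiqe_bus, subjective_bus)
print(f"Mean absolute difference for 'Bus', QF 1-10: {mad_bus:.2f}")
```

The same function can be applied to the other validation sets by substituting the corresponding rows of Table B-16 and the Average rows of Table B-8 to Table B-11.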

Also, the results indicate that the POIQE Index provides a good match to the

Subjective Quality Evaluation, except for images containing only very simple

contents. Similar to image set "Sunglasses", image set "Mouse" contains only very

simple details. As a result, the Subjective Quality Evaluation of image set "Mouse"

was lower than the POIQE Index as shown in Figure 5-23.

Chapter 6 Conclusion

In this thesis, a mathematical model for objective image quality evaluation was proposed.

This model was designed in accordance with the hypothesis described in Chapter 1,

which is that the HVS evaluates image quality based on patterned artifacts and

recognizable details of the image. The model measures the image quality of DCT-based

compressed images using the psychovisually-based indexes of blockiness and similarity,

and combines these to yield the so-called POIQE index.

The model was calibrated by using the subjective assessment results obtained by the

model tuning experiments. Then, the performance of the model was verified by a series

of validation experiments. The validation results demonstrated that the POIQE Index is

most accurate within the "effective range" of compression for most images. However,

the model does not perform as well for images containing only very simple contents.

The validation results obtained indicate that the model has its limitations. Firstly, the

model is not very accurate when evaluating images with simple contents. This is mainly

because the human vision system is very complex. The use of only two fidelity criteria in

designing a model to simulate the quality assessment result of the human vision system

may not be sufficient.

Secondly, the model is not designed to account for users' subjective expectation or

definition of image quality. It is designed to measure how closely a test image matches

the original (with respect to HVS-perceived detail). Therefore, a test image that is less

noisy than the original is penalized by the POIQE (since it does not match), whereas

other definitions of quality might consider it better than the original.

Thirdly, for simplicity, a simple model was used to combine the modified Blockiness and

modified Similarity indexes. A model with additional tuning parameters might give better

accuracy. (For example, $P = aB_m + bS_m + cB_mS_m$, where a, b, and c are tuning

parameters, and $B_m$ and $S_m$ are the modified Blockiness and modified Similarity indexes.)
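As a rough illustration of how such tuning parameters could be estimated, the sketch below fits a, b, and c by ordinary least squares. Everything in it is an assumption made for illustration: the function names, the use of least squares, and the synthetic data (generated from known parameters so that the fit can be checked). It is not the tuning procedure used in this thesis.

```python
def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [val] for row, val in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_combination(b_mod, s_mod, target):
    """Least-squares fit of target ~ a*B + b*S + c*B*S (no intercept)."""
    rows = [(b, s, b * s) for b, s in zip(b_mod, s_mod)]
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic example (hypothetical values, not thesis data).
true_params = (-30.0, 25.0, -4.0)
b_values = [0.0, 0.2, 0.5, 0.8, 1.0]
s_values = [1.5, 2.0, 2.4, 3.1, 3.5]
targets = [
    true_params[0] * b + true_params[1] * s + true_params[2] * b * s
    for b, s in zip(b_values, s_values)
]
fitted = fit_combination(b_values, s_values, targets)
print(fitted)
```

With real data, the targets would be the average subjective ratings from the tuning experiments, and the fit would only be approximate.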

Finally, the human contrast sensitivity is highly dependent on the variation of spatial

frequency. Therefore, the size of the blocks within the pattern artifact could have a

significant effect on the assessment results. However, only one block size was used in

both the tuning and validation experiments.

In summary, the proposed model provides image quality evaluation according to how the

human vision system assesses an image. The model provides accurate evaluations of

JPEG compressed images that contain enough detail.

Further research is suggested to consider the introduction of an additional index to

measure the complexity of the image, so that the POIQE model can extend its image

evaluation capability to images with very simple contents. In addition, future work could

also investigate the effectiveness of using additional tuning parameters in the equation

that computes the POIQE index from the modified Blockiness and modified Similarity

indexes.

References

[AB85] E. H. Adelson and J. R. Bergen, "Spatiotemporal models and the perception of motion", Journal of the Optical Society of America A, 2(2): 284-95, Feb. 1985

[AC72] J. W. Allnatt and J. M. Corbett, "Adaptation in observers during television quality-grading tests", Ergonomics, 15: 353-356, 1972

[Ade93] E. H. Adelson, "Perceptual organization and the judgment of brightness", Science, 262: 2042-2044, 1993

[AG87] L. Arend and R. Goldstein, "Simultaneous constancy, lightness, and brightness", Journal of the Optical Society of America A, 4(12): 2281-5, 1987

[Bar58] H. B. Barlow, "Temporal and spatial summation in human vision at different background intensities", Journal of Physiology, 141: 337-50, 1958

[BFLSV97] L. Boch, S. Fragola, R. Lancini, P. Sunna and M. Visca, "Motion detection on video compressed sequences as a tool to correlate objective measure and subjective score", 13th International Conference on Digital Signal Processing, DSP v. 2, p. 1119-1122, Jul. 2-4 1997

[BK98] C. J. van den Branden Lambrecht and M. Kunt, "Characterization of human visual sensitivity for video imaging application", Signal Processing, 67: 255-269, 1998


C. J. van den Branden Lambrecht, "A working spatio-temporal model of the human visual system for image restoration and quality assessment applications", Proc. International Conference on Acoustics, Speech and Signal Processing, Atlanta, GA, 7-10 May 1996

F. W. Campbell and R. W. Gubisch, "Optical Quality of the Human eye", Journal of Physiology, 186: 558-578, 1966

Ward Cheney and David Kincaid, Numerical Mathematics and Computing, Brooks/Cole Publishing Company, Belmont, California, 1985.

Tom N. Cornsweet, Visual Perception, Harcourt Brace Jovanovich, Inc., Orlando, Florida, 1970

Russell L. De Valois and Karen K. De Valois, Spatial Vision, Oxford University Press, New York, 1988

H. De Lange Dzn, "Research into the Dynamic Nature of the Human Fovea-Cortex Systems with intermittent and modulated light", Journal of the Optical Society of America, 48(11): 777-84, 1958

Ralph Merrill Evans, An introduction to color, New York: Wiley, 1948.

G. L. Fredendall and W. L. Behrend, "Picture Quality - Procedures for Evaluating Subjective Effects of Interference", Proc. IRE, Vol. 48, pp. 988-998, 1960

P. N. Gardiner, M. Ghanbari, D. E. Pearson, K. T. Tan, "Development of a perceptual distortion meter for digital video", IEE Conference Publication Proceedings of the 1997 International Broadcasting Convention, n447, v. 1, pp. 493-497, Amsterdam, 1997

B. Girod, "Eye movements and coding of video sequences", SPIE Visual Communications and Image Processing, volume 1001, pages 398-405, 1988

Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, Addison-Wesley Publishing Company, 1992.

[HPN97] Barry G. Haskell, Atul Puri and Arun N. Netravali, Digital video: An introduction to MPEG-2, Chapman & Hall, New York, NY, 1997.

Y. Horita, M. Katayama, T. Murai, M. Miyahara, "Objective picture quality scale for video coding", International Conference on Information Processing ICIP-96, Lausanne, Switzerland, Vol. 3, pp. 319-322, 16-19 September 1996

D. J. Heeger and P. C. Teo, "A model of perceptual image fidelity", Proc. International Conf. on Image Processing, Washington, DC, pp. 343-345, 23-26 October 1995

D. H. Kelly, "Visual responses to time-dependent stimuli. I. amplitude sensitivity measurements", Journal of the Optical Society of America, 51: 422-9, 1961

D. H. Kelly, "Flicker Fusion and Harmonic Analysis", Journal of the Optical Society of America, 51(8): 917-8, 1961

J. Lubin, "A visual discrimination model for image system design and evaluation", E. Peli (Ed.), Visual Models for Target Detection and Recognition, World Scientific Publishers, Singapore, 1995

J. Lubin, "Human Vision System Model For Objective Picture Quality Measurements", IEE Conference Publication Proceedings of the 1997 International Broadcasting Convention, n447, p. 498-503, Amsterdam, 1997

Makoto Miyahara, Kazunori Kotani and V. Ralph Algazi, "Objective picture quality scale (PQS) for image coding", IEEE Transactions on Communications, v. 46 no. 9 p. 1215-26, Sept. 1998

Joan L. Mitchell, William B. Pennebaker, Chad E. Fogg and Didier J. LeGall, MPEG Video Compression Standard, Chapman & Hall, New York, NY, 1996.

K. T. Mullen, "The contrast sensitivity of human color vision to Red-Green and Blue-Yellow chromatic gratings", Journal of Physiology, 359: 381-400, 1985

F. I. Van Ness and M. A. Bouman, "Spatial Modulation Transfer in Human Eye", Journal of the Optical Society of America, 57(3): 401-6, Mar. 1967

M. R. M. Nijenhuis and F. I. J. Blommaert, "Perceptual-error measure and its application to sampled and interpolated single-edged images", Journal of the Optical Society of America A, 14(9): 2111-27, Sept. 1997

[NH95] A. N. Netravali and B. G. Haskell, Digital Pictures: Representation, Compression and Standards, 2nd Ed., Plenum Press, New York, 1995, p. 110-116

3. Okamoto, S. Hangai, K. Miyauchi, "A study on subjective and objective evaluation method for coded moving picture quality", Picture Coding Symposium PCS'96, pp. 519-523, Melbourne, 13-15 March 1996

W. G. Owen, "Spatial-temporal integration in the human peripheral retina", Vision Research, 12(1): 1011-26, 1972

Maurice Henri Leonard Pirenne, Vision and the eye (2nd ed.), 1967

J. G. Robson, "Spatial and temporal contrast sensitivity functions of the visual system", Journal of the Optical Society of America, 56: 1141-2, 1966

B. E. Rogowitz, "The Human Visual System: A Guide for the Display Technologist", Proc. SID, 24(3): 235-52, 1983

[Rog92] Bernice E. Rogowitz, "Displays: the human factor", Byte, v. 17 p. 195-9, July '92

[Sch56] O. H. Schade, "Optical and Photoelectric Analog of the Eye", Journal of the Optical Society of America, 46(9): 721-39, 1956

[Ste58] Kate Trauman Steinitz (editor), Leonardo da Vinci's Trattato della pittura (Treatise on painting), 1958

[SZY87] Francis W. Sears, Mark W. Zemansky and Hugh D. Young, University Physics, 7th ed., Addison-Wesley Publishing Company, 1987.

[TGP97] K. T. Tan, M. Ghanbari, D. E. Pearson, "A video distortion meter", PCS '97, pp. 119-122, Berlin, Germany, 10-12 September 1997

[TGP98] K. T. Tan, M. Ghanbari, D. E. Pearson, "An objective measurement tool for MPEG video quality", Signal Processing, Vol. 70 No. 3 p. 279-94, Nov. '98

[TH94] P. C. Teo and D. J. Heeger, "Perceptual image distortion", Proc. International Conf. on Image Processing, pp. 982-6, Austin, TX, 13-16 November 1994

A. Vassilev, "Contrast sensitivity near borders: Significance of test stimulus form, size, and duration", Vision Research, 13(4): 719-30, April 1973

Brian A. Wandell, Foundations of Vision, Sinauer Associates, Inc., Sunderland, Massachusetts, 1995

Andrew B. Watson and Albert J. Ahumada, Jr., "Model of human visual-motion sensing", Journal of the Optical Society of America A, 2(2): 322-341, Feb. 1985

Ronald E. Walpole and Raymond H. Myers, Probability and Statistics for Engineers and Scientists, 4th edition, New York, 1989

[WS91] Nicholas J. Wade and Michael Swanston, Visual Perception: An Introduction, Routledge, London, 1991

Appendix A Lossless Image Compression Statistics

The following are some statistics on lossless compression provided by the company BitJazz Inc. The company provided a series of comparisons (see footnote 60) between its lossless image-compression technique, BitJazz (symbol: JZZ), and other conventional formats, such as BMP and PCX.

The legend below provides a list of acronyms used in the tables. Table A-1 shows a comparison of features between each of these compression techniques. Please note that all the compared techniques are lossless, except JPG.

Legend:

EPS Photoshop Encapsulated Post Script. These files were made with binary encoding.

SCT Scitex CT files.

PXR Pixar Image Computer files.

PS2 Photoshop 2 files.

BMP MS-DOS Bitmap files.

TGA Truevision Targa files.

RAW These files have a 0-byte header.

PCX ZSoft PC Exchange files.

PSD Native Photoshop 5 files.

PCT Macintosh PICT files.

IFF Amiga Interchange File Format.

PDF Photoshop Portable Document Format. These files were made with ZIP (a form of LZW) compression.

TIF Tagged Image File Format. These files were made with LZW (Lempel/Ziv/Welch) compression.

PNG Portable Network Graphics. These files were made with no interlace and with an adaptive filter. The PNG file size randomly varies within a couple hundred bytes.

LWF LuraWave, available from LuRaTech. These files were made with baseline lossless compression and without a key.

JZZ PhotoJazz files.

60 The comparison in this section is obtained from the web sites http://www.bitjazz.com/analysis.html and http://www.bitjazz.com/statistics.html.

Table A-1 Feature Comparison of Various Image Formats

Table A-2 and Table A-3 show the compression ratios resulting from using different compression techniques on a set of 24 photo-quality images donated to research by Kodak. Each of the 768x512 images is represented in RGB color map (see footnote 61). Please note that the compression ratio of lossless compression seldom exceeds 2.6:1.
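The uncompressed size quoted for these images, and the meaning of a compression ratio, can be checked with a few lines of arithmetic. The sketch below assumes 3 x 8 bits per RGB pixel, as stated in the footnote; the function names are illustrative only.

```python
def uncompressed_size_bytes(width, height, channels=3, bits_per_channel=8):
    """Raw image size in bytes, assuming no header and no padding."""
    return width * height * channels * bits_per_channel // 8


def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Ratio of original size to compressed size; values above 1 mean savings."""
    return uncompressed_bytes / compressed_bytes


size = uncompressed_size_bytes(768, 512)
print(f"Uncompressed size: {size} bytes ({size / 1e6:.2f} Mbytes)")
```

A format whose files are larger than the raw data, for example because of headers, has a compression ratio below 1.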

[Table A-1: the feature matrix is not recoverable from the source. Formats compared, as far as legible: PSD, PS2, BMP, DCS, EPS, CTF, IFF, JPG, JZZ, PCT, PCX, PDF, PNG, PXR, RAW, SCT, TGA, TIF. Features compared include RGB, CMYK, Grayscale, Lab, Multichannel, Spot, Compression, Lossless, Non-image Data, CRC and Layers. Compression methods listed include RLE, LZW, LZ77 and JZZ.]

61 The uncompressed data size is approximately 1.18 Mbytes, assuming each RGB pixel requires 3 x 8 bits of storage space.

Table A-2 Compression Ratio of Various Image Formats

[Table A-2: the table body is not recoverable from the source. Surviving column headings include PS2, BMP, TGA, LWF, size and ratio.]

Table A-3 Summary of Compression Ratio

Format   Compression ratio
EPS      0.748
SCT      0.998
PXR      0.999
RAW      1.00
PS2      1.00
BMP      1.00
TGA      1.00
PCX      1.02
PSD      1.03
PCT      1.03
IFF      1.21
PDF      1.50
TIF      1.58
PNG      1.75
JZZ      2.47

The formats are ordered from worst to best; RAW is the uncompressed reference.

Appendix B Tuning and Validation Experiments' Image Sets

[Figure B-1 to Figure B-11: the image panels are not reproduced. Each figure shows the original image of one set together with its compressed versions, labeled by quantization factor (1 to 20 in steps of 1, then 22.5 to 35.0 in steps of 2.5). Captions that survive in the source: Figure B-3 'Girl & Apple' Images; Figure B-4 'Lens Cover' Images; Figure B-5 'Rose' Images; Figure B-8 'Cars' Images; Figure B-11 'Mouse' Images.]

Table B-1 Table of Subjective Quality Evaluation for 'Clifford'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ruby (18-Aug-99): 98 94 93 88 83 68 75 68 60 53 57 53 50 42 40 40 45 41 45 40 38 29 28 20 20 20
Ryan (19-Aug-99): 95 86 84 82 75 56 52 48 46 34 36 36 32 25 25 25 15 15 15 25 15 10 10 10 10 10
Alice (23-Aug-99): 99 95 90 85 89 79 75 80 58 39 60 54 50 44 45 29 33 31 29 25 20 15 15 10 10 5
Mark (07-Sep-99): 94 97 87 82 78 77 75 68 72 65 42 55 62 58 40 55 35 32 25 25 25 15 15 5 15 5
Nandi (07-Sep-99): 90 90 90 86 78 65 63 50 52 40 39 33 35 34 15 15 15 15 15 15 15 15 15 15 15 15
Average: 95.20 92.40 88.80 84.60 80.60 69.00 68.00 62.80 57.60 46.20 46.80 46.20 45.80 40.60 33.00 32.80 28.60 26.80 25.80 26.00 22.60 16.80 16.60 12.00 14.00 11.00
Standard Deviation: 3.56 4.39 3.42 2.61 5.50 9.35 10.34 13.54 9.74 12.64 10.94 10.76 12.30 12.28 12.55 15.30 13.22 11.45 12.38 8.94 9.56 7.16 6.73 5.70 4.18 6.52
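The Average and Standard Deviation entries in these tables appear to be the arithmetic mean and the sample standard deviation (dividing by n - 1) of the subjects' ratings at each quantization factor. The short check below uses the five ratings for 'Clifford' at a quantization factor of 1; the variable names are illustrative.

```python
import statistics

# Ratings for 'Clifford' at quantization factor 1 (Table B-1).
clifford_qf1 = {"Ruby": 98, "Ryan": 95, "Alice": 99, "Mark": 94, "Nandi": 90}

ratings = list(clifford_qf1.values())
average = statistics.mean(ratings)
# statistics.stdev computes the sample standard deviation (n - 1 denominator).
spread = statistics.stdev(ratings)

print(f"Average: {average:.2f}, standard deviation: {spread:.2f}")
```

The rounded results agree with the first column of the Average and Standard Deviation rows of Table B-1.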

Table B-2 Table of Subjective Quality Evaluation for 'Keys'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ruby (18-Aug-99): 98 93 88 80 75 56 51 46 40 30 30 30 10 10 38 10 28 10 10 10 0 10 0 0 0 0
Ryan (19-Aug-99): 95 92 82 75 72 56 52 42 46 25 25 25 15 15 5 10 10 10 10 5 5 5 5 5 5 5
Alice (23-Aug-99): 99 90 95 82 80 75 49 45 70 60 55 50 40 44 30 34 33 15 15 10 10 5 5 5 5 5
Mark (07-Sep-99): 95 91 93 86 83 83 38 37 72 65 45 35 32 33 25 25 25 25 25 5 15 15 15 15 15 15
Nandi (07-Sep-99): 86 85 53 83 53 35 35 35 30 15 15 27 15 29 15 5 15 15 15 5 5 5 5 5 5 5
Average: 94.60 90.20 82.20 81.20 72.60 61.00 45.00 41.00 51.60 39.00 34.00 33.40 22.40 26.20 22.60 16.80 22.20 15.00 15.00 7.00 7.00 8.00 6.00 6.00 6.00 6.00
Standard Deviation: 5.13 3.11 17.08 4.09 11.76 18.75 7.91 4.85 18.62 22.19 15.97 10.01 12.90 13.77 12.90 12.19 9.47 6.12 6.12 2.74 5.70 4.47 5.48 5.48 5.48 5.48

Table B-3 Table of Subjective Quality Evaluation for 'Girl & Apple'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ruby (18-Aug-99): 97 95 93 90 89 85 87 60 66 68 70 55 56 58 51 46 42 40 30 45 20 10 20 10 10 10
Ryan (19-Aug-99): 95 90 80 78 75 72 55 46 42 40 34 40 34 25 15 15 15 15 10 10 10 5 5 5 5 5
Alice (23-Aug-99): 99 95 90 85 80 81 75 79 70 71 71 60 57 55 54 50 44 45 40 39 35 30 25 10 25 15
Mark (07-Sep-99): 97 94 86 84 77 73 70 65 67 55 65 62 46 42 35 35 32 36 25 25 25 15 15 5 15 5
Nandi (07-Sep-99): 87 45 76 76 76 61 50 43 42 36 27 27 26 26 25 25 25 25 15 25 25 15 15 5 5 5
Average: 95.00 83.80 85.00 82.60 79.40 74.40 67.40 58.60 57.40 54.00 53.40 48.80 43.80 41.20 36.00 34.20 31.60 32.20 24.00 28.80 23.00 15.00 16.00 7.00 12.00 8.00
Standard Deviation: 4.69 21.79 7.00 5.64 5.68 9.26 15.04 14.67 14.14 15.86 21.17 14.92 13.61 15.55 16.67 14.52 12.05 12.11 11.94 13.68 9.08 9.35 7.42 2.74 8.37 4.47

Table B-4 Table of Subjective Quality Evaluation for 'Lens Cover'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ruby (18-Aug-99): 98 90 86 75 67 68 66 46 63 45 43 43 43 10 10 0 10 0 10 10 10 0 0 0 0 10
Ryan (19-Aug-99): 87 84 75 50 46 30 30 25 30 25 25 15 15 15 15 10 10 10 10 10 15 5 5 5 5 5
Alice (23-Aug-99): 99 95 90 85 80 75 60 70 50 55 49 45 40 40 35 30 35 35 35 20 15 5 5 10 10 5
Mark (07-Sep-99): 88 85 81 75 74 65 55 55 45 34 38 40 35 30 25 25 25 15 15 15 15 5 5 5 5 5
Nandi (07-Sep-99): 80 79 60 49 48 25 27 15 26 5 15 15 15 5 5 5 5 5 5 5 5 5 5 5 5 15
Average: 90.40 86.60 78.40 66.80 63.00 52.60 47.60 42.20 42.80 32.80 34.00 31.60 29.60 20.00 18.00 14.00 17.00 13.00 15.00 12.00 12.00 4.00 4.00 5.00 5.00 8.00
Standard Deviation: 8.02 6.11 11.72 16.32 15.33 23.27 17.90 22.29 15.09 19.21 13.82 15.26 13.63 14.58 12.04 12.94 12.55 13.51 11.73 5.70 4.47 2.24 2.24 3.54 3.54 4.47

Table B-5 Table of Subjective Quality Evaluation for 'Rose'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ruby (18-Aug-99): 98 96 95 86 83 80 59 68 65 50 55 50 45 45 31 31 30 20 20 10 10 10 0 0 0 0
Ryan (19-Aug-99): 82 82 75 79 52 40 42 40 25 25 10 25 25 10 15 15 15 15 10 5 5 5 5 5 5 5
Alice (23-Aug-99): 99 95 90 85 83 75 70 57 60 50 45 29 43 35 30 15 25 15 15 15 10 10 5 5 5 5
Mark (07-Sep-99): 95 95 94 94 93 85 85 74 75 68 73 45 62 55 42 35 41 25 25 25 15 15 5 5 5 5
Nandi (07-Sep-99): 80 77 77 76 44 42 43 36 35 35 15 15 15 34 15 15 15 15 5 5 5 5 5 5 5 5
Average: 90.80 89.00 86.20 84.00 71.00 64.40 59.80 55.00 52.00 45.60 39.60 32.80 38.00 35.80 26.60 22.20 25.20 18.00 15.00 12.00 9.00 9.00 4.00 4.00 4.00 4.00
Standard Deviation: 9.09 8.86 9.52 6.96 21.58 21.66 18.29 16.73 21.10 16.41 26.75 14.46 18.36 16.75 11.59 9.96 10.96 4.47 7.91 8.37 4.18 4.18 2.24 2.24 2.24 2.24

Table B-6 Table of Subjective Quality Evaluation for 'Sunglasses'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ruby (18-Aug-99): 96 93 86 67 67 57 57 40 40 45 30 45 45 28 10 10 45 30 0 0 10 10 0 0 0 0
Ryan (19-Aug-99): 95 78 75 58 36 34 34 25 25 25 15 15 15 10 15 15 5 5 10 10 5 5 5 5 5 5
Alice (23-Aug-99): 99 95 90 80 75 61 50 55 45 44 38 43 40 33 35 25 29 10 30 5 10 15 10 10 5 5
Mark (07-Sep-99): 93 87 85 85 83 72 65 70 62 62 58 64 55 45 52 36 35 25 42 15 25 30 15 25 5 5
Nandi (07-Sep-99): 90 55 40 39 38 29 30 30 5 15 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
Average: 94.60 81.60 75.20 65.80 59.80 50.60 47.20 44.00 35.40 38.20 29.20 34.40 32.00 24.20 23.40 18.20 23.80 15.00 17.40 7.00 11.00 13.00 7.00 9.00 4.00 4.00
Standard Deviation: 3.36 16.27 20.44 18.38 21.58 18.37 14.92 18.51 21.52 18.43 20.58 24.00 21.10 16.54 19.63 12.40 18.09 11.73 17.85 5.70 8.22 10.37 5.70 9.62 2.24 2.24

Table B-7 Table of Subjective Quality Evaluation for 'Bus'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ming (4-Dec-99): 98 95 75 85 83 65 63 55 55 47 43 45 37 35 33 25 25 25 25 15 15 5 5 5 5 5
Ivy (2-Dec-99): 89 95 85 84 84 73 74 74 66 41 52 41 41 41 41 41 41 41 41 25 5 5 5 5 5 5
Vicky (29-Jan-00): 99 98 97 88 86 73 63 71 72 25 62 51 60 50 25 25 15 15 25 25 15 15 5 5 5 5
Barb (8-Feb-00): 95 85 82 76 70 49 49 40 38 40 38 27 38 22 27 22 22 22 22 7 7 7 7 3 3 3
Average: 95.25 93.25 84.75 83.25 80.75 65 62.25 60 57.75 38.25 48.75 41 44 37 31.5 28.25 25.75 25.75 28.25 18 10.5 8 5.5 4.5 4.5 4.5
Standard Deviation: 4.50 5.68 9.18 5.12 7.27 11.31 10.24 15.73 14.93 9.36 10.56 10.20 10.80 11.75 7.19 8.62 11.00 11.00 8.62 8.72 5.26 4.76 1.00 1.00 1.00 1.00

Table B-8 Table of Subjective Quality Evaluation for 'Cars'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ming (4-Dec-99): 98 88 85 78 75 65 55 48 45 43 35 38 35 25 25 33 25 25 25 15 15 15 15 5 5 5
Ivy (2-Dec-99): 88 71 87 71 72 43 44 35 35 15 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
Vicky (29-Jan-00): 96 94 86 85 84 81 44 42 36 35 31 34 40 33 25 32 25 25 25 15 15 15 5 5 5 5
Barb (8-Feb-00): 95 86 76 82 70 70 49 40 40 29 29 29 22 22 22 25 25 15 22 7 7 7 3 3 3 3
Average: 94.25 84.75 83.5 79 75.25 64.75 48 41.25 39 30.5 25 26.5 25.5 21.25 19.25 23.75 20 17.5 19.25 10.5 10.5 10.5 7 4.5 4.5 4.5
Standard Deviation: 4.35 9.78 5.07 6.06 6.18 15.97 5.23 5.38 4.55 11.82 13.56 14.80 15.63 11.79 9.60 13.00 10.00 9.57 9.60 5.26 5.26 5.26 5.42 1.00 1.00 1.00

Table B-9 Table of Subjective Quality Evaluation for 'Computer'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ming (4-Dec-99): 98 95 88 85 78 75 65 53 55 48 63 45 38 45 38 45 33 35 35 25 33 25 15 15 15 15
Ivy (2-Dec-99): 96 99 83 93 83 81 76 65 56 62 59 56 57 62 60 31 31 25 41 31 39 36 31 25 25 25
Vicky (29-Jan-00): 96 95 93 94 87 75 82 84 88 76 79 74 78 39 56 77 65 67 52 55 54 53 38 15 15 15
Barb (8-Feb-00): 95 95 76 76 85 71 66 62 58 54 56 52 56 42 45 42 45 42 42 42 42 30 32 32 27 27
Average: 96.25 96 85 87 83.25 75.5 72.25 66 64.25 60 64.25 56.75 57.25 47 49.75 48.75 43.5 42.25 42.5 38.25 42 36 29 21.75 20.5 20.5
Standard Deviation: 1.26 2.00 7.26 8.37 3.86 4.12 8.18 13.04 15.88 12.11 10.24 12.37 16.36 10.30 10.08 19.77 15.61 17.91 7.05 13.20 8.83 12.19 9.83 8.30 6.40 6.40

Table B-10 Table of Subjective Quality Evaluation for 'Table'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ming (4-Dec-99): 98 85 88 78 75 65 55 55 43 45 43 48 35 33 38 35 25 35 25 25 25 15 15 15 5 5
Ivy (2-Dec-99): 99 94 92 79 78 91 75 75 70 68 60 46 60 67 60 45 60 35 41 37 36 25 25 5 5 5
Vicky (29-Jan-00): 93 84 92 83 82 90 81 91 80 64 66 45 64 46 25 25 25 25 5 5 40 42 5 5 5 5
Barb (8-Feb-00): 95 86 83 74 73 74 49 55 45 37 40 33 37 32 27 22 27 22 15 15 15 15 15 7 3 3
Average: 96.25 87.25 88.75 78.5 77 80 65 69 59.5 53.5 52.25 43 49 44.5 37.5 31.75 34.25 29.25 21.5 20.5 29 24.25 15 8 4.5 4.5
Standard Deviation: 2.75 4.57 4.27 3.70 3.92 12.68 15.41 17.44 18.38 14.89 12.71 6.78 15.12 16.30 16.05 10.44 17.19 6.75 15.35 13.70 11.28 12.74 8.16 4.76 1.00 1.00

Table B-11 Table of Subjective Quality Evaluation for 'Mouse'

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Ming (4-Dec-99): 98 88 85 83 75 65 58 63 45 33 35 25 15 15 25 15 15 15 15 15 15 5 5 5 5 5
Ivy (2-Dec-99): 97 87 77 77 69 64 63 48 47 38 25 32 25 15 34 25 25 5 5 25 5 5 25 5 5 5
Vicky (29-Jan-00): 99 98 97 89 85 88 87 69 68 78 57 79 77 49 76 55 63 46 67 25 25 15 15 15 15 5
Barb (8-Feb-00): 95 86 83 76 72 68 60 60 45 45 45 45 29 34 37 29 29 29 32 25 25 22 22 22 3 3
Average: 97.25 89.75 85.5 81.25 75.25 71.25 67 60 51.25 48.5 40.5 45.25 36.5 28.25 43 31 33 23.75 29.75 22.5 17.5 11.75 16.75 11.75 7 4.5
Standard Deviation: 1.71 5.56 8.39 6.02 6.95 11.30 13.49 8.83 11.21 20.27 13.70 23.98 27.63 16.48 22.58 17.05 20.85 17.80 27.22 5.00 9.57 8.30 8.88 8.30 5.42 1.00

Table B-12 Table of Blockiness Index for the Validation Experiment

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Bus: 0.37 0.66 1.16 1.54 1.91 2.29 2.11 3.27 3.27 3.62 3.97 3.73 4.8 5.39 5.09 5.46 6.43 6.74 8.14 6.56 9.69 7.7 7.92 8.15 8.85 9.99
Cars: 0.39 0.8 1.15 1.56 2.05 2.39 2.85 3.47 3.71 4.17 4.58 4.77 4.58 5.52 6.37 6.97 7.36 7.9 8.51 9.09 10.96 11.94 12.27 12.77 12.49 15.04
Computer: 0.28 0.66 0.99 1.36 1.76 2.13 2.43 2.89 2.7 3.82 3.8 4.54 5.21 6.37 6.02 7.94 7.49 7.16 6.09 5.31 4.27 5 6.65 7.9 6.01 5.98
Table: 0.36 0.79 1.09 1.66 1.94 2.51 2.92 3.23 3.06 4.04 4.7 5.17 6.1 6.16 5.73 5.44 5.69 6.43 7.48 8.07 11.12 12.73 14.18 16.14 17.95 21.48
Mouse: 0.32 0.64 0.96 1.29 1.52 1.9 2.16 2.49 2.93 3.19 3.71 4.06 4.75 4.86 4.83 6.06 6.61 6.02 6.13 6.95 9.4 10.7 9.1 10.37 10.52 13.05

Table B-13 Table of Similarity Index for the Validation Experiment

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Bus: 84% 80% 79% 77% 76% 74% 73% 72% 72% 70% 69% 68% 67% 67% 66% 65% 66% 64% 63% 63% 59% 53% 50% 46% 44% 39%
Cars: 71% 65% 61% 60% 58% 57% 56% 54% 53% 51% 47% 44% 43% 41% 41% 42% 42% 40% 40% 40% 39% 35% 30% 27% 24% 22%
Computer: 82% 76% 73% 70% 70% 65% 62% 62% 63% 63% 59% 53% 46% 45% 45% 54% 53% 49% 47% 50% 46% 44% 40% 36% 29% 28%
Table: 31% 21% 19% 17% 16% 18% 19% 19% 19% 19% 16% 14% 12% 11% 11% 11% 11% 12% 13% 13% 14% 13% 14% 13% 13% 11%
Mouse: 74% 69% 65% 63% 58% 59% 58% 56% 55% 53% 52% 50% 47% 45% 47% 46% 50% 48% 47% 48% 45% 40% 38% 37% 33% 32%

Table B-14 Table of Modified Blockiness Index for the Validation Experiment

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Bus: 0 0.06 0.21 0.3 0.37 0.43 0.39 0.56 0.53 0.57 0.62 0.58 0.7 0.7 0.72 0.73 0.77 0.83 0.92 0.81 0.97 0.87 0.88 0.91 0.95 1
Cars: 0 0.11 0.22 0.31 0.4 0.46 0.52 0.6 0.61 0.66 0.69 0.67 0.68 0.72 0.78 0.84 0.86 0.9 0.93 0.96 1.05 1.08 1.06 1.09 1.06 1.02
Computer: 0 0.06 0.16 0.24 0.32 0.4 0.4 0.43 0.43 0.53 0.57 0.6 0.66 0.71 0.73 0.78 0.79 0.82 0.76 0.7 0.61 0.73 0.85 0.92 0.73 0.68
Table: 0 0.11 0.2 0.33 0.38 0.48 0.53 0.54 0.54 0.65 0.7 0.7 0.66 0.74 0.73 0.7 0.76 0.83 0.89 0.93 1.06 1.11 1.16 1.2 1.25 1.25
Mouse: 0 0.06 0.16 0.25 0.3 0.38 0.42 0.46 0.53 0.56 0.58 0.64 0.7 0.7 0.71 0.79 0.82 0.76 0.78 0.84 0.96 0.94 0.96 1 1.01 1.03

Table B-15 Table of Modified Similarity Index for the Validation Experiment

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Bus: 1.45 1.69 1.75 1.85 1.90 2.00 2.05 2.09 2.09 2.18 2.22 2.27 2.31 2.31 2.35 2.38 2.35 2.42 2.46 2.46 2.60 2.80 2.89 3.00 3.06 3.19
Cars: 2.14 2.38 2.53 2.57 2.63 2.67 2.70 2.76 2.80 2.86 2.97 3.06 3.08 3.14 3.14 3.11 3.11 3.16 3.16 3.16 3.19 3.29 3.40 3.47 3.54 3.58
Computer: 1.58 1.90 2.05 2.18 2.18 2.38 2.50 2.50 2.46 2.46 2.60 2.80 3.00 3.03 3.03 2.76 2.80 2.92 2.97 2.89 3.00 3.06 3.16 3.26 3.43 3.45
Table: 3.38 3.60 3.64 3.68 3.70 3.66 3.64 3.64 3.64 3.64 3.70 3.74 3.78 3.80 3.80 3.80 3.80 3.78 3.76 3.76 3.74 3.76 3.74 3.76 3.76 3.80
Mouse: 2.00 2.22 2.38 2.46 2.63 2.60 2.63 2.70 2.73 2.80 2.83 2.89 2.97 3.03 2.97 3.00 2.89 2.94 2.97 2.94 3.03 3.16 3.21 3.24 3.33 3.36

Table B-16 Table of POIQE Index for the Validation Experiment

Quantization factor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22.5 25 27.5 30 32.5 35
Bus: 100 97 89 83 79 74 76 65 67 63 59 61 52 51 49 48 46 40 32 40 24 27 24 18 13 5
Cars: 100 92 83 76 68 63 58 51 49 44 38 38 37 32 27 22 19 15 12 9 5 5 5 5 5 5
Computer: 100 97 90 84 79 72 70 68 68 61 56 50 41 35 34 35 34 28 32 40 45 33 20 10 25 30
Table: 100 88 78 63 57 48 42 41 41 30 22 21 25 16 17 20 14 6 5 5 5 5 5 5 5 5
Mouse: 100 96 88 81 76 70 67 62 56 53 50 45 38 36 37 29 29 33 31 26 13 11 8 5 5 5

Appendix C Video Format and Color Spaces of Composite and Component TV Systems

Existing color TV systems can be classified as composite systems and component systems. The major difference between the two is that the luminance and chrominance signals of a composite system are encoded into the same channel, whereas those of a component system are transmitted separately [HPN97]. Haskell, Puri, and Netravali [HPN97] provide conversions between the color spaces used in these systems.

Composite system

Some common analog composite systems are NTSC, PAL and SECAM. Standardized in 1953, NTSC is mostly used in North America, South America, the Caribbean and Japan [HPN97]. PAL is commonly used in Western Europe, and SECAM is more common in France, Russia, the Middle East and Eastern Europe [NH95]. The following sections give the color space conversions to and from RGB signals. Please note that R'G'B' denotes gamma-corrected RGB.

PAL System (YUV color space)

Y = 0.299 R' + 0.587 G' + 0.114 B'
U = -0.147 R' - 0.289 G' + 0.436 B'
V = 0.615 R' - 0.515 G' - 0.100 B'

R' = 1.0 Y + 1.140 V
G' = 1.0 Y - 0.395 U - 0.581 V
B' = 1.0 Y + 2.030 U

where Y is Luminance signal
U is Hue signal
V is Saturation signal
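The PAL conversion can be exercised with a short round-trip check. The sketch below assumes the commonly published coefficients (V = 0.615 R' - 0.515 G' - 0.100 B' and B' = Y + 2.030 U) and R'G'B' values normalized to the range 0 to 1. The function names are illustrative, and G' is recovered from the luminance equation instead of from separately rounded coefficients.

```python
def rgb_to_yuv(r, g, b):
    """Gamma-corrected R'G'B' (each 0..1) to PAL YUV."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v


def yuv_to_rgb(y, u, v):
    """PAL YUV back to gamma-corrected R'G'B'."""
    r = y + 1.140 * v
    b = y + 2.030 * u
    # Solve the luminance equation Y = 0.299 R' + 0.587 G' + 0.114 B' for G'.
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


original = (0.8, 0.4, 0.2)
recovered = yuv_to_rgb(*rgb_to_yuv(*original))
gray_y, gray_u, gray_v = rgb_to_yuv(0.5, 0.5, 0.5)
print(recovered)
```

Because the published coefficients are rounded to three decimals, the round trip is accurate only to about three decimal places. For a gray input both chrominance components vanish, since the U and V coefficients each sum to zero.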

NTSC System (YIQ color space)

where Y is Luminance signal
I is Inphase signal
Q is Quadrature signal

SECAM System (YDrDb color space)

where Y is Luminance signal
Db is Blue color difference signal
Dr is Red color difference signal

Component system

The component system is used in digital applications, such as JPEG and MPEG compressions.

YCrCb color space (CCIR-601)

R' = 1.164 (Y - 16) + 1.596 (Cr - 128)
G' = 1.164 (Y - 16) - 0.813 (Cr - 128) - 0.392 (Cb - 128)
B' = 1.164 (Y - 16) + 2.017 (Cb - 128)

where Y is Luminance signal
Cr is Red color difference signal
Cb is Blue color difference signal
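The YCrCb equations above translate directly into code. In the sketch below, the rounding and the clamping of results to the 8-bit range 0 to 255 are assumptions added for illustration, as are the function names.

```python
def clamp_8bit(value):
    """Limit a value to the 8-bit range 0..255 (an added assumption)."""
    return max(0, min(255, value))


def ycrcb_to_rgb(y, cr, cb):
    """CCIR-601 YCrCb to gamma-corrected 8-bit R'G'B'."""
    r = 1.164 * (y - 16) + 1.596 * (cr - 128)
    g = 1.164 * (y - 16) - 0.813 * (cr - 128) - 0.392 * (cb - 128)
    b = 1.164 * (y - 16) + 2.017 * (cb - 128)
    return tuple(clamp_8bit(round(c)) for c in (r, g, b))


black = ycrcb_to_rgb(16, 128, 128)
white = ycrcb_to_rgb(235, 128, 128)
print(black, white)
```

The factor 1.164 stretches the nominal luminance range of 16 to 235 onto 0 to 255, which is why Y = 16 maps to black and Y = 235 maps to white when both color difference signals are at their neutral value of 128.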

CCIR-601 video format

As mentioned, the HVS is more sensitive to variation of luminance information than to that of chrominance information. The CCIR-601 digital video format therefore samples the chrominance signals at lower spatial frequencies than the luminance signal, except for format 4:4:4. Figure C-1 shows the structures of various formats for 16-by-16 macroblocks.

Figure C-1 CCIR-601 digital video format
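The effect of chrominance subsampling on the amount of data per macroblock can be tallied as follows. The set of formats and their horizontal and vertical subsampling factors are the conventional ones and are assumed here, since the figure itself is not reproduced; the names are illustrative.

```python
# (horizontal, vertical) chrominance subsampling factors, assumed conventional.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:1:1": (4, 1),
    "4:2:0": (2, 2),
}


def samples_per_macroblock(fmt, size=16):
    """Return (luma samples, samples per chroma component, total samples)."""
    h_factor, v_factor = SUBSAMPLING[fmt]
    luma = size * size
    chroma = (size // h_factor) * (size // v_factor)
    return luma, chroma, luma + 2 * chroma


for fmt in SUBSAMPLING:
    print(fmt, samples_per_macroblock(fmt))
```

Under these assumptions a 4:2:0 macroblock carries half as many samples as a 4:4:4 macroblock, while the luminance resolution is unchanged.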
