THE UNIVERSITY OF CALGARY
A Psychovisually-Based Objective Image Quality
Evaluator for DCT-Based Lossy Data Compression
by
Ruby Wai-Shan Chan
A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
DEPARTMENT OF MECHANICAL AND MANUFACTURING
ENGINEERING
CALGARY, ALBERTA
August, 2001
© Ruby Wai Shan Chan 2001
National Library of Canada / Bibliothèque nationale du Canada
Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
ABSTRACT
In this thesis, we propose an algorithm for evaluating the quality of DCT-based
compressed images, called the Psychovisually-Based Objective Image Quality Evaluator
(POIQE). The POIQE evaluates the image quality using two psychovisually-based
fidelity indexes: blockiness and similarity. Blockiness measures the patterned square
artifact created as a by-product of the lossy DCT-based compression technique used by
JPEG and MPEG, while similarity measures the perceivable detail remaining after
compression. The blockiness and similarity are combined into a single POIQE index
used to assess quality. The POIQE model is tuned using subjective assessment results
from five subjects evaluating six sets of images. Then, the capability of the model is
verified by validation experiments involving four new subjects and five new sets of
images.
ACKNOWLEDGEMENT
This section is probably the most difficult but also the most enjoyable section to write:
difficult for the fear that I might miss anyone, but enjoyable for the joy that I am finally
done. Starting off this section with such contradictory and complicated feelings, I believe
the first person that I have to thank is my supervisor, Dr. Peter Goldsmith. As a mentor and a
supervisor, Dr. Goldsmith provided me with all the valuable advice and guidance that a
student could ever ask for. I would also like to thank Dr. R. Rangayyan for teaching me
the first course in digital image processing and for introducing me to my favorite book,
"Digital Image Processing" by Gonzalez and Woods.
Special thanks to my pals and James Mykytiuk for their support during the course of my
thesis work. Great appreciation goes to my family. I wish to express my gratitude to
my sisters, Vicky and Ivy, and my brother-in-law Herbert for all their encouragement,
care and companionship. Finally, from the bottom of my heart, I would like to say a million
thanks to my parents for all their love and support.
To my parents, Joan and Daniel
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACRONYM
1.1 THE NEED FOR A PSYCHOVISUALLY-BASED OBJECTIVE IMAGE QUALITY EVALUATOR (POIQE)
1.1.1 The Growth of Digital Data Communication
1.1.2 Digital Data Compression and its Quality Criteria
1.1.3 Psychovisually-based Objective Image Quality Evaluator (POIQE)
1.2.1 Hypothesis
1.2.2 Objective
1.2.3 Scope
1.4.1 Related Research on Psychological Studies of the Human Visual Perception
1.4.2 Related Researches on Quality Measurement Tools
2.1.1 The Nature and Physics of Optics
2.1.2 The Human Visual Perception
2.1.3 HVS as a Shift-Invariant Linear System
2.1.4 Brightness Perception
2.1.5 Color Perception
2.1.6 Dark Adaptation and Motion Perception
2.1.7 Sequential Perception
2.2 DIGITAL IMAGE COMPRESSION
2.2.1 Data Redundancy
2.2.2 Fundamentals of Image Compression
2.2.3 Lossless Compression/Encoding
2.2.4 Lossy Compression/Encoding
2.2.5 Lossy DCT-based Compression
FIDELITY ASSESSMENT AND CRITERIA
3.1.1 Subjective Evaluation
3.1.2 Subjective Assessment Methodology
3.3.1 Effect of Patterned/Structured Artifact
3.3.2 Using MSE as a Quality Indicator for Lossy DCT-based Compressed Image
POIQE MODEL DESIGN
4.1.1 Blockiness Evaluator
4.1.2 Similarity Identifier
4.1.3 Merger
4.2 THE CHARACTERISTIC OF THE MODEL
EXPERIMENTAL TUNING AND VALIDATION OF THE POIQE MODEL
5.1.1 Purpose
5.1.2 Method and Procedure
5.1.3 Experimental Images
5.1.4 Subjective Quality Evaluation Data
5.2 ANALYSIS
5.2.1 Blockiness Index
5.2.2 Similarity Index
5.2.3 POIQE Index
5.2.4 Errors and Observations
5.3.1 Purpose
5.3.2 Method and Procedure
5.3.3 Validation Result
References
Appendix A Lossless Image Compression Statistics
Appendix B Tuning and Validation Experiments' Image Sets
Appendix C Video Format and Color Spaces of Composite and Component TV Systems
LIST OF FIGURES
ANATOMY OF THE HUMAN EYE
ROD AND CONE'S SPATIAL PATTERNS
VISUAL ANGLE
IMAGE TRANSFORMATIONS
JUST DETECTABLE CONTRAST THRESHOLD
SINE-WAVE GRATING
ADELSON'S DIAMOND SHAPED PATTERNS
CONTRAST SENSITIVITIES FOR LUMINANCE AND CHROMINANCE [HPFL96]
CRITICAL FLICKER FREQUENCIES fR0~831
IMAGE FORMAT
BLOCK DIAGRAM OF JPEG COMPRESSION PROCEDURES
GROUP OF PICTURE, SLICE, MACROBLOCKS & SUBIMAGE BLOCKS
BLOCK DIAGRAM OF MPEG COMPRESSION PROCEDURES
SUBJECTIVE ASSESSMENT METHODOLOGY
EXAMPLE OF OBJECTIVE ASSESSMENT
STRUCTURED ARTIFACT
STRUCTURED ARTIFACT CAUSED BY LOSSY DCT BASED COMPRESSION
BLOCK DIAGRAM OF POIQE
BLOCKINESS ARTIFACT
COMPUTATION OF BLOCKINESS INDEX
BLOCK DIAGRAM OF BLOCKINESS EVALUATOR
BLOCK DIAGRAM OF SIMILARITY IDENTIFIER
BLOCK DIAGRAM OF MERGER
SATURATION OF THE HUMAN SUBJECTIVE EVALUATION RESULT
SORTING AND GROUPING OF SET #1 IMAGES
QUALITY MATCHING OF SET #1 AND SET #2 IMAGES
ASSIGNING GROUP RANGE
PLOT OF COMPRESSED DATA SIZE
PLOT OF COMPRESSION RATIO
PLOT OF SUBJECTIVE QUALITY EVALUATION
PLOT OF STANDARD DEVIATION FOR SUBJECTIVE QUALITY EVALUATION
PLOT OF BLOCKINESS INDEX
PLOT OF SIMILARITY INDEX
PLOT OF MODIFIED BLOCKINESS INDEX
PLOT OF MODIFIED SIMILARITY INDEX
POIQE INDEX
POIQE INDEX FOR 'CLIFFORD'
POIQE INDEX FOR 'KEYS'
POIQE INDEX FOR 'GIRL & APPLE'
POIQE INDEX FOR 'LENS COVER'
POIQE INDEX FOR 'ROSE'
POIQE INDEX FOR 'SUNGLASSES'
POIQE INDEX FOR 'BUS'
POIQE INDEX FOR 'CARS'
POIQE INDEX FOR 'COMPUTER'
POIQE INDEX FOR 'TABLE'
POIQE INDEX FOR 'MOUSE'
'CLIFFORD' IMAGES
'KEYS' IMAGES
'GIRL & APPLE' IMAGES
'LENS COVER' IMAGES
'ROSE' IMAGES
'SUNGLASSES' IMAGES
'BUS' IMAGES
'CARS' IMAGES
'COMPUTER' IMAGES
'TABLE' IMAGES
'MOUSE' IMAGES
CCIR-601 DIGITAL VIDEO FORMAT
LIST OF TABLES
TABLE 2-1 DIGITAL DATA REDUNDANCIES
TABLE 2-2 MPEG VERSION LIST
TABLE C-1 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'CLIFFORD'
TABLE C-2 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'KEYS'
TABLE C-3 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'GIRL & APPLE'
TABLE C-4 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'LENS COVER'
TABLE C-5 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'ROSE'
TABLE C-6 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'SUNGLASSES'
TABLE C-7 TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'BUS'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'CARS'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'COMPUTER'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'TABLE'
TABLE OF SUBJECTIVE QUALITY EVALUATION FOR 'MOUSE'
TABLE OF BLOCKINESS INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF SIMILARITY INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF MODIFIED BLOCKINESS INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF MODIFIED SIMILARITY INDEX FOR THE VALIDATION EXPERIMENT
TABLE OF POIQE INDEX FOR THE VALIDATION EXPERIMENT
ACRONYM
ASCII        American Standard Code for Information Interchange
AC           Alternate Current
AVI          Audio Video Interleave
bmp or BMP   Bitmap
bps          Bits per second
BPS          Bytes per second
CCITT        Consulting Committee for International Telegraphs and Telephones
CD ROM       Compact Disk - read-only-memory
CIE          Commission International De l'Eclairage
CODEC        Compressor/Decompressor
cpi          cycles per inch
cpd          cycles per visual degree
CPS          cycles per second
DC           Direct Current
DCT          Discrete Cosine Transformation
DIP          Digital Image Processing
fps          Frames per second
GIF          Graphics Interchange Format
GOP          Group of Picture
HAV          High quality Audio/Video
HDTV         High-Definition Television
HVS          Human Visual (or Vision) System
ISO          International Organization for Standards
IEC          International Electrotechnical Commission
IEEE         Institute of Electrical and Electronic Engineers
ITU-R        International Telecommunications Union-Radio Sector
ITU-T        International Telecommunications Union-Telecommunication Sector
JPEG         Joint Photographic Experts Groups
LZW          Lempel-Ziv-Welch compression
MAD          Mean Absolute Distortion
MSE          Mean Square Error
MPEG         Moving Picture Experts Group
NTSC         National Television System Committee
PAL          Phase Alternate Line
PCX          ZSoft PC Exchange image
ppi          pixel per inch
ppd          pixel per visual degree
PSTN         Public Switch Telephone Network
RLE          Run Length Encoding
SECAM        Séquentiel Couleur avec Mémoire
SNR          Signal to Noise Ratio
TGA          Truevision Targa Image File Format
TIFF         Tagged Image File Format
VLE          Variable Length Encoding
Chapter 1 Introduction
1.1 The Need for a Psychovisually-based Objective Image
Quality Evaluator (POIQE)
1.1.1 The Growth of Digital Data Communication
Digital communication technology has grown rapidly in the past two decades. In
the telecommunication industry, digital data transmission is gradually taking
over from analog transmission because of its robustness in controlling signal transmission.
However, digitally represented signals require far greater bandwidth and storage space
than traditional analog signals [HPN97].
As digital video transmission technologies (such as HDTV, video conferencing, and
Internet communication) grow, so does the need for faster and higher-quality digital
data transmission. In the United States, approximately one in every four households has
access to the Internet¹. This information superhighway is extremely busy. The
transmission of high-quality digital video results in a massive amount of data clogging the
transmission media. However, the bitrate of the transmission media is limited. For
example, a PSTN modem has a transmission bitrate of up to 30 kbps. For an application
such as NetMeeting, the 'real-time' transmission of 3 x 8-bit uncompressed color video
under the NTSC standard² will require a data rate of 168 Mbps. This problem motivates
the development of digital data compression technology.
¹ According to the December 1998 U.S. Department of Commerce Census Bureau, 42.1% of American households owned computers, and 26.2% of all households had Internet access.
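The size of the gap between raw video and the available channel can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is hypothetical, and the frame size and frame rate are the ones given in the NTSC footnote (480 pels by 480 lines, 29.97 frames per second). With these assumed figures the raw rate comes out at roughly 166 Mbps, the same order as the 168 Mbps quoted above; the exact value depends on the frame size assumed.

```python
def raw_video_bitrate(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video data rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


# Assumed figures: 480 x 480 pixels, 3 x 8 bits per pixel, 29.97 frames/s.
ntsc_bps = raw_video_bitrate(480, 480, 3 * 8, 29.97)

# Assumed channel: a PSTN modem at 30 kbps.
modem_bps = 30_000

print(f"raw video rate : {ntsc_bps / 1e6:.1f} Mbps")
print(f"modem shortfall: about {ntsc_bps / modem_bps:.0f} times too slow")
```

Under these assumptions the modem is several thousand times too slow, which is the mismatch that compression has to close.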
1.1.2 Digital Data Compression and its Quality Criteria
Digital image data is just like a sponge: massive and full of spatial and temporal
redundancies. To reduce this massive volume of data to a size that allows
efficient data transmission, this porous sponge needs to be compressed before it is sent
down the designated data line.
Digital data compression provides a significant reduction in data size, which substantially
benefits the data transmission and storage media industries. The International
Organization for Standardization (ISO) and the International Electrotechnical
Commission (IEC)³ developed international standards for digital data compression
[GW92]. Some of the well-known data compression standards are JPEG (Joint
Photographic Experts Group) for digital still image compression and MPEG (Moving
Picture Experts Group) for digital video compression.
Digital data compression can be classified into two types: lossless and lossy. Lossless
compression is reversible and error-free (i.e., the reconstructed image is
exactly the same as the original), whereas lossy compression is an irreversible process
in which non-essential details of the image are discarded. However, as compensation for
the loss in image quality, lossy compression gives a much higher compression ratio.
Lossy compression of a single-frame image is usually capable of achieving a compression
ratio⁴ of 25:1 or more with still acceptable quality, whereas lossless compression can
seldom achieve a compression ratio of more than 2.6:1⁵.
² Under the NTSC system, the image is refreshed at a rate of 29.97 frames per second with a video resolution of 480 pels by 480 lines [HPN97].
³ The Consulting Committee for International Telegraphs and Telephones (CCITT), the third major international standards organization, also participated in the development of the MPEG-2 and MPEG-4 standards. The CCITT was recently renamed the International Telecommunications Union-Telecommunication Sector (ITU-T) [HPFL96].
In the existing video compression algorithms, the criteria used to evaluate the video
quality are mainly mean square error (MSE) and mean absolute difference (MAD). MSE
and MAD are very good criteria for showing the difference between two images from a numerical
point of view. But they fail to show the differences between the two images from a
human visual point of view. (An example showing that MSE is not a good image quality
indicator is provided in Chapter 3 .)
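The two numerical criteria named above can be sketched in a few lines of Python. This is a minimal illustration, assuming grayscale images stored as equal-sized lists of rows. The function names `mse` and `mad` and the toy 4 x 4 images are assumptions made here, not taken from the thesis. The toy data shows two distortions with the same MSE but different spatial structure: a faint uniform shift, and a single column shifted more strongly.

```python
def mse(original, distorted):
    """Mean square error between two equal-sized grayscale images."""
    total, count = 0.0, 0
    for row_o, row_d in zip(original, distorted):
        for p_o, p_d in zip(row_o, row_d):
            total += (p_o - p_d) ** 2
            count += 1
    return total / count


def mad(original, distorted):
    """Mean absolute difference between two equal-sized grayscale images."""
    total, count = 0.0, 0
    for row_o, row_d in zip(original, distorted):
        for p_o, p_d in zip(row_o, row_d):
            total += abs(p_o - p_d)
            count += 1
    return total / count


# Toy data (hypothetical): a flat 4 x 4 image and two distorted versions.
original = [[100] * 4 for _ in range(4)]
uniform_shift = [[102] * 4 for _ in range(4)]              # every pixel +2
column_shift = [[104, 100, 100, 100] for _ in range(4)]    # one column +4

print(mse(original, uniform_shift), mad(original, uniform_shift))  # 4.0 2.0
print(mse(original, column_shift), mad(original, column_shift))    # 4.0 1.0
```

Both distortions score an MSE of 4.0, yet the second concentrates its error in a single column, which is a patterned change. Neither number says anything about where the error sits in the image.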
1.1.3 Psychovisually-based Objective Image Quality Evaluator
(POIQE)
Since the end-user of the compressed image data is a human being, it is important to
evaluate the quality based on human perception. The evaluation of the image quality can
be performed subjectively or objectively. (More details on the subjective and objective
fidelity criteria will be discussed in Chapter 3.)
Evaluating image quality by subjective means is infeasible. The process is time
consuming, inconsistent and expensive. In addition, subjective measurement cannot be
built into a compression algorithm. In contrast, objective quality measurement
offers a much faster and more consistent evaluation method.
By developing a mathematical model that resembles the subjective judgement of a human,
we propose an objective evaluator, called the Psychovisually-based Objective Image Quality
Evaluator (POIQE), which weights the importance of specific image details based on what is
perceivable to the human vision system. This objective evaluator can be implemented in
a compression algorithm as a quality indicator to facilitate the discarding of image details
that cannot be perceived by the human eye. In other words, the quality of the image
will be maintained, whereas the compression ratio of the data will remain small.
⁴ The compression ratio of the image is defined as the ratio of the compressed data size to the uncompressed data size.
⁵ Please refer to the analysis results in Appendix A, Table A-3.
1.1.4 Applications
The proposed image quality evaluator can be implemented into a compression algorithm
to enhance the compression process, or stand alone as an image quality index. It has a
wide variety of applications, which are required by data transmission (or storage) media
having limited bitrate (or storage space). Typical examples of such applications are:
Multimedia storage
Internet or teleconferencing
Quality evaluation for digital displaying device
Security surveillance system
Tele-robotics
Media broadcasting
1.2 Objective and Scope
1.2.1 Hypothesis
If the MSE is not a good indicator of image quality, what properties does the human
observer use to rate the image quality? When a human observer looks at an image, he
will subconsciously look for details that he has seen before, or artifacts that
do not belong to the image. Then, he will identify these recognizable details and make
appropriate judgments based on their occurrences. In this thesis, it is hypothesized that
the human subjective rating is mainly based on the following:
patterned artifacts generated as a side effect of the compression algorithm
recognizable level of the details in the image
1.2.2 Objective
The objective of this thesis work is to design and validate a Psychovisually-based
mathematical algorithm for objective image quality measurement.
1.2.3 Scope
The thesis work consists of:
Presentation of background theories and related research on human visual perception
Design of experiments to measure the human evaluation index that will be correlated
with the objective evaluation criteria used in the mathematical model
Conducting these human subjective evaluation experiments with six different sets of
monochrome images generated from the JPEG compression software to obtain model
data.
Designing a Psychovisually-based Objective Image Quality Evaluator
Implementation of the mathematical model in Visual C/C++ using the data obtained
from the human subjective evaluation experiment
Validation of the proposed POIQE model
This thesis focuses on the demonstration that the proposed POIQE can resemble the
human subjective judging perception. This work does not include the implementation of
the evaluator into any image/video compression algorithm.
1.3 Contribution of this Thesis
The main contributions of this thesis are:
A mathematical model of digital image quality that is based on human psychovisual
properties.
An algorithm (POIQE) based on this model that simulates human subjective
evaluation of digital image quality
Experimental results that give psychovisual parameters used in the model
Coding of the JPEG compression software in Visual C/C++ for generating the test
images used in the experiment of human subjective evaluation
Experimental validation of the algorithm
An example demonstrating the ineffectiveness of using MSE as a quality indicator for
DCT-based compressed images
Experimental evidence showing that human quality assessment is very subjective
and inconsistent. The assessment results also show that the evaluation is more
consistent when the image quality is extremely good or extremely bad.
1.4 Related Work
Previous research related to this thesis work can be categorized into two types of studies:
Psychological characteristics of human visual perception
Quality measurement tools
The related works regarding objective image quality measurement tools are vast. But
most of the time, the quality measurement studies and the psychological studies are
closely tied together. They all have the same characteristic of investigating the symbolic
relationships between visual perception and mathematical models.
1.4.1 Related Research on Psychological Studies of the Human
Visual Perception
The biological studies of the human vision system can be traced back to Leonardo da
Vinci's Trattato della Pittura [Ste58] on stereo vision in the fifteenth century. This
biological research provided the backbone for later research on the psychological
studies of vision.
Most of the psychological studies focused on areas such as luminance,
spatial contrast sensitivity, temporal contrast sensitivity and spatio-temporal contrast
sensitivity. Also, many of these psychological studies are built on a biological
approach [DeL58] [CG66] [Owe72]. In addition, the work on spatio-temporal sensitivity
provided the background knowledge that led to the investigation of motion analysis
of video [WA85] [Gir88].
1.4.1.1 Perception of Luminance and Brightness
Arend and Goldstein [AG87] and Adelson [Ade93] provided detailed investigations and
demonstrations of the characteristics of human perceptual organization and judgement of
brightness/lightness constancy. These studies focused on the perceived
brightness of a patch relative to the luminance of the surrounding pattern. In many other
studies, the relativity of brightness is also investigated in terms of the contrast sensitivity
threshold. Both Weber's Law and the Commission International de l'Eclairage (CIE)
proposed computational methods for calculating the brightness level that is just detectable,
known as the "contrast sensitivity threshold".
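For reference, Weber's law in its commonly stated form can be written as below. The symbols are assumptions introduced here for illustration: $L$ is the background luminance, $\Delta L$ is the smallest luminance increment that is just detectable against it, and $k$ is the Weber fraction, which is approximately constant over a moderate range of luminance.

```latex
\frac{\Delta L}{L} = k
```

In this form, a brighter background requires a proportionally larger increment before a change is noticed, which is why the threshold is expressed as a contrast and not as an absolute luminance difference.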
1.4.1.2 Spatial Contrast Sensitivity
The studies of contrast sensitivity were further extended to investigate how the
human vision system responds to changes in the spatial frequency of the stimulus. The
studies of Van Nes and Bouman w 6 7 ] and Mullen [Mul85] provided detailed investigations of
the contrast sensitivity of sinusoidal monochrome and chromatic gratings at various
spatial frequencies and luminance levels.
In 1973, Vassilev [Vas73] conducted a study on contrast sensitivity near borders (i.e.
edges)⁶. Although the research did not cover experiments with an extensive range of
temporal variation, the results showed that the duration of the stimulus is also a very
important factor in contrast sensitivity.
1.4.1.3 Temporal Contrast Sensitivity
Well before Vassilev's work, the temporal effect on contrast sensitivity had been investigated.
The study of sinusoidal time-varying stimuli began as early as 1922 with H. E. Ives, but
attention to the temporal effect on contrast sensitivity only began in the 1950s with
de Lange [Kel61a]. In most experiments on temporal contrast sensitivity [Kel61a]
[Kel61b], the subject was presented with a flashing modulated light at a constant
temporal frequency. The luminance of the source was gradually increased until the subject
could just detect it, which marked the contrast sensitivity threshold. Then, the experiment
was repeated with a different temporal frequency.
1.4.1.4 Spatio-Temporal Contrast Sensitivity
It is well known that the spatial and temporal effects are inseparable properties of the
contrast sensitivity function. In Owen's research [Owe72], a series of experiments was
conducted to validate the interdependence of the luminance (just detectable contrast), area
(spatial) and duration (temporal) of a visual stimulus. Also, many research efforts [Bar58]
[Rob66] pK98] focused on building a mathematical model to quantify the
spatio-temporal contrast sensitivity relation.
In 1956, Schade [Sch56] proposed the spatial contrast-sensitivity function to
mathematically model the viewer's judgement of the 'just detectable' threshold contrast. In
1966, Robson [Rob66] modified Schade's function by adding the temporal effect on
human contrast sensitivity, and proposed the spatio-temporal contrast sensitivity
function.
In 1998, van den Branden Lambrecht and Kunt mK98] proposed a mathematical
representation to model the human spatio-temporal contrast sensitivity characteristic.
Their work focused on characterizing the human visual perception of coding artifacts
(a "band-pass" filtered white noise) by perceptual channels.
⁶ By demonstrating the foveal edge effect on stimuli of various sizes and shapes, Vassilev explained the contradictory results in the related literature.
1.4.1.5 Motion Perception
Some spatio-temporal investigations have led to studies of motion or velocity
perception of the HVS [AB85] [WA85]. In 1988, Girod [Gir88] explored motion
perception by tracking the eye movements of the human observer. He extended his
research into the relevance of using human eye movement in video sequence encoding
for better data compression.
Drawing on this research, many books [Cor70] pD88]
w 9 1 ] [Eva48] on visual perception have been published. Wandell's
book, Foundations of Vision [Wan95], is a comprehensive resource on the human
vision system. The book explains many human visual characteristics and their
psychological effects from a biological and anatomical point of view. It also gives
substantial coverage of spatial and temporal sensitivity and motion perception.
1.4.2 Related Researches on Quality Measurement Tools
The psychological studies of human visual perception began as early as the mid-nineteenth
century. However, research on the development of psychovisually-based
video/image quality measurement tools is a fairly new topic. The rise of interest in
this area is primarily due to the dramatic growth of digital video applications in the past
ten years. All of this research has focused on developing objective image or
video evaluation tools that could reproduce the results of subjective human quality
assessment. In general, the tools can be classified into image quality measurement tools for still
images and for image sequences (i.e. video).
1.4.2.1 Still Image
For still image quality evaluation tools, the evaluation criteria used are typically based on
mean square differences of pixel intensity and edge information between the reconstructed
and original images. Also, the techniques used focus mainly on image properties
such as contrast, luminance and spatial frequency.
The work of Heeger and Teo is one of the few efforts focused on the development of a
perceptual image fidelity model for still images [TH94] [HT95]. The model is based on
the measurement of the difference in contrast and luminance sensitivity between the
original and the reconstructed image.
A methodology for determining objective quality metrics used in image coding is
presented by Miyahara, Kotani and Algazi [MKA98] and Horita, Katayama, Murai and
Miyahara [HKMM96]. The method was used to obtain a picture quality scale (PQS) for
coding achromatic images over the range of image quality defined by the subjective mean
opinion score (MOS). This PQS considers the properties of visual perception for global
features and localized disturbances. It was found to closely approximate the MOS,
except at the low end of the image quality range.
1.4.2.2 Image Sequence (Video)
For image sequence quality evaluation tools, the evaluation criteria used are usually
divided into two parts: intra-frame evaluation and inter-frame evaluation. Intra-frame
evaluation for video is the same as the evaluation used for still images. It is
based mainly on the spatial properties of the intra-frame, whereas
inter-frame evaluation focuses on the human assessment of quality deterioration in
the presence of motion (i.e. the temporal properties of the inter-frame) [Bra96] [NB97]
[BFLSV97]. Many of these efforts started as image quality evaluation tools
[Lub95], and later added a temporal component to the algorithm for
video quality evaluation [Lub97].
Focusing on the masking effect and the spatial-temporal frequency characteristics of the
HVS, Okamoto, Hangai and Miyauchi [OHM96] proposed an objective video quality
evaluation index, namely the 3-D SNR (three-dimensional signal-to-noise ratio). Their
experiment shows that the 3-D SNR results provide a much smoother and closer match to
the human subjective ratings than the SNR results.
Tan, Ghanbari, Gardiner and Pearson [GGPT97] [TGP97] [TGP98] proposed an
objective measurement tool for assessing MPEG video applications. The work included
two parts: distortion weighting and a cognitive emulator. The first part targeted the
quality evaluation of a still frame. It measured the distortion weighted by the distance of
the pels from any nearby extreme contrast change (i.e. edge). The weighting of this part
is based on the human perceptual effect called 'activity masking'. (More details on activity
masking will be provided in Chapter 2.) The second part focused on the quality
evaluation of a sequence of continuous frames (i.e. video) by incorporating the human
decision-making processes of smoothing, saturation, asymmetry and delay.
1.5 Terminology
Before proceeding any further, it is essential to clarify some of the terms commonly used
in this document. The definitions of these terms are summarized and put together based
on their conventional usage and definition in the literature.
As defined in Gonzalez and Woods [GW92], luminance, measured in lumens, measures
the amount of energy an observer perceives from a light source7. But the incident energy
is weighted according to the spectral sensitivity of the eyes [MPFL96]. In other words,
the amount of energy perceived by the human eyes varies as a function of the sue of the
source. In contrast, chrominance measures the hue and saturation of color. Hue is
defined as an attribute associated with the dominant wavelength of the color, and
saturation is defined as the degree to which the hue is diluted by white light [GW92].
For example, if the color of a detail is pastel purple, the hue of the detail is "purple"
and the saturation is the amount of white added to that hue (i.e. the pastel tone of the
detail).
In this document, the word 'quanta' is used to quantitatively represent luminance. The
word 'intensity'8 implies the luminance of the object. It indicates the physical luminance
only and has nothing to do with human interpretation, unless it is used together
with an adjective such as 'apparent' or 'perceived', which indicates that it is
information resulting from human interpretation.
According to Evans [Eva48], brightness is the apparent luminance of a patch in an
image, without reference to its surroundings. Brightness perception is how
this luminance is mentally viewed and understood by the viewer. Another very similar term
is lightness. It is defined as the apparent reflectance of a perceived surface relative to
other patches in the same scene [Eva48]. In many books, the terms brightness
and lightness refer to different things: the latter is affected by the surroundings and
the former is not.
Also, for image compression, a reconstructed image is one that has been modified or
processed. In this document, the compressed image is sometimes referred to as the
reconstructed image. The term blockiness9 refers to the patterned square artifact created
by lossy DCT-based compression such as JPEG and MPEG. A more detailed
explanation and example of blockiness will be provided in Chapter 4.
7 In this case, light source refers to the object or detail in the image.
8 In some literature, it is also known as "shades". In some cases, shade is also used to describe color information (e.g. shades of color).
9 In some literature, "blockiness" is also known as blocking artifact.
1.6 Organization of this Thesis
In order to design an evaluator that can simulate the psychovisual behavior of the human
visual system (HVS), it is important to understand some fundamentals of the HVS.
Chapter 2 provides some fundamental background on the HVS and human visual
perception. Not all information provided in this chapter is explicitly related
to the focus of this thesis research. However, the material in this chapter provides
background that leads to an understanding of the HVS and the rationale behind
the proposed thesis. The later part of the chapter gives a brief introduction to different image
formats and compression techniques. Readers who are familiar with the
psychovisual behavior of the HVS and image compression techniques can skip this
section.
Chapter 3 provides detailed descriptions and definitions of subjective and objective
fidelity assessments and criteria. This chapter also gives examples on the effects of
structured artifacts and the failure of using MSE as an objective fidelity criterion.
In Chapter 4, the proposed mathematical model of the POIQE is explained. Chapter 5
provides a detailed description of the method and procedure of the model tuning and
validation experiments. The model tuning experiment sets the model parameters, and
the validation experiment further justifies the robustness of the model in evaluating the
quality of various image sets. Finally, the thesis is summarized and concluded in
Chapter 6.
This document does not provide descriptions of the JPEG and MPEG compression
algorithms. Readers who would like more information in those areas may refer to
[GW92] [MPFL96] [HPN97] for more details.
Chapter 2 Background
For image compression, it is inefficient to store and transmit information that cannot be
sensed by the human eye. Therefore, it is important to understand the fundamental
characteristics of the human visual system, in order to identify the non-perceivable details
and remove the perceptual redundancy. In this chapter, some fundamental details of
optics and the human visual system are described.
2.1 The Physics and Fundamentals of Vision
2.1.1 The Nature and Physics of Optics
The human body is surrounded by an environment that is filled with electromagnetic
radiation10. According to its wavelength or frequency, electromagnetic radiation
is recognized in various forms such as light, radio waves, ultraviolet, infrared, etc.
Among all these forms, light, with wavelengths ranging from 400 to 700 nm11 [SZY87], is
the only form of electromagnetic radiation that can be sensed by humans. That is
why it is also known as 'visible radiation'.
A light ray can be understood as a set of electromagnetic radiation of various
wavelengths. It travels in a straight path and strikes on the object in the path. According
to the color and reflectivity of the surface of the object, the object will reflect the light
rays of the same wavelengths as its shades of color and absorb the rest of the light ray.
10 Electromagnetic radiation, consisting of time-varying electric and magnetic fields, travels or propagates through space without artificial guide or media at a definite speed (c = 3.00 x 10^8 m s^-1). [SZY87]
The reflections of the objects create a scene of different colors and intensity levels in the
3-dimensional space. As the reflections of the objects change in time, time becomes the
fourth dimension of scene.
2.1.2 The Human Visual Perception
The study of human visual perception is a study of how this 4-dimensional scene is
transformed into a psychological interpretation of space and time by humans. Human
visual perception is very complicated. In general, visual perception can be understood as
a psychological interpretation of an image acquired by the vision system.
In order to understand the psychovisual behavior of the human vision system, it is
important to understand the fundamentals of how it works.
2.1.2.1 Human Vision System (HVS) Fundamentals
The vision system is one of the most magnificent systems in the human body. This
system includes the left and right eyes, a series of neural pathways, and the brain. Its
sophisticated design allows it to identify image details at thousands of color shades and
intensity levels. The binocular visual field is roughly 200 degrees by 135 degrees with
respect to the visual axis [Wan95].
Figure 2-1 shows a cross-section of the human eye. To acquire images from a scene,
light is reflected by the object and enters the eye. The light is then focused by the
cornea (fixed-focus) and the lens (variable-focus)12, and projected onto the retina, which
is lined with photoreceptors. These photoreceptors are light-sensitive cells that can
absorb light. Stimulated by the light, the photoreceptors create a pattern of signals. This
pattern of optical signals is then transmitted by a system of neural pathways13 running
from the eye to the brain at a rate of 1000 impulses per second.
11 For standardization purposes, the Commission Internationale de l'Eclairage (CIE) designated in 1931 the following specific wavelengths for visible light: blue = 435.8 nm, green = 546.1 nm and red = 700 nm. [GW92]
12 The combined optical power, measured as the reciprocal of the focal length, of the cornea and the lens is 58.8 diopters. To focus on a nearby object, the muscle connected to the eye changes the shape of the lens to increase its optical power. This process is known as accommodation. [Wan95]
Figure 2-1 Anatomy of the human eye14
The retina of the human eye contains two types of photoreceptors: rods and cones. There are
6 to 7 million cone receptors and 75 to 150 million rod receptors in each eye
[GW92].
Located densely at the fovea15, the cones are responsible for photopic vision, which means
that the cones are stimulated only by scenes with high illumination. There are three types
of cone photoreceptors: red cones (long wavelength), green cones (medium wavelength),
and blue cones (short wavelength)16. As described by Rogowitz [Rog92], these cones can
be thought of as broadband filters for three ranges of wavelength. According to the
trichromatic theory17, mixing the color signals detected by these three cones allows
the human eye to discern different colors and shades.
The rod receptors are distributed radially symmetrically about the fovea over the retina,
except at the blind spot18. This region is also known as the periphery region. Rods are
mainly responsible for scotopic vision, which is sensitive to low-illumination conditions.
Compared with the cones, the rods do not respond to color details, owing to their scotopic
characteristic.
13 The neural pathway consists of several layers of retinal neurons, and the output fibers make up the optic nerve that leads to the primary visual cortex, also known as area V1, of the brain. [Wan95]
14 The graphic of this figure was taken from a website (artist unknown).
15 The fovea, also known as the macula, is the spot where the visual axis intersects with the retina.
16 The red, green and blue cones are also known as L- (for long wavelength), M- (for middle wavelength) and S- (for short wavelength) cones respectively in some biological literature.
17 The trichromatic theory states that any color can be regenerated by combining red, green and blue [Cor70].
18 The blind spot, also known as the optic disk, is where the optic nerves are gathered and connected with the eye.
Figure 2-2 Rod and cone's spatial patterns
Figure 2-2 provides an illustration19 of the rod and cone distribution on the retina. Each cone
receptor is connected to a single nerve as one unit. These cone units create a 4 x 6 array
of photosensitive cells. In comparison, a number of rod receptors are gathered and
connected to one nerve as one unit. Figure 2-2 shows four rod units (labeled as Rod #1,
Rod #2, Rod #3 and Rod #4), which create a 2 x 2 array of photosensitive cells. Since the
cone array is much finer than the rod array, the cones are considered receptors that are
capable of resolving finer details.
Since each rod unit consists of a number of rod receptors, the rod unit is capable of
capturing more quanta. In other words, the rod unit is more sensitive to light than the
cone unit. This explains the scotopic characteristic of the rod receptors.
Moreover, the periphery region is more temporally sensitive [Pir67]. In other words,
rods are more sensitive to stimuli that are temporally varying (e.g. a flashing source).
This phenomenon can be understood if we look at a flashing source as a source with
sufficient luminance but low quanta or energy due to its flashing behavior. Since the rods
in the periphery region are more capable of capturing quanta, they are more capable of
detecting a flashing source.
19 Please note that the actual distribution pattern is more irregular than the pattern shown in Figure 2-2. The cone population at the fovea region is a lot denser than that at the periphery region; the reverse applies to the rod population. (Please refer to [Wan95 Fig 3.4] for the actual spatial mosaic.)
2.1.2.2 Visual Angle
Visual angle θ is defined as a one-dimensional angular measurement of the dimension of
a detail based on the horizontal distance between the object and the eye. In Figure 2-3,
the visual angle can be represented as
where θ = Visual angle
h1 and h2 = Height of the object above and below the visual axis20
respectively
x = Horizontal distance between the object and the eye
Figure 2-3 Visual Angle
But for simplicity, equation (2-1) is often represented as
where h = Total height of the object = h1 + h2
20 The term "visual axis" used in this section is different from the "visual axis" described in Figure 2-1 Anatomy of the human eye.
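Equations (2-1) and (2-2) did not survive in this copy. The Python sketch below is therefore only an illustration under an assumption: it uses the standard geometric forms implied by the definitions above, namely θ = arctan(h1/x) + arctan(h2/x), and θ = 2 arctan(h/(2x)) for an object of total height h centred on the visual axis. The function names are hypothetical.

```python
import math


def visual_angle_deg(h1, h2, x):
    """Visual angle in degrees for an object extending h1 above and h2 below the axis.

    Assumed form: theta = atan(h1 / x) + atan(h2 / x).
    """
    return math.degrees(math.atan(h1 / x) + math.atan(h2 / x))


def visual_angle_simplified_deg(h, x):
    """Assumed simplified form for an object of total height h centred on the axis."""
    return math.degrees(2.0 * math.atan(h / (2.0 * x)))


if __name__ == "__main__":
    # A 10 cm tall detail viewed from 57.3 cm subtends roughly 10 degrees.
    print(visual_angle_deg(5.0, 5.0, 57.3))
    print(visual_angle_simplified_deg(10.0, 57.3))
```

The two functions agree whenever h1 = h2, which is the case the simplified form is assumed to cover.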
2.1.3 HVS as a Shift-Invariant Linear System
The image formation of the HVS includes a series of optical and neural transformations.
The input is the object image and the output is the retinal image formed. This image
formation process is represented as a linear21 shift-invariant transformation [Wan95].
Two important properties of a linear transformation are:
Homogeneity: if $r = T(i)$,
then $T(a\,i) = a\,T(i) = a\,r$
Superposition: if $r_1 = T(i_1)$ and $r_2 = T(i_2)$,
then $r_1 + r_2 = T(i_1 + i_2)$
where i is the input, r is the output, T is the transformation function and a is
an arbitrary constant
Also, the system's shift-invariant characteristic indicates that it is spatially homogeneous.
When a system is spatially homogeneous, the transformation result holds for
all locations and directions (isotropy) in space (i.e. r = T(i) for all locations and directions
in space). Two main properties of a shift-invariant system are:
1) Due to its spatially homogeneous properties, the system transformation matrix
can be defined from one single stimulus.
2) The response to a harmonic function (such as the sinusoids and cosinusoids of the
discrete cosine and discrete Fourier transformations) at frequency f is also a
harmonic function of the same frequency.
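The properties listed above can be checked numerically. The following Python sketch is not a model of the HVS: it uses a circular convolution with a hypothetical three-tap blur kernel as a stand-in for the transformation T, and verifies homogeneity, superposition, shift invariance and the harmonic-response property on sample inputs. All names and values are assumptions chosen for illustration.

```python
import math


def apply_system(kernel, signal):
    """Circular convolution: a simple linear, shift-invariant transformation T."""
    n = len(signal)
    return [sum(kernel[k] * signal[(t - k) % n] for k in range(len(kernel)))
            for t in range(n)]


def shift(signal, s):
    """Circularly shift a signal by s samples."""
    n = len(signal)
    return [signal[(t - s) % n] for t in range(n)]


def close(a, b, tol=1e-9):
    """True when two sequences match element by element within tol."""
    return len(a) == len(b) and all(abs(x - y) < tol for x, y in zip(a, b))


n = 32
omega = 2 * math.pi * 3 / n                      # harmonic input at 3 cycles per period
kernel = [0.25, 0.5, 0.25]                       # hypothetical blur kernel
i1 = [math.cos(omega * t) for t in range(n)]     # harmonic input
i2 = [float(t % 5) for t in range(n)]            # arbitrary second input
a = 2.5                                          # arbitrary constant

r1 = apply_system(kernel, i1)
r2 = apply_system(kernel, i2)

# Homogeneity: T(a i) = a T(i)
homogeneity = close(apply_system(kernel, [a * v for v in i1]), [a * v for v in r1])

# Superposition: T(i1 + i2) = T(i1) + T(i2)
superposition = close(apply_system(kernel, [u + v for u, v in zip(i1, i2)]),
                      [u + v for u, v in zip(r1, r2)])

# Shift invariance: shifting the input shifts the output by the same amount
shift_invariance = close(apply_system(kernel, shift(i1, 4)), shift(r1, 4))

# Harmonic response: the output is a scaled, delayed cosine of the same frequency
gain = 0.5 + 0.5 * math.cos(omega)
same_frequency = close(r1, [gain * math.cos(omega * (t - 1)) for t in range(n)])

if __name__ == "__main__":
    print(homogeneity, superposition, shift_invariance, same_frequency)
```

For this symmetric kernel the response to the cosine is the same cosine delayed by one sample and scaled by 0.5 + 0.5 cos(ω), which is what the last check compares against.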
21 Although the actual image formation process of the vision system is a non-linear transformation, many analytical approaches assume that it is linear. This is mainly due to the simplicity of linear analysis, such as the Fourier techniques. Linear analysis still gives correct results when it is applied to the linear portion of the non-linear system.
Figure 2-4 Image Transformations
(Diagram: the inputs i1, i2 and i1 + i2 pass through the optical and neural transformations T to give the outputs r1 = T(i1), r2 = T(i2) and r1 + r2 = T(i1 + i2).)
2.1.4 Brightness Perception
As mentioned, light reflected by the object in a 3-D scene enters the eye and
stimulates the receptors to create an optical pattern. Each eye perceives the optical
pattern as a 2-D image consisting of patches with different brightness. According to
Weber's Law [MPFL96], the contrast between two adjacent luminances Y and Y + ΔY is
'just detectable' when the ratio ΔY/Y reaches a fixed threshold value.
Similarly, a 'just detectable' contrast calculation is also suggested by the Commission
Internationale de l'Eclairage (CIE). The perception of brightness, also known as lightness
(L*), for a specific luminance Y is shown in equation (2-6). The just detectable lightness
difference occurs when ΔL* = 1.
$$L^* = 116 \left( \frac{Y}{Y_n} \right)^{1/3} - 16 \qquad \text{(2-6)}$$
where Y_n is the luminance of white
Both just detectable contrast calculations, by Weber's Law and by the CIE, are very similar.
Figure 2-5 shows a plot of ΔY, also known as the "contrast threshold", versus Y for
both calculations.
Figure 2-5 Just Detectable Contrast Threshold
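The two threshold calculations can be compared with a short Python sketch. It assumes the standard CIE 1976 lightness formula (cube-root branch only) and a Weber fraction of 0.02; the latter is an illustrative value, not one taken from this thesis. The function names are hypothetical.

```python
def cie_lightness(y, y_white):
    """CIE 1976 lightness L* (cube-root branch, valid for y / y_white > 0.008856)."""
    return 116.0 * (y / y_white) ** (1.0 / 3.0) - 16.0


def cie_threshold(y, y_white):
    """Luminance increment that raises L* by exactly 1 (the 'just detectable' step)."""
    target = cie_lightness(y, y_white) + 1.0
    return y_white * ((target + 16.0) / 116.0) ** 3 - y


def weber_threshold(y, weber_fraction=0.02):
    """Weber's Law threshold, delta Y = k * Y; k = 0.02 is an assumed value."""
    return weber_fraction * y


if __name__ == "__main__":
    y_white = 100.0
    for y in (10.0, 20.0, 50.0, 100.0):
        # Columns: luminance, Weber threshold, CIE threshold
        print(y, weber_threshold(y), cie_threshold(y, y_white))
```

Both thresholds grow with Y, which is the behavior a plot of contrast threshold versus luminance would be expected to show.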
2.1.4.1 Contrast Sensitivity
Besides the calculations provided by Weber's Law and the CIE, there is also other
research [DeL58] [Bar58] [Sch56] [Kel61a] [Kel61b] [Rob66] w 6 7 ] [Owe72]
[Mul85] [AB85] [BK98] on the contrast sensitivity of the HVS based on the effect of spatial
and temporal variation. In most studies of the modulation transfer function (MTF) of the
vision system, a sine-wave grating is used as the stimulus. The sine-wave grating is a
vertical stripe pattern with the intensity distribution shown in Figure 2-6, with the
definitions and terminology as follows.
Figure 2-6 Sine-Wave Grating
Contrast, also known as the "percentage of modulation" or just the "modulation", is
defined w 6 7 ] as the modulation amplitude of a sinusoidal variation at the just detectable
threshold, divided by the average luminance.
$$\text{Contrast (or \% of Modulation)} = \frac{\text{Modulation Amplitude}}{\text{Average Luminance}} = \frac{\tfrac{1}{2}(Y_{max} - Y_{min})}{\tfrac{1}{2}(Y_{max} + Y_{min})} = \frac{Y_{max} - Y_{min}}{Y_{max} + Y_{min}}$$
Contrast sensitivity is defined w 8 5 ] as the inverse of the modulation amplitude at the just
detectable threshold.
$$\text{Contrast Sensitivity} = \frac{1}{\text{Modulation Amplitude}} = \frac{1}{\tfrac{1}{2}(Y_{max} - Y_{min})} \qquad \text{(2-8)}$$
Relative contrast sensitivity is defined [Mul85] as the inverse of the percentage of
modulation at the just detectable threshold.
$$\text{Relative Contrast Sensitivity} = \frac{1}{\text{\% of Modulation}} = \frac{Y_{max} + Y_{min}}{Y_{max} - Y_{min}}$$
These definitions of contrast are extensively used in the studies of human visual
sensitivity to brightness.
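As a small worked example of these three definitions, the Python sketch below evaluates them for a grating described only by its maximum and minimum luminance. The function names and sample values are hypothetical.

```python
def modulation_amplitude(y_max, y_min):
    """Half of the peak-to-peak luminance swing of the grating."""
    return 0.5 * (y_max - y_min)


def average_luminance(y_max, y_min):
    """Mean luminance of the grating."""
    return 0.5 * (y_max + y_min)


def contrast(y_max, y_min):
    """Contrast (percentage of modulation) = modulation amplitude / average luminance."""
    return modulation_amplitude(y_max, y_min) / average_luminance(y_max, y_min)


def contrast_sensitivity(y_max, y_min):
    """Inverse of the modulation amplitude at the just detectable threshold."""
    return 1.0 / modulation_amplitude(y_max, y_min)


def relative_contrast_sensitivity(y_max, y_min):
    """Inverse of the percentage of modulation at the just detectable threshold."""
    return 1.0 / contrast(y_max, y_min)


if __name__ == "__main__":
    # Hypothetical grating with luminance swinging between 50 and 150 units.
    print(contrast(150.0, 50.0))
    print(contrast_sensitivity(150.0, 50.0))
    print(relative_contrast_sensitivity(150.0, 50.0))
```

The inputs are meant to be the luminance extremes at which the grating is just detectable; for any other grating the functions simply report its modulation.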
2.1.4.2 Simultaneous Contrast
The lightness of a particular region does not simply depend on the luminance of that
region. It is often affected by the luminance of the surroundings. The HVS encodes
information on a relative basis. This phenomenon is known as "simultaneous contrast"
[Cor70]22. Adelson [Ade93] provided a very good illustration, shown in Figure 2-7, to
demonstrate this effect. The diamonds in the illustration have the same physical
reflectance, but it is experimentally proven that they have a brightness of approximately
3 5% perceived difference.
22 Cornsweet's book ([Cor70] p. 272-7) provides a substantial number of examples illustrating simultaneous contrast.
Figure 2-7 Adelson's Diamond Shaped Patterns
2.1.4.3 Activity Masking
Besides simultaneous contrast, another very common phenomenon of human
perception is activity masking23. Vassilev [Vas73] provides a very brief
literature review on activity masking. Activity masking shows that the contrast threshold in
a region closer to an edge24 detail is substantially higher. In other words,
the HVS is less sensitive to small variations of intensity in areas near edge details.
23 This phenomenon is also known as the "edge effect".
24 An edge is defined as a detail with a rapid intensity gradient.
2.1.5 Color Perception25
2.1.5.1 Contrast Sensitivity for Luminance and Chrominance Variations
The HVS is more sensitive to variation of luminance information than to that of
chrominance information. Mitchell, Pennebaker, Fogg and LeGall's book [MPFL96]
provides further evidence of this phenomenon by combining Ness and Bouman's [NB67]
results on contrast sensitivity for gray luminance stimuli with Mullen's [Mul85] findings
on contrast sensitivity for chrominance stimuli, as shown in Figure 2-8. The plot indicates
that, for the same spatial frequency, the HVS has much higher contrast sensitivity for
luminance than for chrominance. In other words, the HVS can detect finer spatial
variations in luminance than in color.
To demonstrate this phenomenon, Rogowitz [Rog92] used yellow text on a white
background and dark blue text on a black background as examples. It is very difficult
to see the text in those two examples, but we can still see some yellowish and dark blue
blurs. This is because the luminance variations are too small, even though the
chrominance variations are significant.
25 This thesis work covers evaluation based on monochrome images only. This section is written to make the documentation of the human vision system more complete.
Figure 2-8 Contrast Sensitivities for Luminance and Chrominance [MPFL96]
Therefore, MPEG compression employs the CCIR-601 digital video format, which
allows chrominance sampling at much lower spatial frequencies than the luminance
sampling. The CCIR-601 format represents the color space with one luminance signal Y
and two chrominance signals Cr and Cb. (MPEG is also adaptable to other color space sets.
Appendix B provides more detailed descriptions of, and conversions between, each color space
and RGB.) The available video formats are 4:2:0, 4:2:2 and 4:4:4 as shown in Figure C-1.
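To make the saving from chrominance subsampling concrete, the following Python sketch counts the samples per frame for the three formats. It assumes the conventional meaning of the labels (4:4:4 keeps full chrominance resolution, 4:2:2 halves it horizontally, 4:2:0 halves it both horizontally and vertically); the 720 x 480 frame size and all names are hypothetical.

```python
# (horizontal divisor, vertical divisor) applied to each chrominance plane
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def samples_per_frame(width, height, fmt):
    """Total number of Y, Cr and Cb samples in one frame for the given format."""
    h_div, v_div = CHROMA_DIVISORS[fmt]
    luma = width * height
    chroma = (width // h_div) * (height // v_div)   # one chrominance plane
    return luma + 2 * chroma


if __name__ == "__main__":
    for fmt in ("4:4:4", "4:2:2", "4:2:0"):
        print(fmt, samples_per_frame(720, 480, fmt))
```

Under these assumptions, 4:2:0 carries half as many samples as 4:4:4 while leaving the luminance plane untouched.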
2.1.5.2 Sensitivity for Red, Blue and Green Cones
As mentioned earlier, there are three types of cone receptors for photopic vision: red for
long wavelength, green for medium wavelength and blue for short wavelength. The peak
sensitivity of these three cone types varies. The green-absorbing cones are
approximately 5% more sensitive than the red-absorbing cones. Both the green-absorbing
and red-absorbing cones are about 2900% more sensitive than the blue-absorbing
cones [MPFL96].
2.1.6 Dark Adaptation and Motion Perception
As mentioned, the HVS is sensitive to thousands of color shades and intensities.
However, at any one instant, the HVS can only sense a small range of intensities. The
neural system gradually adjusts this dynamic range to match the ambient light.
An example given by Rogowitz [Rog92] demonstrates this process.
When you enter a movie theater, at first everything is dark with minimal fine detail.
Then the HVS gradually re-adjusts the dynamic range of intensity so that the image
becomes more apparent. This phenomenon is known as 'dark adaptation' [Cor70]. The
dark adaptation process indicates that the HVS is very slow in reacting to dramatic
intensity change26.
This phenomenon explains a motion perception effect known as forward masking27. According
to Mitchell, Pennebaker, Fogg and LeGall [MPFL96], forward masking occurs when
there is a sudden scene change. This scene change can occur globally for the entire
image or locally for just a region of the image. At the instant of the scene change, the
eye cannot immediately re-adjust itself to all of the changes in intensity. As the scene is
perceived by the eye for a period of time, the HVS starts to pick up the details of
the scene. In other words, very fine details will not be visible immediately after a
scene change. If the details are available to the viewer for a certain duration, the
viewer will be able to slowly perceive all the details. However, if the duration between
successive scene changes is too short, the fine details in the scene will not be perceivable.
In other words, the fine detail is masked out. However, the HVS can still
perceive information such as global intensity, overall contrast, and motion information.
26 Compared with the standard 29.97 frames per second refresh rate for display, the dark adaptation process is very slow. (In North America and Japan, 29.97 frames/s is the NTSC TV broadcasting standard frame rate. It is also called the "real-time" video transmission rate.)
27 It is also known as "temporal masking".
2.1.7 Sequential Perception
The human visual system has a continuous response, due to its ability to persist an image
for a short duration [HPN97] m 5 ]. As a result, the video display media only need to
regenerate the image at a moderate frequency in order to produce a continuous effect. If
the display is refreshed at a lower rate, display flicker will be produced and the image
will appear shifted and discontinuous.
Research [DeL58] [Kel61b] has been carried out on the perception of
flicker at various luminances, which led to the term critical flicker frequency (CFF).
Critical flicker frequency is the minimum frequency required in regenerating the display
stimuli in order to create a 'continuous' effect for the HVS.
Rogowitz's experimental observation [Rog83] in Figure 2-9 shows that the CFF increases
as the luminance of the stimuli increases. In other words, the duration of vision
persistence decreases as the intensity of the scene increases. A scene with brighter details
will need to be refreshed more frequently than one with darker details in order to
maintain the same continuous effect. Also, Rogowitz's result shows that the CFF increases
as the size of the stimuli increases.28
The standard analog color TV system used in North America is NTSC. Under the
NTSC system, the image is refreshed at a rate of 29.97 frames per second with a video
resolution of 480 pels by 480 lines. In commercial video display, 30 frames per second is
considered 'real-time' performance.
28 In the plot, the size of the object is measured in units of visual angle.
Figure 2-9 Critical Flicker Frequencies [Rog83]
2.2 Digital Image Compression
2.2.1 Data Redundancy
The purpose of image compression is to remove redundancies that take up valuable
storage space and transmission time. These data redundancies make no contribution to the
quality of the image and carry no new information. In general, they can be classified as
spatial, coding, psychovisual and temporal redundancies, as shown in Table 2-1.
Table 2-1 Digital Data Redundancies
Redundancies    Descriptions
Spatial29       Information redundancy (or repetition) between pixels within the same image frame.
Coding          Information redundancy (or repetition) within the series of codes that represent the image.
Psychovisual    Information that represents detail which is not perceivable by the HVS.
Temporal        Information redundancy (or repetition) of the same pixel between successive frames.
29 Spatial redundancy is also known as inter-pixel redundancy or geometric redundancy.
2.2.2 Fundamentals of Image Compression
In general, an image can be stored in a compressed or uncompressed format. For an
uncompressed image, the data size (or file size) is approximately equal to the product of
height, width and color depth, plus the image header.
uncompressed data size = height x width x color depth + image header
where height = vertical resolution (also known as rows), in pixels
width = horizontal resolution (also known as columns), in pixels
color depth = data size used to represent each pixel, in bits
per pixel or bytes per pixel
image header = overhead data used to store information about the
image (e.g. resolution, color depth, remarks, etc.),
in bits or bytes
For digitally compressed formats, an image can be classified into two broad categories:
lossless and lossy, as shown in Figure 2-10. Both categories involve compressing and
encoding processes, which focus on the removal of the spatial, coding, and psychovisual
redundancies.
Figure 2-10 Image Format
The compressed data size is the total data size after compression and encoding, plus the
image header. The compression ratio of the image is defined as the ratio of the
uncompressed data size to the compressed data size.
$$\text{compression ratio} = \frac{\text{uncompressed data size}}{\text{compressed data size}}$$
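The two size definitions can be combined in a short Python sketch. The function names, the 512 x 512 image and the compressed size are hypothetical values chosen only for illustration.

```python
def uncompressed_size_bits(height, width, bits_per_pixel, header_bits=0):
    """height x width x color depth, plus the image header, all in bits."""
    return height * width * bits_per_pixel + header_bits


def compression_ratio(uncompressed_size, compressed_size):
    """Ratio of the uncompressed data size to the compressed data size."""
    return uncompressed_size / compressed_size


if __name__ == "__main__":
    # Hypothetical 512 x 512 monochrome image at 8 bits per pixel, header ignored.
    raw = uncompressed_size_bits(512, 512, 8)
    # Hypothetical compressed size of 262144 bits for the same image.
    print(raw, compression_ratio(raw, 262144))
```

Both sizes must be expressed in the same unit (bits or bytes) for the ratio to be meaningful.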
2.2.3 Lossless Compression/Encoding
Lossless compression allows reversible and error-free compression. There is no loss of
image quality and all image data are preserved. The process does not include any
quantization. Therefore, the reconstructed image is exactly the same as the original
image. In terms of removal of redundancy, lossless compression/encoding techniques
mainly target spatial and coding redundancies. The reduction of data size is
achieved only by the compression/encoding technique itself. Typical examples of
compression/encoding techniques30 include:
Lempel Ziv Welch compression, LZW
Variable Length Coding, VLC (e.g. modified Huffman encoding, Binary Shift code, arithmetic coding, etc.)
Bit-Plane Coding
Run Length Encoding, RLE
Predictive Coding
These compression/encoding techniques are employed in file formats such as BMP,
TIFF, GIF, PCX, TGA, etc. These file formats can also store images uncompressed.
The compression ratio of lossless techniques depends on the characteristics of the image
and is not adjustable31. Appendix A provides a feature comparison and compression
ratio statistics for various image formats. Note that the compression ratios of these
lossless compressions can seldom exceed 2.6:1. (See Table A-3 in Appendix A.)
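To illustrate what "reversible and error-free" means in practice, the following Python sketch implements a minimal run length encoder and decoder for one row of pixel values. It is a generic textbook form of RLE, not the variant used by any particular file format named above, and the function names are hypothetical.

```python
def rle_encode(pixels):
    """Collapse consecutive equal values into (value, run length) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1            # extend the current run
        else:
            runs.append([value, 1])     # start a new run
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, run length) pairs back into the original sequence."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


if __name__ == "__main__":
    row = [0, 0, 0, 255, 255, 0]
    encoded = rle_encode(row)
    print(encoded)
    print(rle_decode(encoded) == row)
```

The decoded row is identical to the input, and the achievable saving depends entirely on how long the runs in the image are, which is consistent with the compression ratio being a property of the image rather than an adjustable setting.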
30 For more information on the listed compression and encoding techniques, please refer to Digital Image Processing by Gonzalez and Woods [GW92].
31 In comparison, the compression ratio of lossy JPEG compression can be adjusted through the quantization factor input by the user, which indicates the quality level of the reconstructed image.
2.2.4 Lossy Compression/Encoding
In contrast, lossy compression is an irreversible process in which non-essential details of the
image are truncated. Since the process includes quantization, the reconstructed image
deviates from the original image. In terms of redundancy removal, lossy
compression/encoding techniques mainly target spatial, coding, and
psychovisual redundancies. In compensation for the loss in image quality,
lossy compression gives a much higher compression ratio. Lossy compression for still
images is usually capable of achieving a compression ratio of 25:1 or more with still
acceptable quality. Typical examples of lossy compression32 are:
Joint Photographic Experts Group, JPEG33
Lossy predictive coding
2.2.5 Lossy DCT-based Compression
Since this thesis focuses only on image quality assessment for lossy DCT-based
compressed data, a description of the DCT-based compression algorithms for still images and
video is provided below.
2.2.5.1 JPEG Compression/Encoding (for Still Image)
Developed through the collaboration of the International Telegraph and Telephone
Consultative Committee (CCITT) and the International Organization for Standardization (ISO),
JPEG is a very popular continuous tone (monochrome and color), still-frame compression
standard [GW92]. Its popularity is mainly due to its ability to maintain a
much higher compression ratio at an acceptable image quality than many
lossless compression techniques. The JPEG algorithm focuses on the removal of
non-essential details, which are psychovisually not perceivable to the human vision system.
Figure 2-11 shows the block diagram of the JPEG compression procedures. The JPEG
compression algorithm involves a compressor and an encoder. The compressor consists
of three sequential steps: $-2^{n-1}$ level shifting, discrete cosine transformation and
quantization. After compression, the data are encoded by variable length coding.
Please refer to Gonzalez and Woods' book [GW92] for complete explanations of each of
these steps.
32 For more information on the listed compression and encoding technique, please refer to Digital Image Processing by Gonzalez and Woods [GW92].
33 JPEG compression also has a new lossless version known as "JPEG-LS".
Figure 2-11 Block Diagram of JPEG Compression Procedures
[Block diagram: the Compressor (level shifting, DCT, quantization) feeds the Encoder (variable length coding), which outputs the compressed image.]
During compression, the image is first divided into 8 x 8 subimage blocks. Then, each of
these subimage blocks is level shifted, transformed and quantized individually.
The quantization process truncates details according to the pre-set quantization
factor. Consequently, a series of 8 x 8
patterned artifacts is created, which is defined as 'blockiness' in this document. This
blockiness is recognized as deteriorated detail. The higher the compression ratio, the
more noticeable the blockiness effect becomes. As a result, the image quality is subjectively
rated lower.
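The three compressor steps described above can be sketched for a single 8 x 8 block as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, the DCT is written in its orthonormal form, and a single uniform quantization step (`step`) stands in for the quantization table and quantization factor of a real JPEG coder.

```python
import math

N = 8  # subimage block size used by JPEG


def dct_2d(block):
    """Forward 2-D DCT (orthonormal DCT-II) of an N x N block."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def compress_block(block, step, bits=8):
    """Level shift, transform and quantize one 8 x 8 block.

    `step` is a hypothetical uniform quantization step (an assumption);
    a real JPEG coder uses an 8 x 8 quantization table instead.
    """
    shift = 2 ** (bits - 1)  # the 2^(n-1) level shift
    shifted = [[p - shift for p in row] for row in block]
    coeffs = dct_2d(shifted)
    return [[round(c / step) for c in row] for row in coeffs]


def count_nonzero(quantized):
    """Number of coefficients that survive quantization."""
    return sum(1 for row in quantized for c in row if c != 0)


# A smooth horizontal ramp: 100, 102, ..., 114 in every row.
ramp = [[100 + 2 * y for y in range(N)] for _ in range(N)]
fine = compress_block(ramp, step=4)      # mild quantization
coarse = compress_block(ramp, step=200)  # heavy quantization
print(count_nonzero(fine), count_nonzero(coarse))
```

With the coarse step, only the DC coefficient of the ramp survives, so the decoded block would be a flat square of constant intensity. This is the mechanism behind the blockiness artifact.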
2.2.5.2 MPEG Compression/Encoding (for Video)
MPEG is a series of digital audio and visual data compression standards established by the
Moving Picture Experts Group, which was formed under the auspices of the International
Organization for Standardization (ISO) and the International Electrotechnical Commission
(IEC). It is an international compression standard that is widely employed by the digital
video broadcasting, telecommunication and digital storage media industries, among others. Table
2-2 shows the series of MPEG standards. Each version of the MPEG standards is designed
for applications with a specific data transmission rate.
Table 2-2 MPEG Version List

MPEG-1 (ISO/IEC 11172, in 5 parts)
  Status: Finalized in Nov. 1992
  Optimal Transmission Rate: Intermediate data rate (1.5 Mbps)34
  Applications: For storage and retrieval of moving pictures and audio on storage media (progressive or non-interlaced frames only) - CD-ROM and high quality compact disk; e.g., 352 x 240 at 30 fps or 352 x 288 at 25 fps with VHS quality

MPEG-2 (ISO/IEC 13818, in ... parts)
  Status: Finalized in Nov. 1994 (see footnote 36)
  Optimal Transmission Rate: High bit rate and high resolution definition (10 Mbps or more)
  Applications: For digital television (with field-interlaced frames) - e.g. telecommunications, digital TV broadcasting, interactive television and 3D stereoscopic television, HDTV35; e.g., 720 x 485 studio quality CCIR-601 images at up to 15 Mbits/sec

MPEG-3
  Status: Merged into MPEG-2 (see footnote 37)
  Optimal Transmission Rate: -
  Applications: HDTV

MPEG-4 (ISO/IEC 14496, in 6 parts)
  Status: Version 1 finalized in Oct. 1998; Version 2 targeted Dec. 1999
  Optimal Transmission Rate: Low bit rate, but content-based interactive (64 Kbps or less)
  Applications: For interactive multimedia applications - telecommunications, error-prone wireless networks. Version 1: interactive video on CD-ROM and Digital Television. Version 2: extension of version 1, plus full backward compatibility

MPEG-7
  Status: Targeted Fall 2001
  Optimal Transmission Rate: -
  Applications: For multimedia content description interface - e.g., digital libraries (image catalog, musical dictionary, . . .), multimedia directory services (e.g. yellow pages), broadcast media selection (radio channel, TV channel, . . .) and multimedia editing (personalized news service, . . .)

34 MPEG-1 is intended for video coding at bitrates of about 1.2 Mbps, plus stereo audio coding at bitrates of about 250 kbps.
35 A typical example of HDTV is 1920 x 1080 at 30 Hz with an uncompressed bit rate of approx. 1.5 Gbps.
36 Although MPEG-2 was finalized in 1994, the standard is still constantly modified.
37 The initial design for MPEG-3 was for high bit rate and high-resolution applications. As MPEG-2 development moved along, it was realized that this could be achieved by a minor extension of MPEG-2. Therefore, MPEG-3 was merged into MPEG-2.
The MPEG algorithm can be divided into 3 main procedures: intra-frame compression,
inter-frame compression, and encoding, as shown in Figure 2-12. The MPEG compression
algorithm is more complicated than the JPEG compression
algorithm. In MPEG compression, the video stream is divided into a series of
groups of pictures (GOP) as shown in Figure 2-13. Each group of pictures consists of
three types of frames: intra-frame (I-frame), forward predicted frame (P-frame), and
bi-directional predicted frame (B-frame).
Each frame is further divided into 16 x 16 macroblocks and 8 x 8 subimage blocks as
shown in Figure 2-13. If the video is monochrome, each macroblock consists of four
8 x 8 luminance subimage blocks. Otherwise, if the video is in color, each macroblock
consists of four 8 x 8 luminance subimage blocks, one 8 x 8 Cr chrominance
subimage block and one 8 x 8 Cb chrominance subimage block, as shown in the figure.
Figure 2-12 Block Diagram of MPEG Compression Procedures
[Block diagram: the Intra-Frame Compressor (I-frame) and the Inter-Frame Compressor (P-frame), each ending in quantization, feed the Encoder (variable length coding and frame packing), which outputs the compressed data.]
Figure 2-13 Group of Pictures, Slice, Macroblocks & Subimage Blocks
[Diagram: a video as a sequence of pictures divided into groups of pictures (GOP); each picture is divided into 16 x 16 macroblocks, each consisting of four 8 x 8 luminance blocks, one 8 x 8 Cr chrominance block and one 8 x 8 Cb chrominance block.]
Similar to JPEG, MPEG compression focuses on the removal of spatial, coding and
psychovisual redundancies. In addition, it also targets the removal of temporal
redundancy.
In intra-frame compression, the I-frame is processed as a still image. Each subimage
block is individually processed in a way very similar to the JPEG algorithm.
Inter-frame compression focuses on the removal of temporal redundancy between frames. In a
sequence of video frames, there is very little change in details between frames. The
change in details is often due to a shift in the position of details in the image. Therefore,
MPEG employs a process called motion estimation for its inter-frame compression. This
process determines the motion displacement vector for each 16 x 16 macroblock by
matching the macroblock of the current frame with that of its reference frames. These
motion vectors are then transformed, quantized and encoded. Since the MPEG algorithm
also processes the image in blocks, the blockiness artifact still persists.
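The macroblock matching behind motion estimation can be illustrated with a small exhaustive search. This is a hedged sketch, not the procedure mandated by the MPEG standards: the sum of absolute differences (SAD) as matching cost, the search range of 4 pixels and all function names are assumptions chosen for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(cur[bx + r][by + c] - ref[bx + dx + r][by + dy + c])
    return total


def motion_vector(cur, ref, bx, by, size=16, search=4):
    """Exhaustive search for the displacement with the smallest SAD."""
    rows, cols = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, size)
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + size <= rows
                    and 0 <= by + dy and by + dy + size <= cols):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Reference frame with a non-repeating synthetic texture; the current frame
# is the same content shifted down by 2 rows and right by 3 columns.
ref = [[(7 * r * r + 13 * c * c + r * c) % 256 for c in range(32)]
       for r in range(32)]
cur = [[ref[(r - 2) % 32][(c - 3) % 32] for c in range(32)] for r in range(32)]

vector, cost = motion_vector(cur, ref, bx=8, by=8)
print(vector, cost)  # (-2, -3) 0
```

The recovered vector points from the macroblock in the current frame back to its matching position in the reference frame, which is why a downward and rightward shift appears as (-2, -3).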
Chapter 3 Fidelity Assessment and Criteria
In many dictionaries, the word fidelity has the meaning of faithfulness, loyalty, accuracy
and exactness. In image quality measurement, the fidelity of a reconstructed image is
defined as how similar the reconstructed image is to the original image.
Consequently, a fidelity criterion is a standard or tool used to measure this similarity, and
fidelity assessment is the process of evaluating using the criterion. The definition of a
fidelity criterion is extremely important because it determines the success of any
experiment and algorithm.
This chapter will cover two general classes of fidelity criteria: objective and subjective.
Also, some subjective and objective fidelity criteria commonly used for image quality
assessment will be provided.
3.1 Subjective Fidelity Criteria
A subjective fidelity criterion is the standard for assessing the "goodness" of the test
object based on the subjective judgement of a human observer. A typical example of
subjective assessment can be demonstrated by the orange sorting process in the grocery
store as follows.
The worker opens up the boxes of oranges with various sizes and freshness, and has to
sort these oranges into two grades. The "good" quality orange will be rated as Grade #1
and sold for $1.49/lb, whereas the "not-as-good" quality orange will be rated as Grade #2
and sold for $0.99/lb. The criteria of the sorting process are the properties of the test
object (in this case, the orange), such as size and freshness. At the end of the
sorting process, each orange will be rated as Grade #1 or Grade #2.
Even assuming the assessment process is carried out under constant external conditions (such
as lighting, disturbance, etc.), the assessment result is still often highly variable depending
on the background of the evaluator, such as age, sex, vision, education, experience,
career, etc. For example, the storeowner will have a much higher tolerance for
imperfections in the orange than a store worker. The assessment result is not just
inconsistent from person to person. Sometimes the results generated by the same person
are inconsistent from time to time. The rating of a test object is often affected by the
qualities of the preceding test objects. This phenomenon is known as adaptation [AC72].
For example, assume orange A and orange B have identical freshness and weight.
If orange A is assessed after a sequence of extremely high quality oranges and
orange B is assessed after a sequence of nearly rotten oranges, it is very likely that
A will be rated as Grade #2 and B will be rated as Grade #1.
3.1.1 Subjective Evaluation
Subjective evaluation can be either on a "Go-or-No-Go" basis or on a scaled basis. A
"Go-or-No-Go" evaluation is just like the evaluation in the orange sorting process. The
test object is either rated as "good" quality (Grade #1) or "not-as-good" quality (Grade
#2). In contrast, for scaled evaluation, the rating can be made either on an absolute scale
or by means of side-by-side comparisons, as shown in Table 3-1.
Table 3-1 Scaled Evaluations
a) Absolute Scale [...60]
Value  Rating     Description
1      Excellent  An image of extremely high quality, as good as you could desire.
2      Fine       An image of high quality, providing enjoyable viewing. Interference38 is not objectionable.
3      Passable   An image of acceptable quality. Interference is not objectionable.
4      Marginal   An image of poor quality; you wish you could improve it. Interference is somewhat objectionable.
5      Inferior   A very poor image, but you could watch it. Objectionable interference is definitely present.
6      Unusable   An image so bad that you could not watch it.39
b) Side-By-Side Comparisons [GW92]
{much worse, worse, slightly worse, the same, slightly better, better, much better}
38 The word 'interference' refers to noise, distortion and artifacts.
39 Image quality of this rating is totally unacceptable. Viewers will start to feel annoyed when watching images of this quality.
3.1.2 Subjective Assessment Methodology
As suggested by K. T. Tan et al. [TGP98], subjective assessment can be divided into
three classes: single stimulus method, comparison method, and double stimulus method,
as depicted in Figure 3-1.
Figure 3-1 Subjective Assessment Methodology
[Diagram with three panels: a) Single Stimulus Method, b) Comparison Method, c) Double Stimulus Method, each showing the order in which test objects are presented and rated.]
* Note: A, B, C, D, . . . and Ref. refer to the presentation of
test objects A, B, C, D, . . . and the reference object.
3.1.2.1 Single Stimulus Method
In the single stimulus method, the subject is presented with one test object at a
time. At the end of each presentation, the subject is asked to give a rating for the test
object. The same procedure is repeated until all test objects have been presented. In the single
stimulus method, the subject does not refer back to previous assessment results for
reference.
This method is typically used in experiments in which it is difficult to assess more than
one stimulus at a time (e.g. audio and video assessments) or when the assessment time
permitted is limited. The phenomenon of adaptation will tend to have a significant effect
on the test results. A typical example of the single stimulus method is the orange sorting
method given in the previous section.
3.1.2.2 Comparison Method
In the comparison method, all test objects are presented to the subject at the same time.
During the presentation, the subject will have the opportunity to compare and sort the
qualities of all test objects. The subject will rate the objects after they have been sorted.
In this method, the effect of adaptation is least significant.
3.1.2.3 Double Stimulus Method
Similar to the single stimulus method, the test objects of the double stimulus method are
presented in a sequence. But in each presentation, a constant reference object is also
presented at the same time. The subject is not informed which is the reference object
and is required to give ratings for both the reference and test objects.
This method is very popular in video assessment. It is relatively more time consuming,
but the result is less affected by adaptation and more reliable than that of the single stimulus method.
3.2 Objective Fidelity Criteria
An objective fidelity criterion is the standard for assessing the quality based on
quantitative measurements obtained from the test object. In most cases, if the assessment
process involves more than one criterion, it is represented in the form of a mathematical
model.
Using the same orange sorting process as an example, in this case, the decision-making
process is no longer carried out by the worker. The responsibility of the worker is limited
to the acquisition of the relevant criteria data such as weight (for size) and packaged date
(for freshness). He will enter these data into the mathematical model as shown in Figure
3-2. If the value generated by the model is above a certain threshold, the orange will be
rated as Grade #1. Otherwise, the orange will be rated as Grade #2.
Figure 3-2 Example of Objective Assessment
[Diagram: the criteria data (weight and package date) enter the mathematical model; threshold testing of the model output yields Grade #1 or Grade #2.]
In image quality assessment, the objective fidelity criteria commonly used are shown in
Table 3-2. These criteria are typically used to measure numerically the
difference between the original and reconstructed images. However, this numerical difference does not
necessarily represent the difference that the HVS perceives. The next section
demonstrates this in more detail.
Table 3-2 Commonly Used Objective Fidelity Criteria
Error:
$$e(x, y) = \hat{f}(x, y) - f(x, y) \qquad (3\text{-}1)$$
Total Error:
$$e_{total} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right] \qquad (3\text{-}2)$$
Root-Mean-Square Error (root MSE):
$$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^{2} \right]^{1/2} \qquad (3\text{-}3)$$
Root-Mean-Square signal to noise ratio (root SNR):
$$SNR_{rms} = \left[ \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat{f}(x, y)^{2}}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat{f}(x, y) - f(x, y) \right]^{2}} \right]^{1/2} \qquad (3\text{-}4)$$
where $f(x, y)$ is the pixel value of the original image and $\hat{f}(x, y)$ is the pixel value of the reconstructed image.
3.3 Using An Error Based Criterion as an Image Quality Indicator
Previous research has shown that the pattern sensitivity of humans plays a very important
role in human visual perception [BK98][Wan95]. Details with a specific pattern or
structure tend to be more perceivable than details that are distributed randomly.
This section uses two examples to demonstrate the effect of patterned artifacts and to
show the ineffectiveness of using MSE as a quality indicator for DCT-based compressed
images. The first example shows how noise becomes more noticeable as it
becomes structured. A very similar example was also given in van den Branden
Lambrecht and Kunt's research [BK98]. The second example provides evidence that
MSE is ineffective when it is used to indicate the quality of a DCT-based compressed
image.
3.3.1 Effect of Patterned/Structured Artifact
Figure 3-3a) shows an image polluted with random 'Speckle' noise, and b) and c) show
the same image with horizontally structured 'Speckle' noise of frequency 7.5 and 5.0 cpd
(cycles per visual degree)40 respectively.
The horizontally structured artifact in b) and c) makes the noise more noticeable
and annoying than that in a), even though all three images have the same root MSE of
approximately 27 grayscale levels. As the spatial frequency of the horizontal artifact
decreases in c), the annoyance and perceivability of the noise
increase. This increasing sensitivity to patterns corresponds with the
contrast sensitivity behavior of the HVS as mentioned in Chapter 2. As shown in Figure
2-8, the contrast sensitivity of the HVS decreases as the spatial frequency increases.
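That two very different noise patterns can share exactly the same root MSE can be checked numerically. The sketch below is only an illustration on a synthetic flat gray image, not a reproduction of Figure 3-3: the image size, the amplitude of 27 grayscale levels applied as a plus or minus offset, and the 4-row stripe width are all assumptions.

```python
import math
import random


def root_mse(original, reconstructed):
    """Root-mean-square error between two equally sized grayscale images."""
    rows, cols = len(original), len(original[0])
    squared = sum((reconstructed[x][y] - original[x][y]) ** 2
                  for x in range(rows) for y in range(cols))
    return math.sqrt(squared / (rows * cols))


ROWS, COLS, AMPLITUDE = 64, 64, 27
original = [[128] * COLS for _ in range(ROWS)]  # flat mid-gray test image

# Unstructured noise: each pixel is offset by +27 or -27 at random.
rng = random.Random(0)
random_noise = [[128 + rng.choice((-AMPLITUDE, AMPLITUDE)) for _ in range(COLS)]
                for _ in range(ROWS)]

# Structured noise: the same offsets arranged as horizontal stripes
# (4 rows brighter, then 4 rows darker).
striped_noise = [[128 + (AMPLITUDE if (x // 4) % 2 == 0 else -AMPLITUDE)] * COLS
                 for x in range(ROWS)]

print(root_mse(original, random_noise), root_mse(original, striped_noise))
# 27.0 27.0
```

Both images differ from the original by exactly 27 grayscale levels at every pixel, so an error-based criterion cannot distinguish the random pattern from the striped one.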
Figure 3-3 Structured Artifact
a) Random noise b) Structured noise at 7.5 cpd c) Structured noise at 5.0 cpd
40 The visual angle calculation is based on a viewing distance of 18 inches between viewer and image. At an 18-inch viewing distance, 1 inch of image dimension corresponds to 3.18 degrees of visual angle. For a more detailed explanation of the calculation of visual angle, please refer to Section 2.1.
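The 3.18 degree figure can be reproduced with the usual geometric relation between object size and viewing distance. This is a sketch assuming the image is viewed head-on and that the angle is computed as twice the arctangent of half the size over the distance; the function name is hypothetical.

```python
import math


def visual_angle_deg(size, distance):
    """Visual angle in degrees subtended by an object of `size` viewed
    head-on from `distance` (both in the same units)."""
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


angle = visual_angle_deg(1.0, 18.0)
print(round(angle, 2))  # 3.18
```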
3.3.2 Using MSE as a Quality Indicator for Lossy DCT-based
Compressed Image
MSE is widely used in evaluating image quality mainly because of its simplicity. It is a
good quality indicator for images with random noise, but (as shown by the previous
example) it is not a good indicator for images with structured artifacts. In lossy
DCT-based compression, a structured artifact, namely blockiness, is created as a by-product of
the compression. This blockiness effect makes the quality deterioration of the image
more noticeable and annoying to the HVS.
Also, MSE is not a good indicator of the quality deterioration caused by the quantization
of details in lossy compression. Figure 3-4a) and b) show images polluted with random
'Salt & Pepper' and 'Speckle' noise respectively, and c) and d) show images
compressed by the JPEG algorithm. They all have the same root MSE of approximately 15
grayscale levels. For Figure 3-4a) and b), it is still acceptable to say that they have
a similar quality level. However, comparing the qualities of a), b) and c), viewers will find
that the image in c) is so highly distorted that its quality is much lower than that of a)
and/or b).
Moreover, the image of a rose in d) shows a compressed result in which the compression
causes a significant deterioration of quality. The detail in the image is no longer
recognizable. (The original uncompressed version of the 'Rose' is available in Appendix
B.) It is unreasonable to say that c) and d) have the same quality, even though they
have the same root MSE.
Therefore, the MSE is not a good criterion for image quality evaluation. In the next
chapter, a mathematical model using more sophisticated criteria is proposed. It targets
compressed images containing structured artifacts and takes into account the characteristics of the
human assessment process.
Figure 3-4 Structured Artifact caused by Lossy DCT-based Compression
a) Random "Salt & Pepper" noise b) Random "Speckle" noise
c) JPEG compressed image d) JPEG compressed image (rose)
Chapter 4 POIQE Model Design
As mentioned in Chapter 1, the proposed quality evaluator, namely the Psychovisually-Based
Objective Image Quality Evaluator (POIQE), is a mathematical model that imitates the
subjective human evaluation of image quality. This evaluator objectively yields an
evaluation index as the output of the model. In simulating the subjective human
evaluation, some of the major concerns are:
How does a human observer judge the image quality? What are the criteria
that the human observer used during the judging process?
Is the observer particularly more sensitive to any specific distortion in the
image?
When a human observer looks at an image transmitted through the Internet or a
teleconference session, the first thing he will do is to identify or match the details shown
in the image with the objects that he has seen before in his everyday life. The quality
assessment of the transmitted image will be based on how recognizable these
details are. In general, the details that the observer tries to match can be divided into two
categories: deteriorated detail and genuine detail.
The deteriorated details are usually noise and artifacts that are embedded in the data as
by-products of the transmission or compression process. Typical examples are noise and
blockiness. The deteriorated details only occur in the reconstructed image, not in the
original image. It is very common that a human observer's attention will be drawn towards
these deteriorated details in the image41, especially towards artifacts with a specific
pattern or structure. The judgement process is based on the occurrences and noticeable
levels of the deteriorated details that are perceived. The higher the occurrence and noticeable
level of the deteriorated details, the lower the image quality becomes.
41 It is well known that the human visual system is very sensitive to distortions and artifacts.
The genuine details are the original details that the original image conveys. The more
details are recognized, the higher the image quality will be rated. For example, Ivy
sends her sister Vicky a picture of herself that was taken during a vacation by electronic
mail. She scanned in the image and stored it in JPEG format. When Vicky receives the
picture, she does not have the original image to compare with. Instinctively, the first
thing that Vicky does is to identify what is in the picture. To do that, she will search
through her huge database of memory, and try to look for a match for each detail in the
picture. Of course, the first detail that she recognizes is her sister Ivy. Then, finding a
match for a sandy beach scene in her memory, she realizes that the picture was taken at a
beautiful beach. She also realizes that Ivy is trying to show her a magazine in her hands.
Then, she will try to recognize the characters and the cover girl on the magazine to find
out what Ivy is trying to show. During the entire process, if it takes Vicky a long time to
recognize the details, she will rate the quality of the picture down42.
Based on these characteristics of the assessment process, the POIQE is designed to
robustly weigh the deteriorated and genuine details, based on their occurrence and
perceivable level to the human vision system. In comparison to human subjective
evaluation, the advantages of this evaluator are its consistency, cost-efficiency and high-
speed processing capability. Most importantly, this evaluator can be integrated into the
digital data compression algorithm. In this chapter, a mathematical model is proposed to
resemble the human quality evaluation based on the measurement of the deteriorated and
genuine details.
42 Of course, one might say the quality assessment of an image is also based on its properties such as contrast and color representation. But all these properties also define the clarity of the image, which leads to the ease of recognizing the details of the image.
4.1 Mathematical Model
Based on the identification of deteriorated and genuine details in the reconstructed image,
the mathematical model used for POIQE consists of three parts: the Blockiness Evaluator,
the Similarity Identifier and the Merger, as shown in Figure 4-1. The objective fidelity
criteria used in this model are blockiness and similarity. Using the original and
reconstructed images as inputs, the Blockiness Evaluator and Similarity Identifier
generate the Blockiness Index "B" and Similarity Index "S", respectively, for
measuring those fidelity criteria. These index numbers are combined by the Merger
to generate a POIQE Index "P".
Figure 4-1 Block Diagram of POIQE
[Block diagram: the original image and the reconstructed image are inputs to the mathematical model, which outputs the POIQE Index "P".]
The POIQE Index "P" is a number that resembles the human observer's subjective quality
assessment results obtained through experiments43. The value of the index ranges from 0
to 100, where 0 indicates extremely poor image quality and 100 indicates perfect image
quality (i.e., exactly the same as the original).
43 A more detailed description of the experiment procedures and conditions is available in Chapter 5.
4.1.1 Blockiness Evaluator
Designed for the evaluation of DCT-based compressed images, the Blockiness Evaluator
focuses on the measurement of the blockiness artifact44. Figure 4-2 shows an example of
the blockiness artifact compared with the original image (on the left). The reconstructed
image (on the right) has a compression ratio of 51.95:1. As mentioned in Chapter 1,
blockiness is defined as the patterned square artifact created as a by-product of
lossy DCT-based compression. This artifact is clearly shown in the compressed image on
the right.
44 A deteriorated detail generated as a by-product of DCT-based compression.
Figure 4-2 Blockiness Artifact
4.1.1.1 Definition of Blockiness
In the JPEG and MPEG algorithms, the discrete cosine transformation is carried out
individually for each 8 x 8 subimage block using the forward and inverse 2-D DCT
formulas provided in equations (4-1) and (4-2).
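Assume the entire image has a size of X x Y pixels.
Let $f(x, y)$ = Image intensity represented in space domain; for x = 0, 1, 2, ..., X and y = 0, 1, 2, ..., Y
$C(u, v)$ = Image intensity represented in spatial-frequency domain; for u = 0, 1, 2, ..., X and v = 0, 1, 2, ..., Y
Forward 2-D DCT for each subimage block:
$$C(u, v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right] \qquad (4\text{-}1)$$
Inverse 2-D DCT for each subimage block:
$$f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, C(u, v) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right] \qquad (4\text{-}2)$$
where $\alpha(u) = \sqrt{1/N}$ for $u = 0$ and $\alpha(u) = \sqrt{2/N}$ for $u = 1, 2, ..., N-1$ (similarly for $\alpha(v)$), and
N = Subimage size (for example, if the subimage size is 8 x 8 pixels, N is equal to 8).
In the spatial-frequency domain, if both u and v are equal to zero (the top left corner of the
subimage block), equation (4-1) becomes:
$$C(0, 0) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) = N f_{avg} \qquad (4\text{-}3)$$
where $f_{avg}$ = Average of all elements within the 8 x 8 subimage block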
C(0, 0) is also known as the DC coefficient, whereas the rest of the coefficients are known
as AC coefficients45. As illustrated in Chapter 2, the perceivable level of details in a
pattern is based on its contrast relative to its neighbors. Since the DC coefficient reflects
the average intensity of all the pixels in the subimage block, the blockiness effect can be
extracted in terms of the difference in luminance of the current subimage relative to that
of its neighbors.
45 In the electrical engineering literature, AC and DC are the acronyms for alternating current and direct current respectively. The term DC also has the meanings of constant and average in the description of signals. In discrete cosine transformation, C(0, 0) is known as the DC coefficient due to its averaging characteristics.
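The averaging property of the DC coefficient can be verified numerically. The sketch below assumes the orthonormal form of the 2-D DCT, under which C(0, 0) equals N times the block average; the function name and the synthetic test block are hypothetical.

```python
import math


def dct_coefficient(block, u, v):
    """Single coefficient C(u, v) of the orthonormal 2-D DCT of an N x N block."""
    n = len(block)

    def alpha(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / n)

    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
    return alpha(u) * alpha(v) * total


# Arbitrary synthetic 8 x 8 block.
block = [[(3 * x + 5 * y) % 17 for y in range(8)] for x in range(8)]
mean = sum(map(sum, block)) / 64.0
dc = dct_coefficient(block, 0, 0)
print(dc, 8 * mean)  # the two values agree up to rounding error

# A perfectly flat block has no AC content.
flat = [[10] * 8 for _ in range(8)]
print(abs(dct_coefficient(flat, 0, 1)) < 1e-9)  # True
```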
4.1.1.2 Methodology in Computing Blockiness
The computation of the Blockiness Index "B" can be explained graphically as shown in Figure
4-3 and Figure 4-4 by letting:
$f_{org}(x, y)$ & $f_{rec}(x, y)$ = Original and reconstructed images represented in space domain respectively; for x = 0, 1, 2, ..., X and y = 0, 1, 2, ..., Y
$C_{org}(u, v)$ & $C_{rec}(u, v)$ = Original and reconstructed images represented in spatial-frequency domain respectively; for u = 0, 1, 2, ..., X and v = 0, 1, 2, ..., Y
$DC_{org}(m, n)$ & $DC_{rec}(m, n)$ = DC coefficients of the original and reconstructed images respectively; for m = 0, 1, 2, ..., X/8 and n = 0, 1, 2, ..., Y/8
where X = Number of Rows (Height)
Y = Number of Columns (Width)
After the level shifting and discrete cosine transformation process, both the original
$C_{org}(u, v)$ and reconstructed $C_{rec}(u, v)$ images are represented in the spatial-frequency domain
with a size of X x Y pixels.
The Blockiness Index B(m, n) is defined as the average absolute difference between the
current DC coefficient and the DC coefficients of its eight neighbors in the reconstructed
image, relative to that of the original image. As shown in Figure 4-3a), the image is
divided into (X x Y)/64 units of 8 x 8 subimage blocks. In each of these subimage blocks,
the top left coefficient is the DC coefficient (shown as a shaded pixel in Figure 4-3a).
The DC coefficient of each subimage block is then extracted and represented as DC(m, n)
as shown in Figure 4-3b). Then, the difference between DC(m, n) and each of its eight
neighbors is computed.
Figure 4-3 Computation of Blockiness Index
a) Image C(u, v) represented in spatial-frequency domain. b) Average absolute difference of current DC coefficient DC(m, n) with its eight neighbors.
The absolute differences between DC(m, n) and its eight neighbors are averaged to obtain
E(m, n). Finally, the Blockiness Index is computed as the absolute difference between the
original $E_{org}(m, n)$ and the reconstructed $E_{rec}(m, n)$ as shown in equation (4-4).46
$$B(m, n) = \left| E_{org}(m, n) - E_{rec}(m, n) \right| \qquad (4\text{-}4)$$
where
$$E_{k}(m, n) = \frac{1}{8} \sum_{i=-1}^{1} \sum_{j=-1}^{1} \left| DC_{k}(m, n) - DC_{k}(m+i, n+j) \right|, \quad k \in \{org, rec\}$$
Figure 4-4 shows a block diagram for the computation of the Blockiness Index for each
subimage block. The Blockiness Indexes B(m, n) of all subimage blocks are then added
together and divided by the total number of subimage blocks to obtain the Blockiness
Index "B" for the whole image.
46 Please note that the Blockiness Index is designed to measure the gradient difference between the current subimage block and its 8 neighbors in terms of 8-bit grayscale. Therefore, Equation (4-4) is not normalized by $E_{org}(m, n)$.
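The computation of the whole-image Blockiness Index can be sketched as follows. This is a minimal illustration under several assumptions rather than the exact implementation: the DC coefficient is obtained from the block mean instead of a full DCT, the level shift is omitted because a constant offset cancels in the differences, image dimensions are multiples of the block size, and blocks on the image border reuse the nearest existing block as their missing neighbors (the treatment of border blocks is not specified in the text). All function names are hypothetical.

```python
def dc_map(image, size=8):
    """DC coefficient of every size x size block, using C(0, 0) = size * mean."""
    rows, cols = len(image) // size, len(image[0]) // size
    return [[sum(image[m * size + r][k * size + c]
                 for r in range(size) for c in range(size)) / size
             for k in range(cols)] for m in range(rows)]


def neighbor_gradient(dc, m, n):
    """Average absolute difference between DC(m, n) and its eight neighbors.

    Assumption: indices outside the image are clamped to the nearest block.
    """
    rows, cols = len(dc), len(dc[0])
    total = 0.0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            mi = min(max(m + i, 0), rows - 1)
            nj = min(max(n + j, 0), cols - 1)
            total += abs(dc[m][n] - dc[mi][nj])
    return total / 8.0  # the (i, j) = (0, 0) term contributes zero


def blockiness_index(original, reconstructed, size=8):
    """Mean over all blocks of |E_org(m, n) - E_rec(m, n)|."""
    dc_org = dc_map(original, size)
    dc_rec = dc_map(reconstructed, size)
    rows, cols = len(dc_org), len(dc_org[0])
    total = sum(abs(neighbor_gradient(dc_org, m, n) - neighbor_gradient(dc_rec, m, n))
                for m in range(rows) for n in range(cols))
    return total / (rows * cols)


# Flat original versus a reconstruction whose 8 x 8 blocks alternate in brightness.
flat = [[100] * 32 for _ in range(32)]
blocky = [[100 + (10 if ((x // 8) + (y // 8)) % 2 else -10) for y in range(32)]
          for x in range(32)]
print(blockiness_index(flat, flat), blockiness_index(flat, blocky))
```

An identical reconstruction yields an index of zero, while the checkerboard of brighter and darker blocks yields a positive index even though no detail inside any block has changed.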
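Figure 4-4 Block Diagram of Blockiness Evaluator
[Block diagram: for both the original and reconstructed images, the DCT coefficients are computed, the DC coefficients are extracted and compared with their neighbors, and the results are combined into the Blockiness Index B(m, n).]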
4.1.2 Similarity Identifier
Many evaluators try to define image quality based on the difference between the
original image and the reconstructed image. Commonly used criteria are the error-based
objective fidelity criteria shown in Table 3-2. However, during the assessment
process, the observer is actually trying to identify details instead of comparing
differences. This is even more obvious when the observer only has access to the
compressed image.
4.1.2.1 Definition of Similarity
Targeting the "detail recognizing behavior" of the observer, the Similarity Identifier
focuses on numerically measuring and identifying the genuine details remaining in the
reconstructed image. In general, similarity can be defined as a measure of
the degree of resemblance when two objects are under comparison. As mentioned in the
example, the comparison takes place between the reconstructed image and the bits and
pieces in the observer's memory. The observer's memory is a huge database accumulated
through experience and time.
4.1.2.2 Methodology in Computing Similarity
Teaching a machine to numerically imitate this matching procedure is almost
impossible due to the lack of this huge database. This is one of the reasons why most
evaluators tend to use the original image as a datum in place of the memory
database of the observer. Compared with implementing a database that
resembles the observer's memory, using the original image as a datum is a much
simpler method. Figure 4-5 provides a block diagram indicating the steps in computing
the similarity of the image: edge detection and matching of genuine details.
Figure 4-5 Block Diagram of Similarity Identifier
[Block diagram: edge detection is applied to both the original and reconstructed images before the genuine details are matched.]
The first step of the Similarity Identifier is to perform edge detection for both the original
and reconstructed images. The purpose of this step is to outline the details in the images.
The principle behind edge detection is very simple. For each pixel in the image, a
gradient magnitude "G(x, y)" is computed using equation (4-6).
As shown in the equation, the gradient magnitude is a measure of the intensity difference
between the current pixel and its diagonal neighbors. If the gradient magnitude increases,
the contrast between the current pixel and its neighbors is bigger (i.e., it is
more likely that the current pixel contains an edge). After the computation of the
gradient magnitude, a threshold "τ" is used to determine whether the gradient magnitude
is large enough to define the current pixel as an edge or not.
$$E(x, y) = \begin{cases} 1 \ \text{(edge)} & \text{if } G(x, y) \ge \tau \\ 0 \ \text{(non-edge)} & \text{if } G(x, y) < \tau \end{cases}$$
E(x, y) is an edge matrix containing only binary values of 0's and 1's, where a 1 represents
an edge pixel and a 0 represents a non-edge pixel.
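The thresholding step can be illustrated with a minimal Python sketch. The exact form of equation (4-6) is not shown in this text, so the sketch assumes a Roberts-cross style gradient built from the two diagonal intensity differences. The function name `edge_matrix`, the list-of-rows image layout, and the treatment of the last row and column are illustrative assumptions, not the thesis's implementation.

```python
def edge_matrix(image, tau):
    """Return a binary edge matrix E for a 2-D grayscale image (list of rows).

    Assumption: G(x, y) is a Roberts-cross style magnitude computed from the
    two diagonal differences; the thesis's equation (4-6) may differ.
    """
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    # The last row and column have no diagonal neighbor and are left as 0
    # (an assumption made only to keep the sketch short).
    for y in range(rows - 1):
        for x in range(cols - 1):
            d1 = image[y][x] - image[y + 1][x + 1]
            d2 = image[y][x + 1] - image[y + 1][x]
            g = (d1 * d1 + d2 * d2) ** 0.5
            # E(x, y) = 1 if G(x, y) >= tau, otherwise 0
            edges[y][x] = 1 if g >= tau else 0
    return edges


# A dark image with a bright square in the lower-right corner.
sample = [[0, 0, 0],
          [0, 100, 100],
          [0, 100, 100]]
sample_edges = edge_matrix(sample, tau=50)
```

With this convention, a uniform image produces an all-zero edge matrix, and raising `tau` keeps only the strongest contrasts.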
After the edge detection is performed on both images, the next step is to identify the
similarity between the two images. Before matching, the first procedure is to filter out
the non-genuine details in the reconstructed image using the original edge matrix as a
masking filter as shown in equation (4-7).
If a genuine detail is detected in the original image (i.e., $E_{org}(x, y) = 1$), then the edge pixel
of the reconstructed image will not change (i.e., $E_{rec,masked}(x, y) = E_{rec}(x, y) \times 1$).
Otherwise, the edge pixel of the reconstructed image will be masked out
(i.e., $E_{rec,masked}(x, y) = E_{rec}(x, y) \times 0 = 0$). The purpose of this masking process is to remove all
false edges that are caused by compression-related artifacts.
Then, the next step is to count the occurrences of edge pixels in the original and
reconstructed images, $\sum_{x}^{X} \sum_{y}^{Y} E(x, y)$. The Similarity Index "S" is the ratio of the
occurrence of the masked edge pixels in the reconstructed image to that of the original
image. Since the edge pixels in the reconstructed image are masked, $\sum_{x}^{X} \sum_{y}^{Y} E_{org}(x, y)$ is
always greater than or equal to $\sum_{x}^{X} \sum_{y}^{Y} E_{rec,masked}(x, y)$. In other words, the Similarity
Index always lies between 0 and 1.
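The masking and counting steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both inputs are binary edge matrices of equal size stored as lists of rows, the name `similarity_index` is hypothetical, and the behavior for an original image with no edge pixels (raising an error) is a choice made here, since the text does not cover that case.

```python
def similarity_index(e_org, e_rec):
    """Ratio of masked reconstructed edge pixels to original edge pixels."""
    org_count = 0
    masked_count = 0
    for row_org, row_rec in zip(e_org, e_rec):
        for o, r in zip(row_org, row_rec):
            org_count += o
            # Masking: a reconstructed edge pixel survives only where the
            # original image also has an edge pixel (r * o).
            masked_count += r * o
    if org_count == 0:
        # Assumption: the index is undefined when the original has no edges.
        raise ValueError("original image contains no edge pixels")
    return masked_count / org_count


e_org = [[1, 1, 0],
         [0, 1, 0]]
e_rec = [[1, 0, 1],
         [0, 1, 1]]
s = similarity_index(e_org, e_rec)  # 2 of the 3 original edge pixels remain
```

The two extra edge pixels in `e_rec` are treated as compression artifacts and do not raise the index, which is why the result can never exceed 1.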
4.1.3 Merger
The purpose of the Merger is to combine the Blockiness Index and the Similarity Index to
generate a POIQE Index that ranges between 0 and 100, resembling the subjective
assessment result. The POIQE Index has a value close to 100 for an image almost the same
as the original, and a value close to 0 for an image of extremely poor quality.
As shown in Figure 4-6, the inputs of the Merger include the Blockiness Index and the
Similarity Index. The Merger consists of 3 steps: Blockiness Index Modification,
Similarity Index Modification and Index Merging.
Since it is known that human subjective evaluation results are often saturated at the
extreme limits [GGPT97] [TGP97] [TGP98], the purpose of the Blockiness Index
Modification and the Similarity Index Modification is to shape the two indexes into
curves that increase gradually at a decreasing rate with respect to the quantization factor, as
shown in Figure 4-7.
Figure 4-6 Block Diagram of Merger
Figure 4-7 Saturation of the Human Subjective Evaluation Result
[Plot: Modified Blockiness Index (or Modified Similarity Index) versus quantization factor]
4.1.3.1 Blockiness Index Modification
The purpose of this step is to modify and clip the Blockiness Index "B" and to output a
Modified Blockiness Index "B_m". The Blockiness Index is modified such that as the
quantization increases, the modified value increases gradually at a decreasing rate. In
order to do so, the Blockiness Index is modified as shown in equation (4-9)⁴⁷. An
increase in the Modified Blockiness Index indicates that the image has higher blockiness
and that its quality decreases.
Let $x = \log_2(B + 0.5)$
4.1.3.2 Similarity Index Modification
Similar to the Blockiness Index Modification, the Similarity Index Modification process
modifies and clips the Similarity Index "S" and outputs a Modified Similarity Index "S_m"
as shown in equation (4-10)⁴⁸. An increase in the Modified Similarity Index indicates that
fewer genuine details are being recognized and that the image quality decreases.
Let $x = \log_2(100 \times (1 - S))$
47 Please refer to Section 4.2 for the development of the equation.
48 Please refer to Section 4.2 for the development of the equation.
4.1.3.3 Merging
Once the Blockiness Index and the Similarity Index have been modified, the Merging
process combines the two values and generates the POIQE Index "P" using
equation (4-11). The POIQE Index is clipped to a value between 5 and 100.
Let $x = 100 - a \times B_m \times S_m$
where a = Tuning parameter
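The merging and clipping step can be sketched as follows. The sketch assumes its inputs have already been modified per equations (4-9) and (4-10), which are not reproduced here; the function name `poiqe_index` is hypothetical, and the default value of the tuning parameter is the one reported in Chapter 5 (53.346).

```python
def poiqe_index(b_m, s_m, a=53.346):
    """Merge the modified indexes and clip the result to the range 5..100.

    b_m, s_m: modified Blockiness and Similarity indexes (assumed inputs).
    a: tuning parameter; the default is the value reported in Chapter 5.
    """
    x = 100.0 - a * b_m * s_m
    # Clip to the lower limit of 5 and the upper limit of 100.
    return max(5.0, min(100.0, x))


best = poiqe_index(0.0, 0.0)      # no blockiness, no lost detail
typical = poiqe_index(1.0, 1.0)   # 100 - 53.346
floor = poiqe_index(2.0, 1.0)     # raw value is negative, so it is clipped
```

Because the two modified indexes are multiplied, the penalty vanishes whenever either index is zero; this is a property of the product form rather than an extra design choice of the sketch.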
4.2 The Characteristic of the Model
Unlike other quality evaluators that focus on the intensity difference between the original
and compressed results, the model investigates the "Similarities" between the original and
compressed results, based on the percentage of details that remain in the compressed
result. This allows the model to compare two images without underestimating the image
quality based on intensity differences that are not perceivable to the HVS. Also, the model
is tailored to evaluate DCT-based compressed images by evaluating the patterned
artifacts called "Blockiness" produced by DCT-based compression.
In addition, in the development of equations (4-9) and (4-10), the parameters were
implicitly determined by matching the human subjective data of the tuning experiments
using the method of "Trial and Error". In Chapter 5, the model is fine-tuned explicitly
using the tuning parameter 'a' of equation (4-11). The selection of this mathematical
approach in tuning the model was mainly driven by the simplicity and efficiency of the
method. In tuning the model, the author tried various mathematical approaches,
including a model represented by a polynomial of degree 6. However, none of these
methods seemed to provide a better model tuning than the "Trial and Error"
method.
Chapter 5 Experimental Tuning and
Validation of the POIQE model
In order to adjust the tuning parameters of the POIQE model proposed in Chapter 4, a
series of experiments on human subjective assessment of image quality is conducted
in this chapter. The details of the experimental procedures and human evaluation results
(i.e., experimental results) are covered in Section 5.1. Then, the human evaluation result
is analyzed and used to tune the model parameters in Section 5.2.
In order to validate the capability of the model, a set of validation experiments is also
conducted. The validation result is obtained by using image sets and subjects that are
completely different from those used to tune the model parameters. A detailed
description of the validation experiment is available in Section 5.3.
5.1 Model Tuning Experiments
5.1.1 Purpose
The purpose of the model tuning experiments on human subjective assessment is to
collect human image evaluation data, which will be used for tuning the parameters
of the proposed POIQE model.
5.1.2 Method and Procedure
As mentioned, one of the advantages of using an objective evaluator is that it gives
consistent image quality evaluation. Therefore, it is very important that the model
parameters are tuned based on consistent experimental results. However, it is well
known that human evaluation is very subjective and adaptive. Therefore, the
experimental method selected has to maintain a certain degree of consistency
in the experimental results in order to build a consistent model.
As described in Section 3.1.2, the comparison method allows the subjects to constantly
refer back to previous evaluation results that they made. As a result, the comparison
method provides data that are relatively more consistent and less affected by the effect of
"adaptation". For this reason, the subjective assessment method chosen in the experiment
is the comparison method with absolute scaled evaluation⁴⁹.
5.1.2.1 Procedure
During the experiment, the image sets are presented to the subject for evaluation
consecutively. All evaluated images are available for comparison at all times. At the
final step of the experiment, the subject is requested to provide an evaluation for each
image based on the absolute scaled evaluation table shown in Table 5-1, which is a
modification of Table 3-1 [PB60] based on the purpose of this research.
For the entire experiment, each subject has to provide evaluations of six sets of images,
each consisting of 27 images⁵⁰. Details of the test image sets are described in Section
5.1.3.
49 Detailed descriptions of the comparison method and absolute scaled evaluation are provided in Section 3.1.2.
50 Each set is composed of 26 reconstructed images at different quantization levels and 1 original image.
Table 5-1 Absolute Scaled Evaluation Table
Columns: Quality Index Range, Average Value, Subjective Rating, Description
Subjective Rating (from best to worst):
Original
Excellent
Good
Satisfactory
Acceptable
Average
Inferior
Bad
Worse
Worst
Unusable
Description (from best to worst):
An image of extremely high quality, as good as you could desire.
An image of excellent quality. Details can be recognized instantly.
An image of acceptable quality. Interference is not objectionable.
An image of acceptable quality. Interference is not objectionable.
An image of poor quality; you wish you could improve it. Interference is somewhat objectionable.
Just barely recognizable.
Details in the image are totally unrecognizable.
5.1.2.2 Experiment Instructions
At the beginning of the experiment, the subject is presented with Table 5-1 and
experiment instructions. The experiment instructions are as follows. (Note that the
experiment instructions described below are provided for the subjects in both oral and
written format.)
Step #1 Group the images of Set #1 into columns of similar qualities, and
sort the columns in descending quality as shown in Figure 5-1.
Figure 5-1 Sorting and Grouping of Set #1 Images
[Diagram: columns of images ordered from high quality to low quality]
Step #2 Repeat the same sorting and grouping process as Step #1 for the
images of Set #2.
Step #3 Match the quality of Set #1 with Set #2, column by column,
as shown in Figure 5-2. (At this time, some columns of Set #1 might
have to be broken up into two and a new column might have to be
created with Set #2 as shown in the figure.)
Figure 5-2 Quality Matching of Set #1 and Set #2 Images
[Diagram: columns of Set #1 and Set #2 images ordered from high quality to low quality]
Step #4 Assign a quality index range value (first column of Table 5-1) to
each column as shown in Figure 5-3.
Figure 5-3 Assigned Group Range
[Diagram: columns of Set #1 and Set #2 images labeled with the ranges 100, 99-90, 89-80, 79-70, 69-60, 59-50, 49-40, 39-30, 29-20, 19-10, 9-0]
Step #5 Repeat the same sorting and grouping process as Step #1 for the
images of Set #3.
Step #6 Match the quality of Set #3 with the previous columns.
Step #7 Repeat Step #5 and Step #6 until all six sets are evaluated.
Step #8 After all six sets of images are sorted and evaluated, assign an
evaluation number to each image based on the evaluation index
range assigned to each column. Repeat this step for all columns
except the last three columns (columns 0-9, 10-19 and 20-29). For
the last three columns, assigning an evaluation number to each image
is not necessary. Simply give all images in the column the average
value⁵¹ (as given in the second column of Table 5-1). For example, give
all images in column 10-19 an evaluation number of 15.
51 Since the average value is given to the group range '9-0', the lowest evaluation rating is '5' (not '0'). This explains why the lower limit of equation (4-11) is '5'.
5.1.3 Experimental Images
The test objects used in this experiment include six sets of images. All images are
monochrome. They are represented in 8-bit grayscale precision and displayed at a pixel
density of 96 pixels per inch (or 20 pixels per visual degree⁵²) in both the horizontal and
vertical directions. These six sets of images are labeled 'Clifford', 'Keys', 'Girl & Apple',
'Lens Cover', 'Rose' and 'Sunglasses'. Refer to Figure B-1 to Figure B-6 to view the
images.
In each set, there is one original uncompressed image and twenty-six compressed images.
The compressed images are developed from the original image using the JPEG
compression algorithm at twenty-six different quantization levels.⁵⁴
Table 5-2 Experiment Image Parameters
Pixel dimension (row x column):   160 x 120
Image type:                       Monochrome
Grayscale precision:              8-bit (0 to 255)
Display pixel density:            96 ppi (or 20 ppd)
Subimage block size:              8 x 8
5.1.3.1 Compressed Data Size and Compression Ratio
Each of the twenty-six compressed images has a different compressed data size and
compression ratio. Figure 5-4 and Figure 5-5 show the plots of the compressed data
size⁵⁵ and the compression ratio for all six sets of test objects.
52 The pixel density can also be represented as pixels per visual degree (ppd), assuming the horizontal distance between the viewer and the object is approximately 12 inches.
53 All sets of images have a pixel dimension of 160 x 120, except for image set 'Rose'. The pixel dimension for 'Rose' is 184 x 96.
54 The quantization level is a parameter that defines how much detail will be truncated during the quantization process as described in Figure 2-11. The twenty-six quantization levels used are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22.5, 25, 27.5, 30, 32.5 and 35.
As the quantization factor increases, more information is
truncated. As a result, the compression achieves a reduction in data size (as shown in
Figure 5-4), which leads to an increase in compression ratio (as shown in Figure 5-5).
As the cost of a better compression ratio, the truncation of details yields a
deterioration of image quality as shown in each set of images in Appendix B. In Figure
5-4, the reduction of data size starts to level off at a quantization factor of approximately
10. This indicates that further truncation of details will not provide any significant
data reduction. In this document, the range of quantization factors from
0 to 10, which shows a steep slope in the compressed size reduction, is considered
the "effective range" of compression.
55 Unlike the measurement suggested in Section 2.2.2, the measurements of the compressed and uncompressed data sizes are obtained exclusive of the image header. Since the image size is not very large, the image header has a significant effect on the analysis of the data reduction. To allow clear analysis of the data reduction due to compression, the image header is excluded.
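The relationship between data size and compression ratio can be illustrated with a short Python sketch. It assumes the usual definition of compression ratio as uncompressed size divided by compressed size (the definition in Section 2.2.2 may be worded differently), with both sizes measured without the image header. The function name and the compressed size of 1920 bytes are hypothetical values chosen only for the example.

```python
def compression_ratio(uncompressed_size, compressed_size):
    """Compression ratio, assuming both sizes exclude the image header."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return uncompressed_size / compressed_size


# A 160 x 120 image at 8 bits per pixel occupies 19200 bytes without a header.
uncompressed = 160 * 120
# 1920 bytes is a hypothetical compressed size, not a measured value.
ratio = compression_ratio(uncompressed, 1920)
```

A smaller compressed size gives a larger ratio, which is why the two plots move in opposite directions as the quantization factor increases.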
Figure 5-4 Plot of Compressed Data Size
Figure 5-5 Plot of Compression Ratio
5.1.4 Subjective Quality Evaluation Data
5.1.4.1 Experiment Subjects
This experiment involved five individual subjects. Each subject performed the
experiment individually according to the procedures described in Section 5.1.2. The experiment
takes about 1½ hours on average to complete for each subject.
5.1.4.2 Evaluation Data
The subjective quality evaluation results are recorded as shown in Table B-1 to Table B-6
of Appendix B. The average values of the five subjects are plotted in Figure 5-6 and
summarized in Table 5-3. Each set of data is approximated with a polynomial function
that provides the best fit of the curve.
All six sets of evaluation results indicate a deterioration of image quality as the
quantization factor increases. However, as shown in Figure 5-6, the rate of deterioration
of quality for each image set is different. For example, the deterioration of 'Lens Cover'
is at a much faster rate than that of 'Girl & Apple'.
Figure 5-6 Plot of Subjective Quality Evaluation
Table 5-3 Average Results of Subjective Quality Evaluation
Quantization factor   Clifford   Keys    Girl & Apple   Lens Cover   Rose    Sunglasses
1                     95.20      94.60   95.00          90.40        90.80   94.60
2                     92.40      90.20   83.80          86.60        89.00   81.60
3                     88.80      82.20   85.00          78.40        86.20   75.20
4                     84.60      81.20   82.60          66.80        84.00   65.80
5                     80.60      72.60   79.40          63.00        71.00   59.80
6                     69.00      61.00   74.40          52.60        64.40   50.60
7                     68.00      45.00   67.40          47.60        59.80   47.20
8                     62.80      41.00   58.60          42.20        55.00   44.00
9                     57.60      51.60   57.40          42.80        52.00   35.40
10                    46.20      39.00   54.00          32.80        45.60   38.20
11                    46.80      34.00   53.40          34.00        39.60   29.20
12                    46.20      33.40   48.80          31.60        32.80   34.40
13                    45.80      22.40   43.80          29.60        38.00   32.00
14                    40.60      26.20   41.20          20.00        35.80   24.20
15                    33.00      22.60   36.00          18.00        26.60   23.40
16                    32.80      16.80   34.20          14.00        22.20   18.20
17                    28.60      22.20   31.60          17.00        25.20   23.80
18                    26.80      15.00   32.20          13.00        18.00   15.00
19                    25.80      15.00   24.00          15.00        15.00   17.40
20                    26.00       7.00   28.80          12.00        12.00    7.00
22.5                  22.60       7.00   23.00          12.00         9.00   11.00
25                    16.80       8.00   15.00           4.00         9.00   13.00
27.5                  16.60       6.00   16.00           4.00         4.00    7.00
30                    12.00       6.00    7.00           5.00         4.00    9.00
32.5                  14.00       6.00   12.00           5.00         4.00    4.00
35                    11.00       6.00    8.00           8.00         4.00    4.00
5.1.4.3 Inconsistency of the Evaluation Data
As mentioned in Section 3.1, human evaluation results are very subjective and inconsistent.
As shown by Table B-1 to Table B-6, the evaluations of the same image given by different
subjects can be quite different. The difference is represented by the standard deviation,
as shown in Figure 5-7. The figure shows that the standard
deviation tends to be highest at a quantization factor of approximately 10. Then, the
deviation starts to decrease and levels off. This indicates that the human evaluation is
least subjective when the image quality is extremely good or extremely bad.
Figure 5-7 Plot of Standard Deviation for Subjective Quality Evaluation
5.1.4.4 Confidence Interval
Since the variance of the HVS evaluation result for the entire population is unknown, the
experimental data is statistically assumed to follow the t-distribution⁵⁶ m 8 9 ], instead
of the normal distribution. The probability of the t-distribution with a 95% confidence is
represented as:
where $\bar{x}$ = Sample mean,
$\mu$ = Population mean,
$t_{\alpha/2}$ = t coefficient of the t-distribution,
$s$ = Sample standard deviation, and
$n$ = Sample size.
Equation (5-1) shows that the confidence interval (C.I.) of the distribution is bounded
within the range defined as:
$\text{C.I.} = \bar{x} \pm \text{half-width}$, (5-2)
where $\text{half-width} = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$. (5-3)
By re-arranging equation (5-3), the minimum sample size (n) required for the result data
to fall within a certain confidence interval range is:
$n = \left( \frac{t_{\alpha/2} \cdot s}{\text{half-width}} \right)^2$
56 The t-distribution is also known as the "Student t-distribution".
At a 95% confidence level, the $t_{\alpha/2}$ coefficient is 2.776. From the experimental data, the
average sample standard deviation is 11.29. Considering a confidence interval with a
half-width of 15, the minimum sample size required is 4.36.
The experimental data is obtained and averaged from evaluations performed by five different
subjects. Therefore, it is reasonable to conclude that the subjective human evaluation
results obtained fall within the 95% confidence level of the population mean μ.
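The sample-size arithmetic above can be checked with a few lines of Python. The sketch rearranges the half-width relation, half-width = t · s / √n, into n = (t · s / half-width)²; the function name `min_sample_size` is illustrative, and the inputs are the values quoted in the text.

```python
import math


def min_sample_size(t_coeff, std_dev, half_width):
    """Minimum sample size from half_width = t_coeff * std_dev / sqrt(n)."""
    return (t_coeff * std_dev / half_width) ** 2


# Values quoted in the text: t coefficient 2.776, average sample standard
# deviation 11.29, and a half-width of 15.
n_required = min_sample_size(2.776, 11.29, 15)
subjects_needed = math.ceil(n_required)  # round up to whole subjects
```

The result is a little under 4.4, consistent with the figure of 4.36 quoted above, so rounding up to whole subjects gives five, which matches the number of subjects used in the tuning experiments.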
5.2 Analysis
5.2.1 Blockiness Index
Using the calculations suggested in Section 4.1.1, the Blockiness Index is presented in
Figure 5-8 for all six sets of images.
The initial portion of the plot increases fairly linearly. Referring to the image sets in
Appendix B (Figure B-1 to Figure B-6), observers will find that the blockiness of the
images with quantization factors 1 to 10 tends to increase fairly gradually. Also, the rate
of increase in blockiness of all six sets of images is very much the same.
Figure 5-8 Plot of Blockiness Index
As the quantization factor increases, the Blockiness Index of different image sets
starts to diverge. In Figure 5-8, the directions in which image sets 'Lens
Cover' and 'Sunglasses' diverge are quite different. The divergence can be understood by
looking at the characteristics of the two sets of images. Comparing the two sets,
observers will find that image set 'Sunglasses' is composed of a large area of
smooth background. In contrast, the image set 'Lens Cover' has a relatively more
complex background. As the quantization factor increases to levels beyond
20, the blockiness becomes less severe in the 'Sunglasses' image set (compared
with the 'Lens Cover' image set) due to the slow gradient change in the
background.
5.2.2 Similarity Index
The Similarity Index is measured using the method suggested in Section 4.1.2. Figure
5-9 shows a plot of the results for all the image sets. As shown in the plot, the Similarity
Index of each image set decreases gradually as the quantization factor increases. This
reduction in the Similarity Index indicates that the details in the image become less
recognizable, as more details are being truncated.
Figure 5-9 Plot of Similarity Index
In addition, Figure 5-9 shows that the Similarity Index of the 'Lens Cover' image set is
much lower than that of the other five sets of images. The image set 'Lens Cover' has a
much smaller Similarity Index because the majority of the edge details used to define the
index are the wood grain in the image's background. These edge details are easily
truncated even at a very low quantization factor. Comparing the original image and the
compressed image with a quantization factor of 1 in Figure B-4 of Appendix B, observers
will find that most of the fine wood grain in the compressed image is truncated. As a
result, a significant amount (by percentage of total) of edge details is lost. Therefore, the
Similarity Index is much lower in comparison to the other five sets.
5.2.3 POIQE Index
5.2.3.1 Modified Blockiness Index
With the Blockiness Index, the modified Blockiness Index can be calculated using
equation (4-9). The results are presented in Figure 5-10 as follows.
Figure 5-10 Plot of Modified Blockiness Index
5.2.3.2 Modified Similarity Index
Similarly, the modified Similarity Index is calculated according to equation (4-10)
using the Similarity Index. The results are presented in Figure 5-11.
Figure 5-11 Plot of Modified Similarity Index
5.2.3.3 Numerical Analysis for Model Parameter
Now that both the modified Blockiness Index and the modified Similarity Index are
obtained, the next step is to merge the two indexes in order to generate a POIQE
Index. The POIQE Index generated is expected to closely match the subjective quality
evaluation results obtained from the experiments (as recorded in Figure 5-6 or Table 5-3).
In order to match the POIQE Index to the subjective quality evaluation results as closely
as possible, the arbitrary parameter 'a' in equation (4-11) is determined by using the
Method of Least Squares⁵⁷. In equation (4-11), the calculation of the POIQE Index is
$P = 100 - a \times B_m \times S_m$
where a = An arbitrary parameter, P = POIQE Index,
$B_m$ = Modified Blockiness Index, and $S_m$ = Modified Similarity Index.
57 For a detailed explanation of the Method of Least Squares, please refer to Chapter 10 of Numerical Mathematics and Computing by Cheney and Kincaid [CK85].
Let k be the Subjective Quality Evaluation result that the POIQE Index tries to match.
Then, the mean square error Φ of the Least Squares Approximation is
where n = Total number of Subjective Evaluation data (i.e., 26 images per set x 6 image sets = 156 images).
According to the Method of Least Squares, Φ is minimized with respect to the parameter 'a'
if $\partial\Phi / \partial a = 0$. Using equation (5-5),
Then, re-arrange the above to determine 'a'.
Using the averaged Subjective Quality Evaluation ($\bar{k}$), modified Blockiness Index ($B_m$)
and modified Similarity Index ($S_m$), the numerator and denominator of equation (5-7)
are obtained as follows.
Therefore,
By substituting 'a' = 53.346 into equation (4-11), the POIQE Index generated from the
mathematical model proposed in this thesis is obtained as shown in Figure 5-12.
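The least-squares step can be sketched numerically. Assuming the mean square error has the usual form $\Phi = \frac{1}{n} \sum_i (100 - a B_{m,i} S_{m,i} - k_i)^2$ and ignoring the clipping of the POIQE Index, setting $\partial\Phi / \partial a = 0$ gives $a = \sum_i (100 - k_i) B_{m,i} S_{m,i} \,/\, \sum_i (B_{m,i} S_{m,i})^2$. The sketch below implements that closed form. The function name and the three data points are made up for illustration; they are not the experimental data, and the thesis's own equations (5-5) to (5-7) may be arranged differently.

```python
def fit_tuning_parameter(k, b_m, s_m):
    """Closed-form least-squares estimate of 'a' in P = 100 - a * B_m * S_m.

    k: subjective quality evaluations; b_m, s_m: modified indexes.
    Clipping of P to the range 5..100 is ignored in this sketch.
    """
    numerator = sum((100.0 - ki) * bi * si for ki, bi, si in zip(k, b_m, s_m))
    denominator = sum((bi * si) ** 2 for bi, si in zip(b_m, s_m))
    return numerator / denominator


# Synthetic data generated from a = 50, so the fit should recover 50.
b_vals = [0.5, 1.0, 1.5]
s_vals = [1.0, 1.0, 1.0]
k_vals = [100.0 - 50.0 * b * s for b, s in zip(b_vals, s_vals)]
a_fit = fit_tuning_parameter(k_vals, b_vals, s_vals)
```

Applied to the 156 averaged evaluations with their modified indexes, the same ratio of sums would be expected to yield the fitted value of 'a', although that has not been verified here.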
Figure 5-12 POIQE Index
Figure 5-13 to Figure 5-18 show the plots of the POIQE Index obtained for each image set
in comparison with the Subjective Quality Evaluation results.⁵⁸ In each plot, the data
representations are as follows.
1) The Subjective Quality Evaluation results obtained from the experiments are
represented by markers.
58 Please note that the plots in Figure 5-13 to Figure 5-18 are not smoothed because the model is designed so that it can also evaluate the quality of a standalone image, where the image quality ratings of the previous and following images are unknown. Also, for better data presentation, these figures are not plotted with respect to their compressed data size. As shown in Figure 5-4, the compressed data size levels off beyond a quantization level of 10. If the figures were plotted with respect to their compressed data size, the data with low compressed data size (or quantization level of 10 or higher) would all collapse together.
2) The thin line, indicated as "Poly (Human Evaluation)" in the legend, is the best-fit
polynomial⁵⁹ of the Subjective Quality Evaluation results generated by Microsoft
Excel using a sixth-order polynomial.
3) The POIQE Index generated from the mathematical model proposed in this thesis is
represented as a thick line in the plot.
59 The purpose of this best-fit polynomial is to smooth the experimental results with a continuous line, so that the reader can use this line to visually determine how closely the POIQE Index matches the experimental results.
Figure 5-15 POIQE Index for 'Girl & Apple'
Figure 5-16 POIQE Index for 'Lens Cover'
Figure 5-17 POIQE Index for 'Rose'
Figure 5-18 POIQE Index for 'Sunglasses'
5.2.4 Errors and Observations
5.2.4.1 Mean Square Error Φ and Root Mean Square Error
With Φ as shown in equation (5-9), the root mean square error (root MSE) $\sqrt{\Phi}$ of the results generated by the proposed mathematical model is approximately 10.6 units.
The POIQE Index for all six image sets provides a fairly close match to the Subjective
Quality Evaluation results, except for image set "Sunglasses" (Figure 5-18). Table 5-4
summarizes the MSE and root MSE of all image sets, all image sets (excluding
"Sunglasses") and just the "Sunglasses" image set. Excluding the image set
"Sunglasses", the POIQE Index has a root MSE of 7.4 units.
Table 5-4 Mean Square Error & Root Mean Square Error
Image sets                                                          MSE (units sq.)   root MSE (units)
All image sets                                                      111               10.6
All image sets (except "Sunglasses")                                55                7.4
Just the "Sunglasses" image set                                     394               19.8
All image sets of quantization factor 1-10 (except "Sunglasses")    37                6.1
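The error measures in Table 5-4 can be computed as in the following Python sketch, which assumes the standard definitions: the MSE is the average squared difference between the POIQE Index and the Subjective Quality Evaluation, and the root MSE is its square root. The function names and the two sample values are illustrative only.

```python
import math


def mse(predicted, observed):
    """Mean square error between model output and subjective ratings."""
    if len(predicted) != len(observed):
        raise ValueError("inputs must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def root_mse(predicted, observed):
    """Root mean square error, in the same units as the ratings."""
    return math.sqrt(mse(predicted, observed))


# Two hypothetical POIQE values compared with two hypothetical ratings.
example_mse = mse([10.0, 20.0], [12.0, 16.0])         # (4 + 16) / 2
example_rmse = root_mse([10.0, 20.0], [12.0, 16.0])
```

Because the root MSE is expressed in rating units, it can be read directly against the 0 to 100 scale of the evaluation table.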
5.2.4.2 Observations
Based on the above analysis, two observations were made.
1) The POIQE Index is most accurate when the data compression is at its
"effective range".
As indicated by Figure 5-4, the compressed data size of the images is most
significantly reduced at a quantization factor of 10 or below. In Section 5.1.3,
the range with quantization factor below 10 is defined as the "effective range".
Any further detail truncation with a higher quantization factor achieves no
significant data reduction, while the image quality
continues to decline. For industrial applications, there is no reason to further
compress the image if there is no significant reduction in data size. Therefore,
the evaluation of the performance of the proposed POIQE should be focused on
the "effective range".
In Figure 5-13 to Figure 5-17, the POIQE Index provides a very close match to
the Subjective Quality Evaluation at a quantization factor of 10 or below. The root
MSE of the images (except image set "Sunglasses") from quantization factor 1
to 10 is approximately 6.1 units as shown in Table 5-4. As the quantization
factor increases beyond 10, the POIQE Index begins to deviate from the
Subjective Quality Evaluation.
Combining the above, it is reasonable to say that the proposed POIQE Index
provides the most accurate match with the Subjective Quality Evaluation when
the data compression is at its "effective range".
2) The POIQE Index provides a good match to the Subjective Quality
Evaluation, except when the image contains only very simple contents.
The POIQE Index provides a reasonably close match to the Subjective Quality
Evaluation for image sets "Clifford", "Keys", "Girl & Apple", "Lens Cover" and
"Rose" (Figure 5-13 to Figure 5-17). However, the POIQE Index for image set
"Sunglasses" (Figure 5-18) is relatively higher than the Subjective Quality
Evaluation. In other words, the proposed POIQE rates the "Sunglasses" image set
at a higher quality rating than the subjective human evaluation. The reason for this is
explained as follows.
From the image set "Sunglasses", it is obvious that the image contains only very
simple details. The majority of the area in the image is occupied by a plain smooth
background, and the image contains details of very simple outlines and features.
Therefore, there is relatively less information for the viewer to use in recognizing the
details. As a result, a very small amount of information truncation will cause the
image to become unrecognizable. For the same reason, images with simple details
are more likely to be rated down by the human subjective evaluation, even when only a very
small percentage of details has been truncated.
5.3 Validation of Model
5.3.1 Purpose
The purpose of the validation experiment is to verify the capability of the proposed
mathematical model in matching the Subjective Quality Evaluation of image sets
other than the test image sets. Also, it allows further investigations of the two
observations obtained in Section 5.2.4.
5.3.2 Method and Procedure
The validation experiments are performed by four subjects. Using the same procedures
documented in Section 5.1.2, the subjects are requested to provide quality evaluations for
five sets of images as shown in Figure B-7 to Figure B-11. Note that the subjects and
image sets in the validation experiment are completely different from those in the model
tuning experiments.
5.3.3 Validation Result
5.3.3.1 Evaluation Data
Obtained from the validation experiments, the subjective quality evaluation is shown in
Table B-7 to Table B-11. Using the model, the calculated results for the Blockiness Index,
Similarity Index, Modified Blockiness Index, Modified Similarity Index and POIQE
Index are summarized in Table B-12 to Table B-16 respectively.
Figure 5-19 to Figure 5-23 show the plots of the POIQE Index obtained for each image
set in comparison with the Subjective Quality Evaluation results. In each plot, the data
representations are the same as indicated in Section 5.2.3.
Figure 5-19 POIQE Index for 'Bus'
Figure 5-20 POIQE Index for 'Cars'
Figure 5-21 POIQE Index for 'Computer'
Figure 5-22 POIQE Index for 'Table'
Figure 5-23 POIQE Index for 'Mouse'
5.3.3.2 Observations
The validation shows a good match between the POIQE Index and the Subjective
Quality Evaluation. Also, the validation results further support the findings
observed in Section 5.2.4.
Figure 5-19 to Figure 5-22 show that the POIQE Index is most accurate when the data
compression is within its "effective range". The POIQE Index provides a very close
match to the Subjective Quality Evaluation at a quantization factor of 10 or below.
Also, the results indicate that the POIQE Index provides a good match to the
Subjective Quality Evaluation, except for images containing only very simple
contents. Similar to image set "Sunglasses", image set "Mouse" contains only very
simple details. As a result, the Subjective Quality Evaluation of image set "Mouse"
was lower than the POIQE Index as shown in Figure 5-23.
Chapter 6 Conclusion
In this thesis, a mathematical model for objective image quality evaluation was proposed.
This model was designed in accordance with the hypothesis described in Chapter 1,
which is that the HVS evaluates image quality based on patterned artifacts and
recognizable details of the image. The model measures the image quality of DCT-based
compressed images using the psychovisually-based indexes of blockiness and similarity,
and combines these to yield the so-called POIQE index.
The model was calibrated by using the subjective assessment results obtained by the
model tuning experiments. Then, the performance of the model was verified by a series
of validation experiments. The validation results demonstrated that the POIQE Index is
most accurate within the "effective range" of compression for most images. However,
the model does not perform as well for images containing only very simple contents.
The validation results obtained indicate that the model has its limitations. Firstly, the
model is not very accurate when evaluating images with simple contents. This is mainly
because the human vision system is very complex. The use of only two fidelity criteria in
designing a model to simulate the quality assessment result of the human vision system
may not be sufficient.
Secondly, the model is not designed to account for users' subjective expectation or
definition of image quality. It is designed to measure how closely a test image matches
the original (with respect to HVS-perceived detail). Therefore, a test image that is less
noisy than the original is penalized by the POIQE (since it does not match), whereas
other definitions of quality might consider it better than the original.
Thirdly, for simplicity, a simple model was used to combine the modified Blockiness and
modified Similarity indexes. A model with additional tuning parameters might give better
accuracy. (For example, $P = aB_m + bS_m + cB_mS_m$, where a, b, and c are tuning
parameters.)
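As an illustration of how such tuning parameters could be obtained, the sketch below fits a weighted sum of the modified Blockiness index, the modified Similarity index and their product to subjective scores by ordinary least squares. This is a minimal sketch under stated assumptions, not the method used in the thesis: the function name `fit_abc`, the use of least squares, and the absence of a constant term are all choices made for this example.

```python
def fit_abc(blockiness, similarity, scores):
    """Least-squares fit of scores ~ a*B + b*S + c*B*S (hypothetical helper).

    Solves the 3x3 normal equations with Gaussian elimination and
    partial pivoting, so no third-party library is needed.
    """
    rows = [(b, s, b * s) for b, s in zip(blockiness, similarity)]
    # Normal equations: (X^T X) w = X^T y, stored as an augmented matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, scores))]
         for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("the three regressors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    w = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, 3))
        w[i] = (m[i][3] - tail) / m[i][i]
    return tuple(w)
```

In use, `blockiness` and `similarity` would be columns taken from the modified index tables and `scores` the matching average subjective scores; whether the extra parameter improves accuracy would still have to be verified on separate validation images.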
Finally, the human contrast sensitivity is highly dependent on the variation of spatial
frequency. Therefore, the size of the blocks within the pattern artifact could have a
significant effect on the assessment results. However, only one block size was used in
both the tuning and validation experiments.
In summary, the proposed model provides image quality evaluation according to how the
human vision system assesses an image. The model provides accurate evaluations of
JPEG compressed images that contain enough detail.
Further research is suggested to consider the introduction of an additional index to
measure the complexity of the image, so that the POIQE model can extend its image
evaluation capability to images with very simple contents. In addition, future work could
also investigate the effectiveness of using additional tuning parameters in the equation
that computes the POIQE index from the modified Blockiness and modified Similarity
indexes.
References
[AB85] E. H. Adelson and J. R. Bergen, "Spatiotemporal models and the
perception of motion", Journal of the Optical Society of America A, 2(2):
284-95, Feb 1985
[AC72] J. W. Allnatt and J. M. Corbett, "Adaptation in observers during television
quality-grading tests", Ergonomics, 15: 353-356, 1972
[Ade93] E. H. Adelson, "Perceptual organization and the judgment of brightness",
Science, 262: 2042-2044, 1993
[AG87] L. Arend and R. Goldstein, "Simultaneous constancy, lightness, and
brightness", Journal of the Optical Society of America A, 4(12): 2281-5,
1987
[Bar58] H. B. Barlow, "Temporal and spatial summation in human vision at
different background intensities", Journal of Physiology, 141: 337-50,
1958
[BFLSV97] L. Boch, S. Fragola, R. Lancini, P. Sunna and M. Visca, "Motion detection
on video compressed sequences as a tool to correlate objective measure
and subjective score", 13th International Conference on Digital Signal
Processing, DSP, v. 2, p. 1119-1122, Jul 2-4 1997
[BK98] C. J. van den Branden Lambrecht and M. Kunt, "Characterization of
human visual sensitivity for video imaging application", Signal Processing,
67: 255-269, 1998
C. J. van den Branden Lambrecht, "A working spatio-temporal model of
the human visual system for image restoration and quality assessment
applications", Proc. International Conference on Acoustics, Speech and
Signal Processing, Atlanta, GA, 7-10 May 1996
F. W. Campbell and R. W. Gubisch, "Optical Quality of the Human eye",
Journal of Physiology, 186: 558-578, 1966
Ward Cheney and David Kincaid, Numerical Mathematics and
Computing, Brooks/Cole Publishing Company, Belmont, California, 1985.
Tom N. Cornsweet, Visual Perception, Harcourt Brace Jovanovich, Inc.,
Orlando, Florida, 1970
Russell L. De Valois and Karen K. De Valois, Spatial Vision, Oxford
University Press, New York, 1988
H. De Lange Dzn, "Research into the Dynamic Nature of the Human
Fovea-Cortex Systems with intermittent and modulated light", Journal of
the Optical Society of America, 48(11): 777-84, 1958
Ralph Merrill Evans, An introduction to color, New York: Wiley, 1948.
G. L. Fredendall and W. L. Behrend, "Picture Quality - Procedures for
Evaluating Subjective Effects of Interference", Proc. IRE, Vol. 48, pp.
988-998, 1960
P. N. Gardiner, M. Ghanbari, D. E. Pearson, K. T. Tan, "Development of a
perceptual distortion meter for digital video", IEE Conference Publication
Proceedings of the 1997 International Broadcasting Convention, n447,
v. 1, pp. 493-497, Amsterdam, 1997
B. Girod, "Eye movements and coding of video sequences", SPIE Visual
Communications and Image Processing, volume 1001, pages 398-405,
1988
Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing,
Addison-Wesley Publishing Company, 1992.
[HPN97] Barry G. Haskell, Atul Puri and Arun N. Netravali, Digital video: An
introduction to MPEG-2, Chapman & Hall, New York, NY, 1997.
Y. Horita, M. Katayama, T. Murai, M. Miyahara, "Objective picture
quality scale for video coding", International Conference on Information
Processing ICIP-96, Lausanne, Switzerland, Vol. 3, pp. 319-322, 16-19
September 1996
D. J. Heeger and P. C. Teo, "A model of perceptual image fidelity", Proc.
International Conf. on Image Processing, Washington, DC, pp. 343-345,
23-26 October 1995
D. H. Kelly, "Visual responses to time-dependent stimuli. I. amplitude
sensitivity measurements", Journal of the Optical Society of America, 51:
422-9, 1961
D. H. Kelly, "Flicker Fusion and Harmonic Analysis", Journal of the
Optical Society of America, 51(8): 917-8, 1961
J. Lubin, "A visual discrimination model for image system design and
evaluation", E. Peli (Ed.), Visual Models for Target Detection and
Recognition, World Scientific publishers, Singapore, 1995
J. Lubin, "Human Vision System Model For Objective Picture Quality
Measurements", IEE Conference Publication Proceedings of the 1997
International Broadcasting Convention, n447, p. 498-503, Amsterdam,
1997
Makoto Miyahara, Kazunori Kotani and V. Ralph Algazi, "Objective
picture quality scale (PQS) for image coding", IEEE Transactions on
Communications, v. 46, no. 9, p. 1215-26, Sept. 1998
Joan L. Mitchell, William B. Pennebaker, Chad E. Fogg and Didier J.
LeGall, MPEG Video Compression Standard, Chapman & Hall, New
York, NY, 1996.
K. T. Mullen, "The contrast sensitivity of human color vision to Red-Green
and Blue-Yellow chromatic gratings", Journal of Physiology, 359:
381-400, 1985
F. L. Van Ness and M. A. Bouman, "Spatial Modulation Transfer in
Human Eye", Journal of the Optical Society of America, 57(3): 401-6,
Mar. 1967
M. R. M. Nijenhuis and F. J. J. Blommaert, "Perceptual-error measure and
its application to sampled and interpolated single-edged images", Journal
of the Optical Society of America A, 14(9): 2111-27, Sept. 1997
[NH95] A. N. Netravali and B. G. Haskell, Digital Pictures: Representation,
Compression and Standards, 2nd Ed., Plenum Press, New York, 1995,
p. 110-116
3. Okamoto, S. Hangai, K. Miyauchi, "A study on subjective and objective
evaluation method for coded moving picture quality", Picture Coding
Symposium PCS'96, pp. 519-523, Melbourne, 13-15 March 1996
W. G. Owen, "Spatial-temporal integration in the human peripheral
retina", Vision Research, 12(1): 1011-26, 1972
Maurice Henri Leonard Pirenne, Vision and the eye (2nd ed.), 1967
J. G. Robson, "Spatial and temporal contrast sensitivity functions of the
visual system", Journal of the Optical Society of America, 56: 1141-2,
1966
B. E. Rogowitz, "The Human Visual System: A Guide for the Display
Technologist", Proc. SID, 24(3): 235-52, 1983
[Rog92] Bernice E. Rogowitz, "Displays: the human factor", Byte, v. 17, p. 195-9,
July '92
[Sch56] O. H. Schade, "Optical and Photoelectric Analog of the Eye", Journal of
the Optical Society of America, 46(9): 721-39, 1956
[Ste58] Kate Trauman Steinitz (editor), Leonardo da Vinci's Trattato della pittura
(Treatise on painting), 1958
[SZY87] Francis W. Sears, Mark W. Zemansky and Hugh D. Young, University
Physics, 7th ed., Addison-Wesley Publishing Company, 1987.
[TGP97] K. T. Tan, M. Ghanbari, D. E. Pearson, "A video distortion meter", PCS
'97, pp. 119-122, Berlin, Germany, 10-12 September 1997
[TGP98] K. Tan, M. Ghanbari, D. E. Pearson, "An objective measurement tool for
MPEG video quality", Signal Processing, Vol. 70, No. 3, p. 279-94, Nov. '98
[TH94] P. C. Teo and D. J. Heeger, "Perceptual image distortion", Proc.
International Conf. on Image Processing, pp. 982-6, Austin, TX, 13-16
November 1994
A. Vassilev, "Contrast sensitivity near borders: Significance of test
stimulus form, size, and duration", Vision Research, 13(4): 719-30, April
1973
Brian A. Wandell, Foundations of Vision, Sinauer Associates, Inc.,
Sunderland, Massachusetts, 1995
Andrew B. Watson and Albert J. Ahumada, Jr., "Model of human
visual-motion sensing", Journal of the Optical Society of America A, 2(2):
322-341, Feb. 1985
Ronald E. Walpole and Raymond H. Myers, Probability and Statistics for
Engineers and Scientists, 4th edition, New York, 1989
[WS91] Nicholas J. Wade and Michael Swanston, Visual Perception: An
Introduction, Routledge, London, 1991
Appendix A Lossless Image Compression Statistics
The following are some statistics on lossless compression provided by the company BitJazz Inc. The company provided a series of comparisons (see footnote 60) between its lossless image-compression technique, BitJazz (symbol: JZZ), and other conventional techniques, such as BMP, PCX, etc.
The legend below provides a list of acronyms used in the tables. Table A-1 shows a comparison of features between each of these compression techniques. Please note that all the compared techniques are lossless, except JPG.
Legend:
EPS Photoshop Encapsulated PostScript. These files were made with binary encoding.
SCT Scitex CT files.
PXR Pixar Image Computer files.
PS2 Photoshop 2 files.
BMP MS-DOS Bitmap files.
TGA Truevision Targa files.
RAW These files have a 0-byte header.
PCX ZSoft PC Exchange files.
PSD Native Photoshop 5 files.
PCT Macintosh PICT files.
IFF Amiga Interchange File Format.
PDF Photoshop Portable Document Format. These files were made with ZIP (a form of LZW) compression.
TIF Tagged Image File Format. These files were made with LZW (Lempel/Ziv/Welch) compression.
PNG Portable Network Graphics. These files were made with no interlace and with an adaptive filter. The PNG file size randomly varies within a couple hundred bytes.
LWF LuraWave, available from LuRaTech. These files were made with baseline lossless compression and without a key.
JZZ PhotoJazz files.
Footnote 60: The comparison in this section is obtained from the web sites http://www.bitjazz.com/analysis.html and http://www.bitjazz.com/statistics.html.
Table A-1 Feature Comparison of Various Image Format
Table A-2 and Table A-3 show the compression ratios resulting from using different compression techniques on a set of 24 photo-quality images donated to research by Kodak. Each of the 768x512 images is represented in RGB color map (see footnote 61). Please note that the compression ratio of lossless compression can seldom exceed 2.6:1.
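The uncompressed size of one test image follows directly from its dimensions. The short Python sketch below, added for illustration, works through the arithmetic for a 768x512 image with 3 x 8 bits per RGB pixel and shows how a compression ratio translates into a compressed size. The function names are hypothetical.

```python
WIDTH, HEIGHT = 768, 512
BYTES_PER_PIXEL = 3  # 3 x 8 bits: one byte each for R, G and B


def uncompressed_bytes(width=WIDTH, height=HEIGHT, bytes_per_pixel=BYTES_PER_PIXEL):
    """Raw storage needed for one image, in bytes."""
    return width * height * bytes_per_pixel


def compressed_bytes(ratio, raw_bytes=None):
    """Size after compression at the given ratio (raw size divided by ratio)."""
    raw = uncompressed_bytes() if raw_bytes is None else raw_bytes
    return raw / ratio


raw = uncompressed_bytes()
print(f"uncompressed: {raw} bytes (about {raw / 1e6:.2f} million bytes)")
print(f"at 2.6:1:     {compressed_bytes(2.6):.0f} bytes")
```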
Footnote 61: The uncompressed data size is approximately 1.18 Mbytes, assuming each RGB pixel requires 3 x 8 bits of storage space.
[Table A-1: the feature matrix could not be recovered from the scan. Column headings (formats): PSD, PS2, BMP, DCS, EPS, CTF, IFF, JPG, JZZ, PCT, PCX, PDF, PNG, PXR, RAW, SCT, TGA, TIF. Legible row labels: RGB, CMYK, Grayscale, Lab, Multichannel, Compression, Non-image Data, CRC, Layers. Legible entries in the Compression row: RLE, LZW, JZZ, LZ77.]
Table A-2 Compression Ratio of Various Image Format
[Table A-2: the table body could not be recovered from the scan. Surviving column headings: PS2, BMP, TGA, LWF, size, ratio.]
Table A-3 Summary of Compression Ratio
(Chart scale in the original: worst, worse, uncompressed, better, best)
Format              Compression ratio
JZZ                 2.47
EPS                 3.748
IFF                 1.21
SCT                 3.998
PXR                 3.999
RAW                 1.00
(label illegible)   1.50
PS2                 1.00
PCX                 1.02
TIF                 1.58
BMP                 1.00
PNG                 1.75
PSD                 1.03
TGA                 1.00
PCT                 1.03
Appendix B Tuning and Validation Experiments' Image Sets
[Figures B-1 to B-11: image sets used in the tuning and validation experiments. Each figure shows the original image and its compressed versions, each labelled with its quantization factor (1 to 20, then 22.5, 25.0, 27.5, 30.0, 32.5 and 35.0). The images are not reproduced here. Captions that survive in the scan:]
Figure B-3 'Girl & Apple' Images
Figure B-4 'Lens Cover' Images
Figure B-5 'Rose' Images
Figure B-8 'Cars' Images
Figure B-11 'Mouse' Images
Table B-1 Table of Subjective Quality Evaluation for 'Clifford'
Evaluators (evaluation date): Ruby (18-Aug-99), Ryan (19-Aug-99), Alice (23-Aug-99), Mark (07-Sep-99), Nandi (07-Sep-99)
Quantization factor  Ruby  Ryan  Alice  Mark  Nandi  Average  Standard Deviation
1                    98    95    99     94    90     95.20    3.56
2                    94    86    95     97    90     92.40    4.39
3                    93    84    90     87    90     88.80    3.42
4                    88    82    85     82    86     84.60    2.61
5                    83    75    89     78    78     80.60    5.50
6                    68    56    79     77    65     69.00    9.35
7                    75    52    75     75    63     68.00    10.34
8                    68    48    80     68    50     62.80    13.54
9                    60    46    58     72    52     57.60    9.74
10                   53    34    39     65    40     46.20    12.64
11                   57    36    60     42    39     46.80    10.94
12                   53    36    54     55    33     46.20    10.76
13                   50    32    50     62    35     45.80    12.30
14                   42    25    44     58    34     40.60    12.28
15                   40    25    45     40    15     33.00    12.55
16                   40    25    29     55    15     32.80    15.30
17                   45    15    33     35    15     28.60    13.22
18                   41    15    31     32    15     26.80    11.45
19                   45    15    29     25    15     25.80    12.38
20                   40    25    25     25    15     26.00    8.94
22.5                 38    15    20     25    15     22.60    9.56
25                   29    10    15     15    15     16.80    7.16
27.5                 28    10    15     15    15     16.60    6.73
30                   20    10    10     5     15     12.00    5.70
32.5                 20    10    10     15    15     14.00    4.18
35                   20    10    5      5     15     11.00    6.52
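The Average and Standard Deviation columns of the subjective evaluation tables appear to be the arithmetic mean and the sample standard deviation (n - 1 divisor) of the evaluators' scores. This is inferred from the tabulated numbers rather than stated in the text. The Python sketch below, added for illustration, reproduces the 'Clifford' entries at a quantization factor of 1; the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Scores for 'Clifford' at quantization factor 1 (Table B-1):
# Ruby, Ryan, Alice, Mark, Nandi.
scores_qf1 = [98, 95, 99, 94, 90]


def summarize(scores):
    """Return (average, sample standard deviation), rounded to two decimals."""
    return round(mean(scores), 2), round(stdev(scores), 2)


average, deviation = summarize(scores_qf1)
print(average, deviation)  # table entries: 95.20 and 3.56
```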
Table B-2 Table of Subjective Quality Evaluation for 'Keys'
Evaluators (evaluation date): Ruby (18-Aug-99), Ryan (19-Aug-99), Alice (23-Aug-99), Mark (07-Sep-99), Nandi (07-Sep-99)
Quantization factor  Ruby  Ryan  Alice  Mark  Nandi  Average  Standard Deviation
1                    98    95    99     95    86     94.60    5.13
2                    93    92    90     91    85     90.20    3.11
3                    88    82    95     93    53     82.20    17.08
4                    80    75    82     86    83     81.20    4.09
5                    75    72    80     83    53     72.60    11.76
6                    56    56    75     83    35     61.00    18.75
7                    51    52    49     38    35     45.00    7.91
8                    46    42    45     37    35     41.00    4.85
9                    40    46    70     72    30     51.60    18.62
10                   30    25    60     65    15     39.00    22.19
11                   30    25    55     45    15     34.00    15.97
12                   30    25    50     35    27     33.40    10.01
13                   10    15    40     32    15     22.40    12.90
14                   10    15    44     33    29     26.20    13.77
15                   38    5     30     25    15     22.60    12.90
16                   10    10    34     25    5      16.80    12.19
17                   28    10    33     25    15     22.20    9.47
18                   10    10    15     25    15     15.00    6.12
19                   10    10    15     25    15     15.00    6.12
20                   10    5     10     5     5      7.00     2.74
22.5                 0     5     10     15    5      7.00     5.70
25                   10    5     5      15    5      8.00     4.47
27.5                 0     5     5      15    5      6.00     5.48
30                   0     5     5      15    5      6.00     5.48
32.5                 0     5     5      15    5      6.00     5.48
35                   0     5     5      15    5      6.00     5.48
Table B-3 Table of Subjective Quality Evaluation for 'Girl & Apple'
Evaluators (evaluation date): Ruby (18-Aug-99), Ryan (19-Aug-99), Alice (23-Aug-99), Mark (07-Sep-99), Nandi (07-Sep-99)
Quantization factor  Ruby  Ryan  Alice  Mark  Nandi  Average  Standard Deviation
1                    97    95    99     97    87     95.00    4.69
2                    95    90    95     94    45     83.80    21.79
3                    93    80    90     86    76     85.00    7.00
4                    90    78    85     84    76     82.60    5.64
5                    89    75    80     77    76     79.40    5.68
6                    85    72    81     73    61     74.40    9.26
7                    87    55    75     70    50     67.40    15.04
8                    60    46    79     65    43     58.60    14.67
9                    66    42    70     67    42     57.40    14.14
10                   68    40    71     55    36     54.00    15.86
11                   70    34    71     65    27     53.40    21.17
12                   55    40    60     62    27     48.80    14.92
13                   56    34    57     46    26     43.80    13.61
14                   58    25    55     42    26     41.20    15.55
15                   51    15    54     35    25     36.00    16.67
16                   46    15    50     35    25     34.20    14.52
17                   42    15    44     32    25     31.60    12.05
18                   40    15    45     36    25     32.20    12.11
19                   30    10    40     25    15     24.00    11.94
20                   45    10    39     25    25     28.80    13.68
22.5                 20    10    35     25    25     23.00    9.08
25                   10    5     30     15    15     15.00    9.35
27.5                 20    5     25     15    15     16.00    7.42
30                   10    5     10     5     5      7.00     2.74
32.5                 10    5     25     15    5      12.00    8.37
35                   10    5     15     5     5      8.00     4.47
Table B-4 Table of Subjective Quality Evaluation for 'Lens Cover'
Evaluators (evaluation date): Ruby (18-Aug-99), Ryan (19-Aug-99), Alice (23-Aug-99), Mark (07-Sep-99), Nandi (07-Sep-99)
Quantization factor  Ruby  Ryan  Alice  Mark  Nandi  Average  Standard Deviation
1                    98    87    99     88    80     90.40    8.02
2                    90    84    95     85    79     86.60    6.11
3                    86    75    90     81    60     78.40    11.72
4                    75    50    85     75    49     66.80    16.32
5                    67    46    80     74    48     63.00    15.33
6                    68    30    75     65    25     52.60    23.27
7                    66    30    60     55    27     47.60    17.90
8                    46    25    70     55    15     42.20    22.29
9                    63    30    50     45    26     42.80    15.09
10                   45    25    55     34    5      32.80    19.21
11                   43    25    49     38    15     34.00    13.82
12                   43    15    45     40    15     31.60    15.26
13                   43    15    40     35    15     29.60    13.63
14                   10    15    40     30    5      20.00    14.58
15                   10    15    35     25    5      18.00    12.04
16                   0     10    30     25    5      14.00    12.94
17                   10    10    35     25    5      17.00    12.55
18                   0     10    35     15    5      13.00    13.51
19                   10    10    35     15    5      15.00    11.73
20                   10    10    20     15    5      12.00    5.70
22.5                 10    15    15     15    5      12.00    4.47
25                   0     5     5      5     5      4.00     2.24
27.5                 0     5     5      5     5      4.00     2.24
30                   0     5     10     5     5      5.00     3.54
32.5                 0     5     10     5     5      5.00     3.54
35                   10    5     5      5     15     8.00     4.47
Table B-5 Table of Subjective Quality Evaluation for 'Rose'
Evaluators (evaluation date): Ruby (18-Aug-99), Ryan (19-Aug-99), Alice (23-Aug-99), Mark (07-Sep-99), Nandi (07-Sep-99)
Quantization factor  Ruby  Ryan  Alice  Mark  Nandi  Average  Standard Deviation
1                    98    82    99     95    80     90.80    9.09
2                    96    82    95     95    77     89.00    8.86
3                    95    75    90     94    77     86.20    9.52
4                    86    79    85     94    76     84.00    6.96
5                    83    52    83     93    44     71.00    21.58
6                    80    40    75     85    42     64.40    21.66
7                    59    42    70     85    43     59.80    18.29
8                    68    40    57     74    36     55.00    16.73
9                    65    25    60     75    35     52.00    21.10
10                   50    25    50     68    35     45.60    16.41
11                   55    10    45     73    15     39.60    26.75
12                   50    25    29     45    15     32.80    14.46
13                   45    25    43     62    15     38.00    18.36
14                   45    10    35     55    34     35.80    16.75
15                   31    15    30     42    15     26.60    11.59
16                   31    15    15     35    15     22.20    9.96
17                   30    15    25     41    15     25.20    10.96
18                   20    15    15     25    15     18.00    4.47
19                   20    10    15     25    5      15.00    7.91
20                   10    5     15     25    5      12.00    8.37
22.5                 10    5     10     15    5      9.00     4.18
25                   10    5     10     15    5      9.00     4.18
27.5                 0     5     5      5     5      4.00     2.24
30                   0     5     5      5     5      4.00     2.24
32.5                 0     5     5      5     5      4.00     2.24
35                   0     5     5      5     5      4.00     2.24
Table B-6 Table of Subjective Quality Evaluation for 'Sunglasses'
Evaluators (evaluation date): Ruby (18-Aug-99), Ryan (19-Aug-99), Alice (23-Aug-99), Mark (07-Sep-99), Nandi (07-Sep-99)
Quantization factor  Ruby  Ryan  Alice  Mark  Nandi  Average  Standard Deviation
1                    96    95    99     93    90     94.60    3.36
2                    93    78    95     87    55     81.60    16.27
3                    86    75    90     85    40     75.20    20.44
4                    67    58    80     85    39     65.80    18.38
5                    67    36    75     83    38     59.80    21.58
6                    57    34    61     72    29     50.60    18.37
7                    57    34    50     65    30     47.20    14.92
8                    40    25    55     70    30     44.00    18.51
9                    40    25    45     62    5      35.40    21.52
10                   45    25    44     62    15     38.20    18.43
11                   30    15    38     58    5      29.20    20.58
12                   45    15    43     64    5      34.40    24.00
13                   45    15    40     55    5      32.00    21.10
14                   28    10    33     45    5      24.20    16.54
15                   10    15    35     52    5      23.40    19.63
16                   10    15    25     36    5      18.20    12.40
17                   45    5     29     35    5      23.80    18.09
18                   30    5     10     25    5      15.00    11.73
19                   0     10    30     42    5      17.40    17.85
20                   0     10    5      15    5      7.00     5.70
22.5                 10    5     10     25    5      11.00    8.22
25                   10    5     15     30    5      13.00    10.37
27.5                 0     5     10     15    5      7.00     5.70
30                   0     5     10     25    5      9.00     9.62
32.5                 0     5     5      5     5      4.00     2.24
35                   0     5     5      5     5      4.00     2.24
Table B-7 Table of Subjective Quality Evaluation for 'Bus'
Evaluators (evaluation date): Ivy (2-Dec-99), Ming (4-Dec-99), Vicky (29-Jan-00), Barb (8-Feb-00)
Quantization factor  Ivy   Ming  Vicky  Barb  Average  Standard Deviation
1                    89    98    99     95    95.25    4.50
2                    95    95    98     85    93.25    5.68
3                    85    75    97     82    84.75    9.18
4                    84    85    88     76    83.25    5.12
5                    84    83    86     70    80.75    7.27
6                    73    65    73     49    65.00    11.31
7                    74    63    63     49    62.25    10.24
8                    74    55    71     40    60.00    15.73
9                    66    55    72     38    57.75    14.93
10                   41    47    25     40    38.25    9.36
11                   52    43    62     38    48.75    10.56
12                   41    45    51     27    41.00    10.20
13                   41    37    60     38    44.00    10.80
14                   41    35    50     22    37.00    11.75
15                   41    33    25     27    31.50    7.19
16                   41    25    25     22    28.25    8.62
17                   41    25    15     22    25.75    11.00
18                   41    25    15     22    25.75    11.00
19                   41    25    25     22    28.25    8.62
20                   25    15    25     7     18.00    8.72
22.5                 5     15    15     7     10.50    5.26
25                   5     5     15     7     8.00     4.76
27.5                 5     5     5      7     5.50     1.00
30                   5     5     5      3     4.50     1.00
32.5                 5     5     5      3     4.50     1.00
35                   5     5     5      3     4.50     1.00
Table B-8 Table of Subjective Quality Evaluation for 'Cars'
Evaluators (evaluation date): Ivy (2-Dec-99), Ming (4-Dec-99), Vicky (29-Jan-00), Barb (8-Feb-00)
Quantization factor  Ivy   Ming  Vicky  Barb  Average  Standard Deviation
1                    88    98    96     95    94.25    4.35
2                    71    88    94     86    84.75    9.78
3                    87    85    86     76    83.50    5.07
4                    71    78    85     82    79.00    6.06
5                    72    75    84     70    75.25    6.18
6                    43    65    81     70    64.75    15.97
7                    44    55    44     49    48.00    5.23
8                    35    48    42     40    41.25    5.38
9                    35    45    36     40    39.00    4.55
10                   15    43    35     29    30.50    11.82
11                   5     35    31     29    25.00    13.56
12                   5     38    34     29    26.50    14.80
13                   5     35    40     22    25.50    15.63
14                   5     25    33     22    21.25    11.79
15                   5     25    25     22    19.25    9.60
16                   5     33    32     25    23.75    13.00
17                   5     25    25     25    20.00    10.00
18                   5     25    25     15    17.50    9.57
19                   5     25    25     22    19.25    9.60
20                   5     15    15     7     10.50    5.26
22.5                 5     15    15     7     10.50    5.26
25                   5     15    15     7     10.50    5.26
27.5                 5     15    5      3     7.00     5.42
30                   5     5     5      3     4.50     1.00
32.5                 5     5     5      3     4.50     1.00
35                   5     5     5      3     4.50     1.00
Table B-9 Table of Subjective Quality Evaluation for 'Computer'
Evaluators (evaluation date): Ivy (2-Dec-99), Ming (4-Dec-99), Vicky (29-Jan-00), Barb (8-Feb-00)
Quantization factor  Ivy   Ming  Vicky  Barb  Average  Standard Deviation
1                    96    98    96     95    96.25    1.26
2                    99    95    95     95    96.00    2.00
3                    83    88    93     76    85.00    7.26
4                    93    85    94     76    87.00    8.37
5                    83    78    87     85    83.25    3.86
6                    81    75    75     71    75.50    4.12
7                    76    65    82     66    72.25    8.18
8                    65    53    84     62    66.00    13.04
9                    56    55    88     58    64.25    15.88
10                   62    48    76     54    60.00    12.11
11                   59    63    79     56    64.25    10.24
12                   56    45    74     52    56.75    12.37
13                   57    38    78     56    57.25    16.36
14                   62    45    39     42    47.00    10.30
15                   60    38    56     45    49.75    10.08
16                   31    45    77     42    48.75    19.77
17                   31    33    65     45    43.50    15.61
18                   25    35    67     42    42.25    17.91
19                   41    35    52     42    42.50    7.05
20                   31    25    55     42    38.25    13.20
22.5                 39    33    54     42    42.00    8.83
25                   36    25    53     30    36.00    12.19
27.5                 31    15    38     32    29.00    9.83
30                   25    15    15     32    21.75    8.30
32.5                 25    15    15     27    20.50    6.40
35                   25    15    15     27    20.50    6.40
Table B-10 Table of Subjective Quality Evaluation for 'Table'
Evaluators (evaluation date): Ivy (2-Dec-99), Ming (4-Dec-99), Vicky (29-Jan-00), Barb (8-Feb-00)
Quantization factor  Ivy   Ming  Vicky  Barb  Average  Standard Deviation
1                    99    98    93     95    96.25    2.75
2                    94    85    84     86    87.25    4.57
3                    92    88    92     83    88.75    4.27
4                    79    78    83     74    78.50    3.70
5                    78    75    82     73    77.00    3.92
6                    91    65    90     74    80.00    12.68
7                    75    55    81     49    65.00    15.41
8                    75    55    91     55    69.00    17.44
9                    70    43    80     45    59.50    18.38
10                   68    45    64     37    53.50    14.89
11                   60    43    66     40    52.25    12.71
12                   46    48    45     33    43.00    6.78
13                   60    35    64     37    49.00    15.12
14                   67    33    46     32    44.50    16.30
15                   60    38    25     27    37.50    16.05
16                   45    35    25     22    31.75    10.44
17                   60    25    25     27    34.25    17.19
18                   35    35    25     22    29.25    6.75
19                   41    25    5      15    21.50    15.35
20                   37    25    5      15    20.50    13.70
22.5                 36    25    40     15    29.00    11.28
25                   25    15    42     15    24.25    12.74
27.5                 25    15    5      15    15.00    8.16
30                   5     15    5      7     8.00     4.76
32.5                 5     5     5      3     4.50     1.00
35                   5     5     5      3     4.50     1.00
Table B-11 Table of Subjective Quality Evaluation for 'Mouse'
Evaluators (evaluation date): Ivy (2-Dec-99), Ming (4-Dec-99), Vicky (29-Jan-00), Barb (8-Feb-00)
Quantization factor  Ivy   Ming  Vicky  Barb  Average  Standard Deviation
1                    97    98    99     95    97.25    1.71
2                    87    88    98     86    89.75    5.56
3                    77    85    97     83    85.50    8.39
4                    77    83    89     76    81.25    6.02
5                    69    75    85     72    75.25    6.95
6                    64    65    88     68    71.25    11.30
7                    63    58    87     60    67.00    13.49
8                    48    63    69     60    60.00    8.83
9                    47    45    68     45    51.25    11.21
10                   38    33    78     45    48.50    20.27
11                   25    35    57     45    40.50    13.70
12                   32    25    79     45    45.25    23.98
13                   25    15    77     29    36.50    27.63
14                   15    15    49     34    28.25    16.48
15                   34    25    76     37    43.00    22.58
16                   25    15    55     29    31.00    17.05
17                   25    15    63     29    33.00    20.85
18                   5     15    46     29    23.75    17.80
19                   5     15    67     32    29.75    27.22
20                   25    15    25     25    22.50    5.00
22.5                 5     15    25     25    17.50    9.57
25                   5     5     15     22    11.75    8.30
27.5                 25    5     15     22    16.75    8.88
30                   5     5     15     22    11.75    8.30
32.5                 5     5     15     3     7.00     5.42
35                   5     5     5      3     4.50     1.00
Table B-12 Table of Blockiness Index for the Validation Experiment
Quantization factor  Bus    Cars   Computer  Table  Mouse
1                    0.37   0.39   0.28      0.36   0.32
2                    0.66   0.80   0.66      0.79   0.64
3                    1.16   1.15   0.99      1.09   0.96
4                    1.54   1.56   1.36      1.66   1.29
5                    1.91   2.05   1.76      1.94   1.52
6                    2.29   2.39   2.13      2.51   1.90
7                    2.11   2.85   2.43      2.92   2.16
8                    3.27   3.47   2.89      3.23   2.49
9                    3.27   3.71   2.70      3.06   2.93
10                   3.62   4.17   3.82      4.04   3.19
11                   3.97   4.58   3.80      4.70   3.71
12                   3.73   4.77   4.54      5.17   4.06
13                   4.80   4.58   5.21      6.10   4.75
14                   5.39   5.52   6.37      6.16   4.86
15                   5.09   6.37   6.02      5.73   4.83
16                   5.46   6.97   7.94      5.44   6.06
17                   6.43   7.36   7.49      5.69   6.61
18                   6.74   7.90   7.16      6.43   6.02
19                   8.14   8.51   6.09      7.48   6.13
20                   6.56   9.09   5.31      8.07   6.95
22.5                 9.69   10.96  4.27      11.12  9.40
25                   7.70   11.94  5.00      12.73  10.70
27.5                 7.92   12.27  6.65      14.18  9.10
30                   8.15   12.77  7.90      16.14  10.37
32.5                 8.85   12.49  6.01      17.95  10.52
35                   9.99   15.04  5.98      21.48  13.05
Table B-13 Table of Similarity Index for the Validation Experiment
Quantization factor  Bus    Cars   Computer  Table  Mouse
1                    84%    71%    82%       31%    74%
2                    80%    65%    76%       21%    69%
3                    79%    61%    73%       19%    65%
4                    77%    60%    70%       17%    63%
5                    76%    58%    70%       16%    58%
6                    74%    57%    65%       18%    59%
7                    73%    56%    62%       19%    58%
8                    72%    54%    62%       19%    56%
9                    72%    53%    63%       19%    55%
10                   70%    51%    63%       19%    53%
11                   69%    47%    59%       16%    52%
12                   68%    44%    53%       14%    50%
13                   67%    43%    46%       12%    47%
14                   67%    41%    45%       11%    45%
15                   66%    41%    45%       11%    47%
16                   65%    42%    54%       11%    46%
17                   66%    42%    53%       11%    50%
18                   64%    40%    49%       12%    48%
19                   63%    40%    47%       13%    47%
20                   63%    40%    50%       13%    48%
22.5                 59%    39%    46%       14%    45%
25                   53%    35%    44%       13%    40%
27.5                 50%    30%    40%       14%    38%
30                   46%    27%    36%       13%    37%
32.5                 44%    24%    29%       13%    33%
35                   39%    22%    28%       11%    32%
Table B-14 Table of Modified Blockiness Index for the Validation Experiment
Quantization factor  Bus    Cars   Computer  Table  Mouse
1                    0      0      0         0      0
2                    0.06   0.11   0.06      0.11   0.06
3                    0.21   0.22   0.16      0.20   0.16
4                    0.30   0.31   0.24      0.33   0.25
5                    0.37   0.40   0.32      0.38   0.30
6                    0.43   0.46   0.40      0.48   0.38
7                    0.39   0.52   0.40      0.53   0.42
8                    0.56   0.60   0.43      0.54   0.46
9                    0.53   0.61   0.43      0.54   0.53
10                   0.57   0.66   0.53      0.65   0.56
11                   0.62   0.69   0.57      0.70   0.58
12                   0.58   0.67   0.60      0.70   0.64
13                   0.70   0.68   0.66      0.66   0.70
14                   0.70   0.72   0.71      0.74   0.70
15                   0.72   0.78   0.73      0.73   0.71
16                   0.73   0.84   0.78      0.70   0.79
17                   0.77   0.86   0.79      0.76   0.82
18                   0.83   0.90   0.82      0.83   0.76
19                   0.92   0.93   0.76      0.89   0.78
20                   0.81   0.96   0.70      0.93   0.84
22.5                 0.97   1.05   0.61      1.06   0.96
25                   0.87   1.08   0.73      1.11   0.94
27.5                 0.88   1.06   0.85      1.16   0.96
30                   0.91   1.09   0.92      1.20   1.00
32.5                 0.95   1.06   0.73      1.25   1.01
35                   1.00   1.02   0.68      1.25   1.03
Table B-15 Table of Modified Similarity Index for the Validation Experiment
Quantization factor  Bus    Cars   Computer  Table  Mouse
1                    1.45   2.14   1.58      3.38   2.00
2                    1.69   2.38   1.90      3.60   2.22
3                    1.75   2.53   2.05      3.64   2.38
4                    1.85   2.57   2.18      3.68   2.46
5                    1.90   2.63   2.18      3.70   2.63
6                    2.00   2.67   2.38      3.66   2.60
7                    2.05   2.70   2.50      3.64   2.63
8                    2.09   2.76   2.50      3.64   2.70
9                    2.09   2.80   2.46      3.64   2.73
10                   2.18   2.86   2.46      3.64   2.80
11                   2.22   2.97   2.60      3.70   2.83
12                   2.27   3.06   2.80      3.74   2.89
13                   2.31   3.08   3.00      3.78   2.97
14                   2.31   3.14   3.03      3.80   3.03
15                   2.35   3.14   3.03      3.80   2.97
16                   2.38   3.11   2.76      3.80   3.00
17                   2.35   3.11   2.80      3.80   2.89
18                   2.42   3.16   2.92      3.78   2.94
19                   2.46   3.16   2.97      3.76   2.97
20                   2.46   3.16   2.89      3.76   2.94
22.5                 2.60   3.19   3.00      3.74   3.03
25                   2.80   3.29   3.06      3.76   3.16
27.5                 2.89   3.40   3.16      3.74   3.21
30                   3.00   3.47   3.26      3.76   3.24
32.5                 3.06   3.54   3.43      3.76   3.33
35                   3.19   3.58   3.45      3.80   3.36
Table B-16 Table of POIQE Index for the Validation Experiment
Quantization factor  Bus    Cars   Computer  Table  Mouse
1                    100    100    100       100    100
2                    97     92     97        88     96
3                    89     83     90        78     88
4                    83     76     84        63     81
5                    79     68     79        57     76
6                    74     63     72        48     70
7                    76     58     70        42     67
8                    65     51     68        41     62
9                    67     49     68        41     56
10                   63     44     61        30     53
11                   59     38     56        22     50
12                   61     38     50        21     45
13                   52     37     41        25     38
14                   51     32     35        16     36
15                   49     27     34        17     37
16                   48     22     35        20     29
17                   46     19     34        14     29
18                   40     15     28        6      33
19                   32     12     32        5      31
20                   40     9      40        5      26
22.5                 24     5      45        5      13
25                   27     5      33        5      11
27.5                 24     5      20        5      8
30                   18     5      10        5      5
32.5                 13     5      25        5      5
35                   5      5      30        5      5
Appendix C Video Format and Color Spaces of Composite and Component TV Systems
Existing color TV systems can be classified as composite systems and component systems. The major difference between the two is that the luminance and chrominance signals of a composite system are encoded into the same channel, whereas those of a component system are transmitted separately [HPN97]. Haskell, Puri, and Netravali [HPN97] provide a very good summary of the conversions between the color spaces used in these systems.
Composite system
Some common analog composite systems are NTSC, PAL and SECAM. Standardized in 1953, NTSC is mostly used in North America, South America, the Caribbean and Japan [HPN97]. PAL, on the other hand, is commonly used in Western Europe, and SECAM is more common in France, Russia, the Middle East and Eastern Europe [NH95]. The following provides the color space conversions to RGB signals. Please note that R'G'B' is gamma-corrected RGB.
PAL System (YUV color space)
Y = 0.299 R' + 0.587 G' + 0.114 B'
U = -0.147 R' - 0.289 G' + 0.436 B'
V = 0.615 R' - 0.515 G' - 0.100 B'
R' = 1.0 Y + 1.140 V
G' = 1.0 Y - 0.394 U - 0.581 V
B' = 1.0 Y + 2.030 U
where Y is Luminance signal, U is Hue signal, V is Saturation signal
NTSC System (YIQ color space)
where Y is Luminance, I is Inphase, Q is Quadrature
SECAM System (YDrDb color space)
where Y is Luminance signal Db is Dr is
Component system
The component system is used in digital applications, such as JPEG and MPEG compressions.
YCrCb color space (CCIR-601)
R' = 1.164 (Y - 16) + 1.596 (Cr - 128)
G' = 1.164 (Y - 16) - 0.813 (Cr - 128) - 0.392 (Cb - 128)
B' = 1.164 (Y - 16) + 2.017 (Cb - 128)
where Y is Luminance signal, Cr is Red color difference signal, Cb is Blue color difference signal
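The YCrCb to R'G'B' conversion above can be written as a short function. The Python sketch below is an illustration only: it uses the coefficients given above, and the function name and the clamping of results to the range 0 to 255 are assumptions added for this example.

```python
def ycrcb_to_rgb(y, cr, cb):
    """Convert one CCIR-601 YCrCb sample to gamma-corrected R'G'B'.

    Uses the coefficients listed above. Clamping to [0, 255] is an
    assumption of this sketch, not something stated in the text.
    """
    r = 1.164 * (y - 16) + 1.596 * (cr - 128)
    g = 1.164 * (y - 16) - 0.813 * (cr - 128) - 0.392 * (cb - 128)
    b = 1.164 * (y - 16) + 2.017 * (cb - 128)
    return tuple(min(255.0, max(0.0, c)) for c in (r, g, b))


# Y = 16 with neutral chroma (128, 128) is the black level.
print(ycrcb_to_rgb(16, 128, 128))
# Y = 235 with neutral chroma is close to full-scale white.
print(ycrcb_to_rgb(235, 128, 128))
```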
CCIR-601 video format
As mentioned, the HVS is more sensitive to variations in luminance information than to variations in chrominance information. The CCIR-601 digital video format samples the chrominance signals at much lower spatial frequencies than the luminance signal, except for format 4:4:4. Figure C-1 shows the structures of various formats for 16-by-16 macroblocks.
Figure C-1 CCIR-601 digital video format
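The effect of chrominance subsampling on a 16-by-16 macroblock can be illustrated by counting samples. The Python sketch below is a hedged illustration: the horizontal and vertical subsampling factors follow the conventional meaning of the format labels and are assumptions here, since the figure itself is not reproduced; the function name is hypothetical.

```python
# Conventional (horizontal, vertical) chroma subsampling factors per label.
# These factors are assumed from common usage, not read from Figure C-1.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:1:1": (4, 1),
}


def macroblock_samples(fmt, size=16):
    """Count luminance and chrominance samples in one size-by-size macroblock."""
    h, v = SUBSAMPLING[fmt]
    luma = size * size
    chroma = (size // h) * (size // v)  # per chrominance component
    return {"Y": luma, "Cb": chroma, "Cr": chroma, "total": luma + 2 * chroma}


for fmt in SUBSAMPLING:
    print(fmt, macroblock_samples(fmt))
```

Under these assumptions, 4:2:0 carries half as many samples per macroblock as 4:4:4, with the saving coming entirely from the chrominance components.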