Dithering and Quantization of audio and image
Maciej Lipiński - Ext 06135
1. Introduction
This project focuses on dithering. The aim of the assignment was to develop a program that quantizes
images and audio signals, adds noise before quantization, and measures the mean square error, so that
the quality of the quantized signals with and without noise can be compared.
The program does the following:
- quantize an image or audio signal using n levels (defined by the user);
- measure the MSE (Mean Square Error) between the original and the quantized signals;
- add uniform noise in [-d/2, d/2], where d is the quantization step size for n levels;
- quantize the signal (image or audio) after adding the noise, using n levels (user defined);
- measure the MSE between the dithered, quantized signal and the original;
- compare results.
The program presents the results graphically, showing the original image/audio, the quantized image/audio
and the image/audio quantized with dither. It also calculates and displays the MSE (mean square error) values.
2. Dithering
Dither is a form of noise, an “erroneous” signal or data, that is intentionally added to sample data in
order to reduce the audible or visible effects of quantization error. It is used in many fields where digital
processing is applied, such as digital audio and images.
Quantizing and re-quantizing digital data introduces error. If that error is correlated with the signal, it
repeats in a regular, cyclical way. In fields where the receptor is sensitive to such patterns, cyclical errors
produce undesirable artifacts. In these fields dither helps by making the distortion less deterministic.
Audio is a primary example: the human ear resolves individual frequencies and is therefore very sensitive
to distortion. When we dither, we add very low-level random noise to the signal in order to mask the
imperfections of digital audio. Noise spread over all frequencies is far less noticeable than distortion, and
it increases the perceived dynamic range.
Audio dithering is commonly used when converting, for instance, from 24-bit to 16-bit. We cut off the last
8 bits of each audio sample, which correspond to the very low-level sounds in the mix. Without dithering we
would lose that information and also introduce errors, which are heard as added harshness and noise. With
dithering, because the dither signal is random, its average is 0; the information contained in the lowest
8 bits of the 24-bit audio modulates the random signal, so that the average is equal to the audio from the
lowest 8 bits.
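The averaging argument can be checked with a minimal Python sketch (Python is used here rather than the report's MATLAB-style notation; the function name `requantize` and the sample value are illustrative assumptions, not part of the original program). A constant sample sits 0.3 of a coarse step above a coarse level: plain rounding always loses that fraction, while rounding after adding uniform noise one step wide returns it on average.

```python
import random

def requantize(value, dither=False, rng=random):
    """Round a value, measured in units of the coarse step, to an integer level."""
    if dither:
        value += rng.uniform(-0.5, 0.5)  # uniform noise, one coarse step wide
    return round(value)

rng = random.Random(0)   # fixed seed, only so that the run is repeatable
sample = 100.3           # hypothetical sample: level 100 plus 0.3 of a step
trials = 20000

plain = [requantize(sample) for _ in range(trials)]
dithered = [requantize(sample, dither=True, rng=rng) for _ in range(trials)]

mean_plain = sum(plain) / trials        # the 0.3 is lost in every trial
mean_dithered = sum(dithered) / trials  # close to 100.3 on average
print(mean_plain, mean_dithered)
```

Each dithered output is still a whole level (100 or 101); only the long-run average carries the low-level information, which is why the price is an audible noise floor.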
Dithering of images, on the other hand, is a technique used in graphics to create the illusion of color
depth. In a dithered image, colors not available in the palette are approximated by a combination of colored
pixels from within the available palette. The human eye perceives the diffusion as a mixture of the colors
within it. A simple example is an image with only black and white in the color palette: combining black and
white pixels in complex patterns can create the illusion of gray values. Such an example is shown and
described later in this report.
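The black-and-white case can be sketched in a few lines of Python (not the report's program; the function name `to_black_white`, the gray ramp and the random threshold noise are assumptions chosen for illustration). Each column of a gray ramp is reduced to black or white; without dither the ramp collapses to a hard edge, with dither the share of white pixels follows the gray value.

```python
import random

def to_black_white(gray, dither=False, rng=random):
    """Map a gray value in [0, 255] to 0 (black) or 255 (white)."""
    if dither:
        gray += rng.uniform(-128, 128)  # noise one quantization step wide
    return 255 if gray >= 128 else 0

rng = random.Random(1)                   # fixed seed for repeatability
height = 2000                            # pixels per column
ramp = [32 * col + 16 for col in range(8)]   # gray values 16, 48, ..., 240

def white_fraction(dither):
    """Share of white pixels in each column of the ramp."""
    fractions = []
    for gray in ramp:
        whites = sum(to_black_white(gray, dither, rng) == 255 for _ in range(height))
        fractions.append(whites / height)
    return fractions

hard = white_fraction(dither=False)   # all black on the left half, all white on the right
soft = white_fraction(dither=True)    # approximately gray / 256 per column
print(hard)
print([round(f, 2) for f in soft])
```

Seen from a distance, a column with about 30% white pixels reads as a dark gray, which is the illusion described above.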
3. Work Environment
To allow processing of both images and audio signals, the program has two interfaces. The main
window lets the user choose which one to work with:
The interface for image processing looks as follows:
This part of the program quantizes an image with a specific quantization level “N”, which the user sets
in the field at the top left. Valid values are in the range <1,8>. If the number is out of range, the
program shows an error message.
With the button at the top center the user chooses the image to be processed. The loaded file appears
below the button under an uppercase heading, and is described by its two dimensions in pixels, “width”
and “height”.
After selecting the image and setting the quantization level, processing is started with the
“Processing...” button. As a result, two images appear in the window: first the quantized image, and
second the image quantized with added noise (dithered). The MSE is calculated and shown below each of them.
The second option is to work on audio signals, for which the user chooses another interface, shown
below:
As before, the user can choose the audio file to work with. After selection, the interface shows the
original signal and a button for listening to the file through the loudspeakers. In addition, the field
“Audio file properties” lists the basic parameters of the signal:
- Number of samples
- Sampling frequency
- Bits per sample
As in image processing, the user can set the quantization level. Additionally, there is an option to
change the type of noise (dither) that is added to the original signal. The user can choose between:
- White Noise
- Pink Noise
“Processing” reveals further buttons below the corresponding headings, which were previously hidden, for
listening to and viewing the modified signals. The user also gets the calculated MSE values for both cases.
Pushing “show” opens a new window with two plots, the first of the quantized signal and the second of the
signal quantized with added dither:
4. Algorithm
Quantization is the process of approximating a continuous range of values by a small set of discrete
symbols or integer values. In the program it is described by the quantization level “N”, which sets the
number of possible values (in the case of images). The distance between possible values is given by the
variable “step”:
Step = 256/N,
where N is a number from <1,8>.
The original audio/image and its quantized and quantized-with-dither versions are all kept as
matrices of the same size. Quantization of the original image/audio therefore reduces the values of the
original matrix to the set of values determined indirectly by “N” and “step”:
q_img=floor(image./step).*step+step/2
The noise is added to the original image/audio by drawing one random number for each element of the
image/audio matrix, as follows:
n = (rand(size(image))-0.5).*step;
img_d = image + n;
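The two expressions above can be tried on a handful of values with the following Python sketch (a translation for illustration only; the helper names `quantize` and `add_dither` and the sample pixel values are assumptions). As the program description in the introduction states, the noisy signal is quantized afterwards with the same quantizer.

```python
import math
import random

def quantize(values, step):
    """Python counterpart of floor(image./step).*step + step/2."""
    return [math.floor(v / step) * step + step / 2 for v in values]

def add_dither(values, step, rng):
    """Python counterpart of image + (rand(size(image)) - 0.5).*step."""
    return [v + (rng.random() - 0.5) * step for v in values]

N = 4
step = 256 / N                      # 64.0
row = [0, 63, 64, 100, 200, 255]    # hypothetical pixel values

q_row = quantize(row, step)
rng = random.Random(0)              # fixed seed for repeatability
d_row = quantize(add_dither(row, step, rng), step)

print(q_row)   # [32.0, 32.0, 96.0, 96.0, 224.0, 224.0]
print(d_row)   # same set of levels, but a pixel may land one level up or down
```

Every output is the center of a cell of width `step`, so plain quantization is never off by more than step/2, while the dithered result can be off by up to one full step.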
The mean square error, which is the mean of the squared differences between original and processed
samples, is computed as:
error = mean((q_img(:) - image(:)).^2)
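A single MSE figure is the average of the squared differences over all samples. A minimal Python sketch of that calculation follows (the function name `mse` and the four sample values are illustrative assumptions).

```python
def mse(original, processed):
    """Mean of the squared sample-by-sample differences."""
    if len(original) != len(processed):
        raise ValueError("signals must have the same length")
    return sum((p - o) ** 2 for o, p in zip(original, processed)) / len(original)

image = [0, 63, 64, 100]            # hypothetical pixel values
q_img = [32.0, 32.0, 96.0, 96.0]    # the same pixels quantized with step 64
print(mse(image, q_img))            # (1024 + 961 + 1024 + 16) / 4 = 756.25
```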
For audio processing the amplitude of the signal is in the range <-1,1>, so “step” is defined
differently:
step = 2/Nlevel,
where “Nlevel” is the value set by the user in the interface field. The user can also choose the type of
noise to be added. Uniform noise is added to the original signal in the range [-step/2, step/2]. The code
is the same as in the previous example.
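The audio case can be sketched in Python in the same way (an illustration, not the original program: the 440 Hz test tone, its amplitude and the helper names are assumptions, and only white uniform noise is shown).

```python
import math
import random

def quantize(samples, step):
    """Same quantizer as for images: cell index times step, plus half a step."""
    return [math.floor(s / step) * step + step / 2 for s in samples]

def mse(original, processed):
    return sum((p - o) ** 2 for o, p in zip(original, processed)) / len(original)

n_level = 10
step = 2 / n_level                  # amplitude range <-1,1> split into n_level cells
rate, freq = 8000, 440              # hypothetical one-second test tone
signal = [0.9 * math.sin(2 * math.pi * freq * t / rate) for t in range(rate)]

rng = random.Random(0)              # fixed seed for repeatability
noise = [(rng.random() - 0.5) * step for _ in signal]   # uniform in [-step/2, step/2]

quantized = quantize(signal, step)
dithered = quantize([s + n for s, n in zip(signal, noise)], step)

mse_q = mse(signal, quantized)
mse_d = mse(signal, dithered)
print(mse_q, mse_d)
```

With this quantizer the dithered MSE comes out larger than the plain one, as in the audio results table; the benefit of dither is perceptual, not numerical.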
5. Results (Image processing)
The aim was to focus on one part, audio or image, compare the results and draw conclusions.
The experiments were made mostly on images. The simulations were run on different kinds of images
(jpg, tiff, bmp), of different sizes, with different backgrounds, and for different quantization
levels. The calculated MSE values are collected in the table below:
                                       Mean Square Error
Image                Size        N     Quantized image    Quantized with dither
xray.bmp             256×256     3     905.76             1150.36
                                 5     310.606            421.679
                                 7     155.37             218.816
lena.tif             512×512     3     510.255            1077.06
                                 5     264.921            475.021
                                 7     104.113            218.08
boats.tif            640×640     3     607.601            1159.44
                                 5     249.593            454.681
                                 7     105.674            212.127
frame.tif            352×240     3     488.858            897.732
                                 5     225.056            410.695
                                 7     101.704            206.585
connie_nielsen.jpg   348×200     3     871.221            1197.46
                                 5     312.273            450.398
                                 7     156.769            230.061
Berries.jpg          348×277     3     852.843            1124.47
                                 5     298.797            426.742
                                 7     151.521            221.2
Abstract.jpg         640×480     3     605.871            1036.26
                                 5     214.249            391.663
                                 7     109.797            206.954
mountain.jpg         800×600     3     666.446            1018.58
                                 5     225.595            381.829
                                 7     119.371            206.734
porshe.jpg           1024×768    3     643.161            961.988
                                 5     247.759            393.375
                                 7     121.905            201.744
tree.jpg             1024×786    3     652.406            1078.78
                                 5     244.501            420.464
                                 7     120.453            214.589
The table does not convey subjective information. From the objective point of view it can be noticed
that:
- for images with little contrast, the smaller the size, the bigger the MSE of the quantized image
- for all kinds of images the results for quantization with dither are relatively close
- the MSE is much higher for the images with added dither
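The size of this gap has a standard explanation. If pixel values are assumed to be spread evenly across each quantization cell, plain quantization has an expected MSE of step²/12, and uniform dither of width step that is not subtracted afterwards adds as much again, giving step²/6. For N = 5 (step = 51.2) these are about 218 and 437, in line with the N = 5 rows of the table. The Python sketch below checks this on synthetic, uniformly distributed pixel values (an assumption; real images are not uniform, and results are not clipped to the range 0 to 255 here).

```python
import math
import random

def quantize(value, step):
    """floor(value / step) * step + step / 2, as in the report's expression."""
    return math.floor(value / step) * step + step / 2

rng = random.Random(0)              # fixed seed for repeatability
N = 5
step = 256 / N
pixels = [rng.uniform(0, 256) for _ in range(100000)]   # synthetic test data

# MSE of plain quantization against the original values
err_q = sum((quantize(p, step) - p) ** 2 for p in pixels) / len(pixels)

# MSE when uniform noise in [-step/2, step/2) is added before quantizing
err_d = sum((quantize(p + (rng.random() - 0.5) * step, step) - p) ** 2
            for p in pixels) / len(pixels)

print(err_q, step ** 2 / 12)   # simulated vs. expected, plain quantization
print(err_d, step ** 2 / 6)    # simulated vs. expected, with dither
```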
It is now worth looking more closely at some of the results. The presented images are quantized with
level 5. The original image is shown at the top; below it, the quantized image is on the left and the image
quantized with added dither on the right.
The first two are grayscale images. The results are presented below. As we can see, quantization
“cuts off” the available values from the original palette to a limited palette. For grayscale images the
effect of quantization is very visible.
[Figures: boats.tif, xray.bmp]
In both examples we can definitely say that quantization with added dither is noticeably better than
plain quantization. In the quantized images there are very clear boundaries between different color values.
Although the MSE results were higher, the image quantized with dither is more accessible, clearer, or
simply better in subjective opinion. The boundaries between colors are smooth, but on the other hand
individual “pixels” are visible.
[Figure: berries.jpg]
In this example the processed image has clear color, little contrast, and monotonous content. As we can
see, almost the whole background is red in various shades. Although its MSE values were among the highest,
subjectively the result is one of the best when compared to the original. The original image is more
saturated than the other two. Both processed images are perceptually similar, and the effect of dithering
is not as visible and clear as in the other examples.
[Figure: frame1.tif]
This image is characterized by lots of detail and contrast, and is colorful. The effect of quantization is
good, although the result with dither is better still. All motifs in the quantized image are filled with
uniform color, and boundaries can be seen in some parts. Because of the added noise, the dithered image is
not as “clear” as the quantized one, and “pixels” can be noticed spread around the background, but the
spectrum of available colors is wider.
[Figures: mountain.jpg, porshe.jpg]
The last examples further confirm that adding dither to an image before it is quantized improves its
visual quality. In my opinion landscapes are the best examples for showing the advantages of adding dither
to images. In both, we can see smoothness in the details, without any visible boundaries. The images
quantized with dither look more natural and closer to the original. There are no unpleasant jumps in color,
and the images simply look better.
6. Conclusions (Image processing)
Summing up the experiments, a few main conclusions should be emphasized:
- the obtained results do not depend on the format of the image
- the number of quantization levels has a direct influence on the quantized and dithered image
- the calculated MSE depends on the number of levels (the bigger N, the lower the MSE)
- the higher N, the smoother and clearer the resulting image
- the subjective quality is connected with the content of the image
- the fewer details, the better the quality (e.g. berries.jpg)
- the MSE value does not determine the perceived quality, although for a given method a lower MSE
goes with a better image
- adding dither significantly improves the subjective quality
7. Results and Conclusions in Audio Processing
For the audio processing part, a few experiments were done with different kinds of audio files:
band.wav, a song performed by various instruments; malevoc.wav, male singing; synthetic.wav, a song
made electronically. The results are collected in the table below:
                              Mean Square Error
Audio            N level    Quantized audio    Quantized with dither
band.wav         5          0.0223514          0.034087
                 10         0.00411161         0.00732908
                 30         0.00039174         0.000755745
                 50         0.000138395        0.000270658
                 80         0.0000532125       0.000105396
malevoc.wav      5          0.0354698          0.0396358
                 10         0.00791151         0.00966012
                 30         0.000620619        0.000932219
                 50         0.000186009        0.000309043
                 80         0.0000627444       0.000112866
synthetic.wav    5          0.0203037          0.0312576
                 10         0.00452077         0.00749034
                 30         0.00422205         0.000777958
                 50         0.000144045        0.000274303
                 80         0.0000544256       0.000105647
Conclusions
The objective results for the same levels are quite similar; the MSE values are not widely spread.
The calculations were made for the signal quantized with added white noise. Changing to pink noise has
little influence on the MSE values.
From the subjective point of view the quality of the obtained signals increases with the level set by the user:
- the higher the level, the smaller the step, the more accurate the processing, and the better the
subjective quality
- at low levels the sound is crackly and very unpleasant, and is perceived as louder and noisier
- the sound, especially the background, becomes more accessible at higher levels
It is worth noting that the audio quantized with dither, compared to the plainly quantized audio, is
clearer in terms of the content of the audio. A significant persistent noise is heard in the background, but
overall the result can be said to be better. The content of the plainly quantized audio is “dirty” and
irritating to listen to. Choosing a different kind of noise has no distinct perceptual influence on the
processed signal.