
Dithering and Quantization of audio and image

Maciej Lipiński - Ext 06135

1. Introduction

This project focuses on dithering. The aim of the assignment was to develop a program that quantizes images and audio signals, adds noise before quantization, and measures the mean square error, so that the quality of the quantized signals with and without noise can be compared.

The program does the following:

- quantize an image or audio signal using n levels (defined by the user);

- measure the MSE (Mean Square Error) between the original and the quantized signals;

- add uniform noise in [-d/2, d/2], where d is the quantization step size for n levels;

- quantize the signal (image or audio) after adding the noise, using n levels (user defined);

- measure the MSE by comparing the noise-quantized signal with the original;

- compare results.

The program displays the results graphically, showing the original image/audio, the quantized image/audio, and the image/audio quantized with dither. It also calculates and displays the MSE (mean square error) values.

2. Dithering

Dither is a form of noise, an “erroneous” signal or data that is intentionally added to sample data in order to make quantization error less perceptible. It is used in many fields that rely on digital processing, such as digital audio and images.

Quantization and re-quantization of digital data introduce error. If that error is correlated with the signal, it repeats in a regular, predictable pattern. In some fields, especially where the receptor is sensitive to such patterns, cyclical errors produce undesirable artifacts. In these fields dither helps by turning them into less deterministic distortion.

Audio is a primary example of this: the human ear hears individual frequencies and is therefore very sensitive to distortion. When we dither, we add very low-level random noise to the signal in order to mask the imperfections of digital audio. The ear is far less sensitive to noise spread over all frequencies, and this increases the perceived dynamic range.

Audio dithering is commonly used when converting, for instance, from 24-bit to 16-bit. The lowest 8 bits of information are cut off from each audio sample; these bits correspond to very low-level sounds in the mix. Without dithering we would lose that information and also introduce errors, which sound like added harshness and noise. With dithering, the added signal is random, so its average is 0. The information contained in the lowest 8 bits of the 24-bit audio modulates the random signal, so that the average is equal to the audio from the lowest 8 bits.
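The averaging argument can be illustrated with a minimal MATLAB sketch. It is not taken from the project code: the constant test signal, the use of `round` for the dithered path, and all variable names are assumptions made for the illustration.

```matlab
% Sketch: reducing 24-bit integer samples to 16-bit, with and without dither.
% The constant test signal and all variable names are illustrative assumptions.
num_samples = 100000;
x24 = 100 * ones(num_samples, 1);   % constant level below one 16-bit step
lsb = 2^8;                          % one 16-bit step in 24-bit units (256)

% Plain truncation: the lowest 8 bits are simply dropped.
x16_trunc = floor(x24 / lsb);

% Dithered version: uniform noise one step wide is added before rounding.
dither = (rand(num_samples, 1) - 0.5) * lsb;
x16_dith = round((x24 + dither) / lsb);

% Scale back to 24-bit units and compare the averages with the original.
mean_trunc = mean(x16_trunc) * lsb;   % the low-level signal is lost
mean_dith  = mean(x16_dith) * lsb;    % stays close to the original level
fprintf('truncated: %.1f, dithered: %.1f\n', mean_trunc, mean_dith);
```

Each dithered sample is still only 0 or 1 in 16-bit units, but the proportion of ones follows the original level, which is why the average survives.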

On the other hand, dithering images is a technique used in graphics to create the illusion of color depth. In

a dithered image, colors not available in the palette are approximated by a combination of colored pixels from

within the available palette. The human eye perceives the diffusion as a mixture of the colors within it. A

simple example is an image with only black and white in the color palette. Combining black and white pixels in complex patterns can create the illusion of gray values. Such an example is shown and described later in this report.
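The black-and-white case can be tried out with a short MATLAB sketch. The synthetic gray ramp, the block width used for averaging, and the variable names are assumptions for the illustration; random noise is used here rather than a designed dither pattern.

```matlab
% Sketch: a gray ramp reduced to pure black and white, with and without noise.
% Image size, block width and variable names are illustrative assumptions.
n_cols = 256;
n_rows = 64;
ramp = repmat(linspace(0, 255, n_cols), n_rows, 1);   % horizontal gray ramp

% Plain thresholding gives one hard edge in the middle of the ramp.
bw_plain = 255 * (ramp >= 128);

% Adding uniform noise before thresholding scatters black and white pixels
% so that their local density follows the original gray value.
noise = (rand(n_rows, n_cols) - 0.5) * 256;
bw_dith = 255 * (ramp + noise >= 128);

% Average over 32-pixel-wide blocks to mimic viewing from a distance.
block = 32;
avg_orig  = mean(reshape(mean(ramp, 1), block, []), 1);
avg_plain = mean(reshape(mean(bw_plain, 1), block, []), 1);
avg_dith  = mean(reshape(mean(bw_dith, 1), block, []), 1);
disp(round([avg_orig; avg_plain; avg_dith]));
```

The block averages of the plainly thresholded ramp are either 0 or 255, while those of the noisy version should follow the original gray values closely.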

3. Work Environment

To allow processing of either images or audio signals, the program has two interfaces. The main window lets the user choose which one to work with.

The image processing interface works as follows.

This part of the program quantizes an image with a specific quantization level “N”, which the user enters in the field at the top left. Valid values are in the range <1,8>. If the number is out of range, the program shows an error message.

With the button at the top center, the user chooses the image to be processed. The loaded file appears below the button under an uppercase heading. It is described by its two dimensions in pixels, “width” and “height”.

After selecting the image and setting the quantization level, the user starts processing with the “Processing...” button. As a result, two images appear in the window: first the quantized image, and second the quantized image with added noise (dithered). The calculated MSE is shown below each image.

The second option is to work on audio signals, for which the user chooses the other interface.

As before, the user can choose the audio file to work with. After a file is selected, the interface shows the original signal and a button for playing the file through the loudspeakers. In addition, the “Audio file properties” field shows the basic parameters of the signal:

- Number of samples

- Sampling frequency

- Bits per sample

As in image processing, the user can set the quantization level. In addition, there is an option to change the type of noise (dither) that is added to the original signal. The user can choose between:

- White Noise

- Pink Noise

“Processing” reveals further, previously hidden buttons below the corresponding headings, for listening to and viewing the modified signals. The user also gets the calculated MSE values for both cases. Pressing “show” opens a new window with two plots: the first of the quantized signal and the second of the signal quantized with added dither.

4. Algorithm

Quantization is the process of approximating a continuous range of values by a small set of discrete symbols or integer values. In the program it is described by the quantization level “N”, where N is the number of possible values (in the case of images). The distance between possible values is given by the variable “step”:

step = 256/N,

where N is a number from the range <1,8>.

The original audio/image as well as the quantized and the quantized-with-dither versions are stored as matrices of the same size. Quantization of the original image/audio therefore reduces the values of the original matrix to the set of values determined indirectly by “N” and “step”:

q_img=floor(image./step).*step+step/2
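As a small worked example, the sketch below applies the quantization formula to a handful of gray values. The choice N = 4 and the sample values are assumptions for the illustration, and the variable is called `img` here only to avoid shadowing MATLAB's built-in `image` function.

```matlab
% Sketch: the quantization formula applied to a few sample gray values.
% N = 4 and the sample values are illustrative assumptions.
N = 4;
step = 256 / N;                      % 64 gray values per interval
img = [0 10 63 64 100 128 200 255];

% Each value is moved to the middle of the interval it falls into.
q_img = floor(img ./ step) .* step + step / 2;
disp(q_img);

levels = unique(q_img);              % distinct output values
```

For these inputs the result is `32 32 32 96 96 160 224 224`, that is, N distinct values, each at most step/2 away from the input.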

The noise is added to the original image/audio by drawing one uniform random number for every element of the matrix (image/audio) and scaling it to the range [-step/2, step/2], as follows:

n = (rand(size(image))-0.5).*step;

img_d = image + n;
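The noisy signal is then quantized with the same step. A minimal sketch of the whole dithering path is given below; the synthetic ramp that stands in for a loaded image and the clipping to [0, 255] are assumptions of this sketch, not something stated in the project.

```matlab
% Sketch: add uniform noise, then quantize with the same step.
% The synthetic ramp and the clipping step are illustrative assumptions.
N = 5;
step = 256 / N;
img = repmat(linspace(0, 255, 512), 64, 1);   % stands in for a loaded image

n = (rand(size(img)) - 0.5) .* step;          % uniform noise in [-step/2, step/2]
img_d = img + n;

% Keep the noisy values inside the valid gray range before quantizing.
img_d = min(max(img_d, 0), 255);

q_dith = floor(img_d ./ step) .* step + step / 2;
fprintf('distinct levels: %d\n', numel(unique(q_dith(:))));
```

Without the clipping, noisy values below 0 or above 255 would be mapped to levels outside the original gray range.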

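The Mean Square Error is the mean of the squared differences between the original and the processed samples. The squared differences are calculated first and then averaged over all samples:

error = (q_img - image).^2;

mse = mean(error(:));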
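The MSE calculation can be checked by hand on a tiny example. In the sketch below, the helper name `mse_of` and the sample matrices are hypothetical. The conversion to `double` is included because image data is often loaded as `uint8`, for which subtraction saturates at 0.

```matlab
% Sketch: MSE as the mean of the squared differences over all samples.
% The helper name mse_of and the sample matrices are illustrative assumptions.
mse_of = @(a, b) mean((double(a(:)) - double(b(:))) .^ 2);

original  = [10 20 30; 40 50 60];
processed = [12 20 27; 40 54 60];

% Squared differences are 4, 0, 9, 0, 16 and 0, so the mean is 29 / 6.
err = mse_of(original, processed);
fprintf('MSE = %.4f\n', err);
```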

In the case of audio processing, the amplitude of the signal is in the range <-1,1>, so “step” is defined differently:

step = 2/Nlevel,

where “Nlevel” is the value set by the user in the interface field. The user can choose the type of noise to be added. Uniform noise in the range [-step/2, step/2] is added to the original signal. This part of the code is the same as in the previous example.
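The audio path can be sketched in the same way. The report does not say how the pink noise is generated, so the FFT-based spectral shaping below, the scaling of the pink noise to a peak of step/2, the test tone, and all variable names are assumptions made for this illustration.

```matlab
% Sketch: audio quantization with white or pink dither.
% The test tone, the FFT-based pink noise and all names are assumptions.
fs = 8000;                                 % sampling frequency in Hz
t = (0:fs-1)' / fs;                        % one second of samples
audio = 0.8 * sin(2 * pi * 440 * t);       % stands in for a loaded file

Nlevel = 10;
step = 2 / Nlevel;

% White dither: uniform in [-step/2, step/2], as for images.
white = (rand(size(audio)) - 0.5) .* step;

% Pink dither: white Gaussian noise whose spectrum is scaled by 1/sqrt(f),
% then rescaled so that its peak magnitude is step/2.
num = numel(audio);
spec = fft(randn(num, 1));
k = [0:num/2, num/2-1:-1:1]';              % frequency index, mirrored for negative bins
shape = 1 ./ sqrt(max(k, 1));              % avoids division by zero at DC
pink = real(ifft(spec .* shape));
pink = pink / max(abs(pink)) * (step / 2);

quantize = @(x) floor(x ./ step) .* step + step / 2;
q_plain = quantize(audio);
q_white = quantize(audio + white);
q_pink  = quantize(audio + pink);
```

Pink noise concentrates its power at low frequencies, whereas white noise spreads it evenly over the spectrum.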

5. Results (Image processing)

The main aim was to focus on one part, audio or image, compare the results, and draw conclusions. Most of the experiments were made on images. The simulations were done with different kinds of images (jpg, tiff, bmp), different sizes, different backgrounds, and different quantization levels. The calculated MSE values are collected in the table below:

Mean Square Error (image)

Image                Size       N    Quantized image   Quantized with dither
xray.bmp             256×256    3    905.76            1150.36
                                5    310.606           421.679
                                7    155.37            218.816
lena.tif             512×512    3    510.255           1077.06
                                5    264.921           475.021
                                7    104.113           218.08
boats.tif            640×640    3    607.601           1159.44
                                5    249.593           454.681
                                7    105.674           212.127
frame.tif            352×240    3    488.858           897.732
                                5    225.056           410.695
                                7    101.704           206.585
connie_nielsen.jpg   348×200    3    871.221           1197.46
                                5    312.273           450.398
                                7    156.769           230.061
Berries.jpg          348×277    3    852.843           1124.47
                                5    298.797           426.742
                                7    151.521           221.2
Abstract.jpg         640×480    3    605.871           1036.26
                                5    214.249           391.663
                                7    109.797           206.954
mountain.jpg         800×600    3    666.446           1018.58
                                5    225.595           381.829
                                7    119.371           206.734
porshe.jpg           1024×768   3    643.161           961.988
                                5    247.759           393.375
                                7    121.905           201.744
tree.jpg             1024×786   3    652.406           1078.78
                                5    244.501           420.464
                                7    120.453           214.589

The table does not capture subjective impressions. From an objective point of view it can be noticed that

- for images with little contrast, the smaller the image, the larger the MSE of the quantized image

- for all kinds of images, the results for quantization with dither are relatively close to each other

- the MSE is considerably higher for the images with added dither, roughly 1.3 to 2.1 times the MSE of plain quantization
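These observations agree with a standard estimate for uniform quantization: if the sample values are spread evenly within each quantization interval, the expected MSE is step²/12 without dither and step²/6 with uniform dither of width step. For N = 7 this gives about 111 and 223, close to the tabulated values. The sketch below checks the estimate on uniformly distributed synthetic data. Both the synthetic data and the uniformity assumption are illustrative, and real images deviate from them.

```matlab
% Sketch: expected MSE for uniform quantization with and without dither,
% checked on uniformly distributed synthetic data (an assumption; real
% images are not uniformly distributed, so their MSE will differ).
N = 7;
step = 256 / N;
img = 256 * rand(512, 512);                   % synthetic values in [0, 256)

quantize = @(x) floor(x ./ step) .* step + step / 2;
noise = (rand(size(img)) - 0.5) .* step;

mse_plain = mean((quantize(img(:)) - img(:)) .^ 2);
mse_dith  = mean((quantize(img(:) + noise(:)) - img(:)) .^ 2);

fprintf('plain:    %.1f (step^2/12 = %.1f)\n', mse_plain, step^2 / 12);
fprintf('dithered: %.1f (step^2/6  = %.1f)\n', mse_dith,  step^2 / 6);
```

Under these assumptions, dithering roughly doubles the MSE, which is consistent with the upper end of the ratios seen in the table.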

It is worth looking more closely at some of the results. The images presented here were quantized with 5 levels. The original image is shown at the top; below it, the quantized image is on the left and the quantized image with added dither is on the right.

The first two are grayscale images. As we can see, quantization is a process that “cuts down” the values available in the original palette to a limited palette. For grayscale images the effect of quantization is very visible.

[Figures: boats.tif, xray.bmp]

In both examples we can definitely say that quantization with added dither looks better than plain quantization. In the quantized images there are very clear boundaries between different color values. Although the MSE values were higher, the image quantized with dither is more pleasant and clearer, or simply better in subjective opinion. The boundaries between colors are smooth, but on the other hand individual “pixels” are visible.

[Figures: berries.jpg, frame1.tif]

In this example, an image with clean colors, little contrast, and monotonous content was processed. As we can see, almost the whole background is red in various shades. Although its MSE values were among the highest, the subjective impression is one of the best when compared with the original. The original image is more saturated than the other two. Both processed images are perceptually similar, and the effect of dithering is not as visible and clear as in the other examples.

This image is characterized by a lot of detail and contrast, and it is colorful. The result of quantization is good, and with dither it is better still. In the quantized image, all objects are filled with flat color, and boundaries are visible in some parts. Because of the added noise, the dithered image is not as “clean” as the quantized one, and “pixels” can be seen spread around the background, but the range of available colors appears wider.

[Figures: mountain.jpg, porshe.jpg]

The last examples are another confirmation that adding dither to an image before quantization improves its visual quality. In my opinion, landscapes are the best examples for showing the advantages of adding dither to images. In both, we can see smoothness in the details, without any visible transitions. The images quantized with dither look more natural when compared with the original. There are no unpleasant jumps between colors, and the images simply look better.

6. Conclusions (Image processing)

Summing up the experiments, a few main conclusions should be emphasized:

- the obtained results do not depend on the format of the image

- the number of quantization levels directly influences both the quantized and the dithered image

- the calculated MSE depends on the number of levels (the larger N, the lower the MSE)

- the higher N, the smoother and clearer the resulting image

- the subjective quality depends on the content of the image

- the fewer details, the better the quality (e.g. berries.jpg)

- the mean square error is not a reliable indicator of perceived quality, although a lower MSE generally goes with better image quality

- adding dither significantly improves the subjective quality

7. Results and Conclusions in Audio Processing

For the audio processing part, a few experiments were done with different kinds of audio files: band.wav (a song performed by various instruments), malevoc.wav (male singing), and synthetic.wav (an electronically produced song). The results are collected in the table below:

Mean Square Error (audio)

Audio           N level   Quantized audio   Quantized with dither
band.wav        5         0.0223514         0.034087
                10        0.00411161        0.00732908
                30        0.00039174        0.000755745
                50        0.000138395       0.000270658
                80        0.0000532125      0.000105396
malevoc.wav     5         0.0354698         0.0396358
                10        0.00791151        0.00966012
                30        0.000620619       0.000932219
                50        0.000186009       0.000309043
                80        0.0000627444      0.000112866
synthetic.wav   5         0.0203037         0.0312576
                10        0.00452077        0.00749034
                30        0.00422205        0.000777958
                50        0.000144045       0.000274303
                80        0.0000544256      0.000105647

Conclusions

The objective results for the same levels are quite similar across the files; the MSE values do not vary much. The calculations were made for signals quantized with added white noise. Switching to pink noise has little influence on the MSE values.

From the subjective point of view, the quality of the obtained signals increases with the value set by the user:

- the higher the level, the smaller the step, the more accurate the processing, and the better the subjective quality

- at low levels the sound is crackly and very unpleasant, and it is perceived as louder and noisier

- the sound, especially the background, becomes clearer at higher levels

It is worth noting that the audio quantized with dither, compared with plainly quantized audio, is clearer as far as the content of the audio is concerned. A significant, persistent noise is heard in the background, but overall the result is better. The content of the plainly quantized audio sounds “dirty” and is irritating to listen to. Choosing a different kind of noise has no distinct perceptual influence on the processed signal.
