image compression using hopfield neural network
Faculty of Science and Technology
School of Computing, Engineering and Physical Sciences
ZHENG Zong Bin
Image Compression Using Hopfield Neural Network
(EL3990)
Submitted in partial satisfaction of the
Requirements for the degree of
Bachelor of Engineering (with Honors)
In Digital Communications
April 2009
I declare that all material contained in this report, including ideas described in the text, computer programs and drawings, is my own work except where explicitly and individually acknowledged.
Signed …………………….
Date ……………………….
Abstract
Image compression technology is playing an increasingly vital role all over the world.
The aim of this project is to develop an image compression technique using the
Hopfield Neural Network.
In this dissertation, image compression is based on an algorithm named Block Truncation
Coding (BTC), a coding technique that is simple and effective. The next step is to use a
Hopfield Neural Network to define a new threshold that optimizes the reconstruction
quality. In both BTC and HNNBTC the block size is fixed.
Additionally, the Hopfield Neural Network and Block Truncation Coding are used to compress
color images in different color spaces, such as RGB, YUV, YIQ, and HSV.
Finally, a variable block size is tried in order to achieve more compression and a
better quality of the compressed image.
The above algorithms are implemented in Matlab.
Acknowledgements
I could not have finished this project without my Project Supervisor, Dr. Martin Ray
Varley, who helped me through all stages of this project. Without his valuable advice and
considerable patience, I would not have been able to complete this project successfully.
I also thank Dr. Chen Xin for his advice on programming.
Finally, I especially thank my parents and girlfriend for their love, encouragement, and
support all the way through my life.
Contents
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Project Objectives
1.3 Compression System Model
1.3.1 Model
1.3.2 Redundancy Types
1.3.3 Lossless & Lossy
1.4 Principle of Image Compression
1.4.1 Encoding
1.4.2 Decoding
1.5 Principle of Hopfield Neural Network
1.6 Information Measurement
1.7 Outline of Dissertation
1.8 Test Images
Chapter 2 Block Truncation Coding
2.1 Introduction
2.2 Basic BTC Algorithm
2.3 Process
2.4 Program Flowchart
2.5 Result Analysis
2.5.1 Cameraman
2.5.2 Pirate
2.5.3 Woman_darkhair
2.6 Summary
Chapter 3 Hopfield Neural Networks
3.1 A Short Introduction about Neural Network
3.2 Basic Algorithm
3.3 Program Flowchart
3.4 Result Analysis
3.4.1 Cameraman
3.4.2 Pirate
3.4.3 Woman_darkhair
3.5 Summary
Chapter 4 Comparison
4.1 Algorithm Compression
4.2 Image Result Comparisons
4.3 Summary
Chapter 5 Color Image Compression
5.1 Color Image Compression Based on RGB Color Space
5.1.1 Introduction
5.1.2 Flowchart
5.1.3 Result Analysis
5.2 Color Image Compression Based on YUV Color Space
5.2.1 YUV Color Space
5.2.2 Flowchart
5.2.3 Result Analysis
5.2.3.1 Lena_color
5.2.3.2 Mandril_color
5.2.4 A Way to Achieve More Compression in YUV Color Space
5.3 Color Image Compression Based on YIQ Color Space
5.4 Color Image Compression Based on HSV Color Space
5.5 Conclusion & Comparisons
Chapter 6 Variable Block Size
6.1 Principle
6.2 Programming
6.3 Practical Works
Chapter 7 Conclusion and Future Work
7.1 Conclusion
7.2 Future Work
References
Appendix A. Statement of Work (SOW)
Appendix B. Gantt chart
List of Figures
Figure 1.1 Compression System Model
Figure 1.2 Lossless & Lossy
Figure 1.3 Encoding
Figure 1.4 Decoding
Figure 1.5 Three Neurons Hopfield Network
Figure 1.6 Bidirectional Connection Diagram
Figure 1.7 State Transitions to Stable State
Figure 1.8 Cameraman
Figure 1.9 Pirate
Figure 1.10 Woman_darkhair
Figure 1.11 Lena_color
Figure 1.12 Mandril_color
Figure 2.1 Original Image
Figure 2.2 Bitmap
Figure 2.3 Reconstructed Image
Figure 2.4 Difference between Original and Reconstructed Image
Figure 2.5 Many Blocks
Figure 2.6 Errors
Figure 2.7 Programming Flowchart
Figure 2.8 Original Image
Figure 2.9 Histogram of Original
Figure 2.10 4*4 Block Size
Figure 2.11 Histogram of 4*4 Block Size
Figure 2.12 8*8 Block Size
Figure 2.13 Histogram of 8*8 Block Size
Figure 2.14 16*16 Block Size
Figure 2.15 Histogram of 16*16 Block Size
Figure 2.16 Original Image
Figure 2.17 4*4 Block Size
Figure 2.18 8*8 Block Size
Figure 2.19 16*16 Block Size
Figure 2.20 Original
Figure 2.21 Intensity Profile of Original
Figure 2.22 4*4 Block Size
Figure 2.23 Intensity Profile of 4*4 Block Size
Figure 2.24 8*8 Block Size
Figure 2.25 Intensity Profile of 8*8 Block Size
Figure 2.26 16*16 Block Size
Figure 2.27 Intensity Profile of 16*16 Block Size
Figure 2.28 Original
Figure 2.29 4*4 Block Size
Figure 2.30 8*8 Block Size
Figure 2.31 16*16 Block Size
Figure 2.32 Original
Figure 2.33 4*4 Block Size
Figure 2.34 Errors
Figure 2.35 8*8 Block Size
Figure 2.36 Errors
Figure 2.37 16*16 Block Size
Figure 2.38 Errors
Figure 3.1 Flowchart
Figure 3.2 Original
Figure 3.3 Histogram of Original
Figure 3.4 4*4 Block Size
Figure 3.5 Histogram of 4*4 Block Size
Figure 3.6 8*8 Block Size
Figure 3.7 Histogram of 8*8 Block Size
Figure 3.8 16*16 Block Size
Figure 3.9 Histogram of 16*16 Block Size
Figure 3.10 Original
Figure 3.11 4*4 Block Size
Figure 3.12 8*8 Block Size
Figure 3.13 16*16 Block Size
Figure 3.14 Original
Figure 3.15 Intensity Profile of Original
Figure 3.16 4*4 Block Size
Figure 3.17 Intensity Profile of 4*4 Block Size
Figure 3.18 8*8 Block Size
Figure 3.19 Intensity Profile of 8*8 Block Size
Figure 3.20 16*16 Block Size
Figure 3.21 Intensity Profile of 16*16 Block Size
Figure 3.22 Original
Figure 3.23 4*4 Block Size
Figure 3.24 8*8 Block Size
Figure 3.25 16*16 Block Size
Figure 3.26 4*4 Block Size
Figure 3.27 Errors
Figure 3.28 8*8 Block Size
Figure 3.29 Errors
Figure 3.30 16*16 Block Size
Figure 3.31 Errors
Figure 4.1 Group 1 Cameraman (BTC)
Figure 4.2 Group 2 Cameraman (HNNBTC)
Figure 4.3 Group 1 Pirate (BTC)
Figure 4.4 Group 2 Pirate (HNNBTC)
Figure 4.5 Woman_darkhair (BTC)
Figure 4.6 Woman_darkhair (HNNBTC)
Figure 5.1 RGB
Figure 5.2 RGB Cube
Figure 5.3 RGB 24-bit Color Cube
Figure 5.4 Flowchart
Figure 5.5 Lena
Figure 5.6 Bitmap_RGB
Figure 5.7 Decoding_RGB
Figure 5.8 Reconstructed
Figure 5.9 Differences
Figure 5.10 4*4 Block Size
Figure 5.11 8*8 Block Size
Figure 5.12 16*16 Block Size
Figure 5.13 4*4 Block Size
Figure 5.14 8*8 Block Size
Figure 5.15 16*16 Block Size
Figure 5.16 BTC
Figure 5.17 HNNBTC
Figure 5.18 Flowchart
Figure 5.19 Original Image
Figure 5.20 YUV Color Image
Figure 5.21 Bitmap_YUV
Figure 5.22 Encoding_YUV
Figure 5.23 Reconstructed YUV Color Image
Figure 5.24 Reconstructed Image
Figure 5.25 4*4 Block
Figure 5.26 8*8 Block
Figure 5.27 16*16 Block
Figure 5.28 4*4 Block Size
Figure 5.29 8*8 Block Size
Figure 5.30 16*16 Block Size
Figure 5.31 4*4 Block Size
Figure 5.32 8*8 Block Size
Figure 5.33 16*16 Block Size
Figure 5.34 4*4 Block Size
Figure 5.35 8*8 Block Size
Figure 5.36 16*16 Block Size
Figure 5.37 4*4 Block Size
Figure 5.38 8*8 Block Size
Figure 5.39 16*16 Block Size
Figure 5.40 4*4 Block Size
Figure 5.41 8*8 Block Size
Figure 5.42 16*16 Block Size
Figure 5.43 Bitmap_YUV’
Figure 5.44 Bitmap_YUV
Figure 5.45 Bitmap_RGB
Figure 5.46 Decoding_YUV’
Figure 5.47 Decoding_YGB
Figure 5.48 Decoding_RGB
Figure 5.49 Group 1 Group 2 Group 3
Figure 5.50 Group 1 Group 2 Group 3
Figure 5.51 RGB Image
Figure 5.52 YIQ Image
Figure 5.53 Y Channel
Figure 5.54 I Channel
Figure 5.55 Q Channel
Figure 5.56 Bitmap_YIQ
Figure 5.57 Decoding_YIQ
Figure 5.58 Reconstructed YIQ Image
Figure 5.59 Reconstructed Image
Figure 5.60 BTC
Figure 5.61 HNNBTC
Figure 5.62 BTC
Figure 5.63 HNNBTC
Figure 5.64 HSV Color Space
Figure 5.65 RGB to HIS Conversion
Figure 5.66 RGB Image
Figure 5.67 HSV Image
Figure 5.68 HSV_HSV
Figure 5.69 BTC
Figure 5.70 HNNBTC
Figure 5.71 BTC
Figure 5.72 HNNBTC
Figure 5.73 RGB
Figure 5.74 YUV
Figure 5.75 YIQ
Figure 5.76 HSV
Figure 6.1 Flowchart
Figure 6.2 Original
Figure 6.3 Bitmap
Figure 6.4 Average of two classes about 8*8 & 4*4 Block Size
List of Tables
Table 2.1 Parameter for ‘Cameraman’
Table 2.2 Parameter of ‘Pirate’
Table 2.3 Parameter of ‘Woman_darkhair’
Table 3.1 Parameter of ‘Cameraman’
Table 3.2 Parameter of ’Pirate’
Table 3.3 Parameter of ‘Woman_darkhair’
Table 4.1 Compare BTC and HNNBTC
Table 4.2 MSE and SNR
Table 5.1 Compare BTC & HNNBTC in RGB Color Space
Table 5.2 Compare BTC & HNNBTC in YUV Color Space
Table 5.3 Comparison
Table 5.4 Compare BTC & HNNBTC in YIQ Color Space
Table 5.5 Compare BTC & HNNBTC in YIQ Color Space
Table 5.6 Compare Color Image Compressions in Different Color Space
Table 5.7 MSE & SNR in Image Compression Using Different Color Space
Chapter 1 Introduction
1.1 Research Background
Digital image processing techniques are being used more and more widely in
many fields, such as multimedia, the Internet, television and fax.
In effect, the objective of image compression is to minimize the amount of data needed
to represent a digital image [1]. Suppose there is a 512 x 512 pixel image in which each
pixel is represented by 8 bits (256 different grey levels). The image then occupies over
2,000,000 bits (256 Kbytes), which clearly requires a large amount of storage. Because of
the huge amount of data to be stored, image compression has become one of the key
techniques in image processing.
Transmitting data in an efficient form is the other motivation. If such an image is to be
transmitted across a channel, 256 Kbytes have to be sent, and cost is a vital aspect that
needs to be considered [1].
In the past twenty years, modern techniques based on neural networks, fractal theory
and the wavelet transform have been successfully applied to image compression.
This project focuses on the application of the Hopfield Neural Network to image
compression. The basic compression method is Block Truncation Coding, which was
developed at the end of the 1970s at Purdue University [2] (Chapter 2). The next step is to
combine BTC with a Hopfield Neural Network to implement image compression (Chapter 3).
Its main purpose is to define an appropriate computational energy function so that the
network reaches a stable state, which in turn defines a threshold [3].
1.2 Project Objectives
First and foremost, to gain an overview of image compression and understand its scope
and significance.
Second, to understand the principle of image compression.
Third, to start from the basic algorithm called Block Truncation Coding.
Fourth, to investigate another algorithm based on the Hopfield Neural Network.
Fifth, to improve the functions using Matlab.
Sixth, to compare BTC and HNNBTC.
Seventh, to have a complete plan for the whole project, including a risk assessment of
each task.
1.3 Compression System Model
1.3.1 Model
Figure 1.1 Compression System Model
A general compression system model is shown above. f(x,y) represents the original image,
and f’(x,y) is the reconstructed image. The figure illustrates how the data of an original
image are reduced: the image passes through encoding (source and channel encoders) and
decoding (channel and source decoders). The process essentially transforms the
two-dimensional pixel array into a statistically uncorrelated data set [4].
Here, we focus only on the source part, that is, the source encoder and decoder.
1.3.2 Redundancy Types
The aim of source encoding is to remove as much redundancy as possible from the
original image. The redundancy may be classified into three different types [4]:
(1) Coding Redundancy
Huffman coding is the most popular technique for reducing it. The key is to use binary
codes of different lengths to represent the pixel values, which greatly reduces
coding redundancy.
(2) Inter-pixel Redundancy
Adjacent pixels tend to have similar values. Because of this high correlation, a pixel
can be represented by the difference between its value and that of a neighbor.
(3) Psychovisual Redundancy (irrelevancy)
Less important or unimportant elements of a visual image can be removed without
harming its visual quality. For example, the human eye can hardly distinguish an image
quantized to 8 bits per pixel from one quantized to 7 bits per pixel, so the 7-bit image
can be used in place of the 8-bit image. Because the removed information cannot be
recovered, removing psychovisual redundancy is a lossy compression technique.
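As a minimal sketch of the inter-pixel redundancy idea in item (2), the following Python fragment stores the first pixel of a row and then only the differences between neighbors. The function names and the sample grey levels are illustrative assumptions, not part of the project's Matlab programs.

```python
def difference_encode(row):
    """Keep the first pixel and replace every other pixel by its
    difference from the left-hand neighbor."""
    if not row:
        return []
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]


def difference_decode(codes):
    """Rebuild the pixel values by accumulating the differences."""
    row = []
    total = 0
    for code in codes:
        total += code
        row.append(total)
    return row


# Illustrative grey levels of six neighboring pixels (hypothetical values).
row = [156, 157, 160, 159, 158, 155]
codes = difference_encode(row)
print(codes)
print(difference_decode(codes) == row)
```

Because neighboring pixels are similar, the differences are small numbers clustered around zero, which a variable-length code such as Huffman coding can then represent with fewer bits.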
1.3.3 Lossless & Lossy
In 1948, C. E. Shannon formulated the concept of the distortion function in “A
Mathematical Theory of Communication” [5]. It states that there is redundancy in any
information, and that the ratio of redundancy is related to the probability with which
each symbol (number, letter or word) occurs. In 1959, rate distortion theory was
established, which laid the foundation of source coding theory. From this, two basic
approaches to data compression are derived: lossless and lossy [6].
i) Lossless: from an objective point of view, it reduces the amount of data (e.g.
redundancy in space and time) required to represent an image while retaining all the
information in the image. More importantly, there are no errors between the original
and the reconstructed image: the reconstructed image is an exact copy of the
original image.
ii) Lossy: human perception of the information in an image normally does not involve
quantitative analysis of every pixel value in the image. Therefore, unlike the lossless
technique, some information can be eliminated without significantly impairing the
perceived quality of the image. However, the lost information cannot be restored.
Figure 1.2 shows the traditional lossless and lossy compression methods for digital
images [5].
Figure 1.2 Lossless & Lossy
1.4 Principle of Block Truncation Coding
1.4.1 Encoding
Figure 1.3 Encoding
1.4.2 Decoding
Figure 1.4 Decoding
1.5 Principle of Hopfield Neural Network
The Hopfield Neural Network offers ‘good’ image quality through feedback from the
neuron outputs to the inputs [7]. It is a common type of neural network, as shown
below for a network with 3 neurons [7] [8].
Figure 1.5 Three Neurons Hopfield Network
Clearly, this is a single-layer network with feedback. The elements on the left
are not neurons, but merely fan-out elements that enable the diagram to be drawn
clearly.
The diagram below shows a bidirectional connection between any two neurons.
Figure 1.6 Bidirectional Connection Diagram
Each neuron has an equal probability of attempting to fire. Because there are
three neurons, the probability that a given neuron attempts to fire is 1/3.
The activation function for input neurons is [8]:
Where: θ_i represents the threshold of neuron i, w_ij means the weight from neuron j to neuron i, and n is the number of neurons in the network.
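The behavior of such a single-layer feedback network can be sketched in Python as follows. This is only an illustration under stated assumptions: the firing rule (output 1 when the weighted input exceeds the threshold, otherwise 0) and the energy expression are the common textbook forms for a binary Hopfield network, and the weight and threshold values are hypothetical. It is not the project's Matlab implementation.

```python
import random


def energy(state, weights, thresholds):
    """Textbook Hopfield energy for 0/1 states (assumed form)."""
    n = len(state)
    pair_term = sum(weights[i][j] * state[i] * state[j]
                    for i in range(n) for j in range(n))
    threshold_term = sum(thresholds[i] * state[i] for i in range(n))
    return -0.5 * pair_term + threshold_term


def fire(state, weights, thresholds, i):
    """New output of neuron i: 1 if its weighted input exceeds its
    threshold, otherwise 0."""
    net = sum(weights[i][j] * state[j] for j in range(len(state)) if j != i)
    return 1 if net > thresholds[i] else 0


def is_stable(state, weights, thresholds):
    """A state is stable when no neuron would change its output."""
    return all(fire(state, weights, thresholds, i) == state[i]
               for i in range(len(state)))


def run_to_stable_state(state, weights, thresholds, rng, max_steps=1000):
    """Update one randomly chosen neuron at a time until nothing changes."""
    state = list(state)
    energies = [energy(state, weights, thresholds)]
    for _ in range(max_steps):
        if is_stable(state, weights, thresholds):
            break
        i = rng.randrange(len(state))  # each neuron is picked with probability 1/n
        state[i] = fire(state, weights, thresholds, i)
        energies.append(energy(state, weights, thresholds))
    return state, energies


# Illustrative symmetric weights with a zero diagonal (hypothetical values).
WEIGHTS = [[0, 1, -2],
           [1, 0, 1],
           [-2, 1, 0]]
THRESHOLDS = [0, 0, 0]

final_state, energies = run_to_stable_state([1, 1, 1], WEIGHTS, THRESHOLDS,
                                            random.Random(0))
print(final_state, energies)
```

With symmetric weights and a zero diagonal, each single-neuron update can only keep the energy the same or lower it, which is why the recorded energies never increase and the network settles in a stable state.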
According to the computational energy function
\[ E = \sum_{i=1}^{n}\sum_{j=1}^{n} F_i F_j \left[x(i) - x(j)\right]^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} (1 - F_i)(1 - F_j)\left[x(i) - x(j)\right]^2 \]
if a Hopfield neural network contains n neurons, then the output is an n-bit binary
number, and the energy of each of the 2^n states can be calculated. The network shown
previously has three neurons, so there are 2^3 = 8 states. The stored pattern to which
the network converges depends on the input pattern and the connection weight matrix.
A state transition diagram for the Hopfield network can be drawn as follows,
but the states must be arranged in order of decreasing computational energy.
Suppose the energies of the states are ordered as follows:
S1<S2<S3<S4<S5<S6<S7<S8
Figure 1.7 State Transitions to Stable State
Where: Si means state i.
The network eventually settles in a stable state, which corresponds to a minimum of the
computational energy, so S1 is the stable state.
1.6 Information Measurement
The image compression quality depends on the image resolution, which is defined as the
smallest number of discernible line pairs per unit distance [9]. A higher resolution in
the reconstruction means less compression: the compression ratio is smaller, the quality
is better, the MSE is lower and the SNR is higher.
• Compression Ratio
This quantifies how much the image has been compressed. If the original
image takes M bits and the compressed image takes N bits, then the compression ratio is
defined as
\[ \text{Compression ratio} = \frac{M}{N} \]
For example, if each pixel is originally represented by 8 bits and by 4 bits after
compression, the compression ratio is 8/4 = 2.
• Mean Square Error (MSE)
If an original image f(x,y) is compressed and a new image f'(x,y) is obtained after
reconstruction, there will in general be a difference (loss of information) between
corresponding pixels in the two images. The mean square error (MSE) [10], assuming an
image of size M by N pixels, is given by:
\[ MSE = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\left[f(x,y) - f'(x,y)\right]^2 \]
A lower MSE means a better reconstruction in a quantitative sense; ideally it is zero.
• Signal-to-Noise Ratio
The signal-to-noise ratio (SNR) of a reconstructed image effectively interprets all the
errors introduced by the compression as ‘noise’, and the original image f(x,y) as ‘signal’.
Conventionally the SNR is expressed in decibels (dB), and the formula is:
\[ SNR = 10\log_{10}\frac{\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)^2}{\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\left[f(x,y) - f'(x,y)\right]^2} \]
A higher SNR corresponds to a better quality of the reconstructed image.
• Peak Signal-to-Noise Ratio
Another measure used is the peak signal-to-noise ratio (PSNR). Here, the ‘peak signal’ is
the maximum grey level possible under the original resolution. If the resolution of the
original image is 8 bits, 255 is the maximum grey level; hence,
\[ PSNR = 10\log_{10}\frac{255^2}{MSE} \]
where PSNR is expressed in decibels (dB).
A higher PSNR corresponds to a better quality of the reconstructed image.
In this dissertation, I used SNR to measure whether the reconstructed image is good or
not.
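The three error measures can be sketched in Python for images stored as lists of rows. This is a hedged illustration, not the project's Matlab code: the SNR is assumed to be the ratio of the original image's energy to the error energy, the PSNR assumes a peak grey level of 255, and the small 2 x 2 arrays are hypothetical.

```python
import math


def mse(original, reconstructed):
    """Mean square error between two equally sized grey-level images."""
    rows, cols = len(original), len(original[0])
    total = sum((original[x][y] - reconstructed[x][y]) ** 2
                for x in range(rows) for y in range(cols))
    return total / (rows * cols)


def snr_db(original, reconstructed):
    """SNR in dB: energy of the original over energy of the error (assumed form)."""
    signal = sum(p * p for row in original for p in row)
    noise = sum((o - r) ** 2
                for row_o, row_r in zip(original, reconstructed)
                for o, r in zip(row_o, row_r))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


def psnr_db(original, reconstructed, peak=255):
    """PSNR in dB for a given peak grey level (255 for 8-bit images)."""
    error = mse(original, reconstructed)
    return math.inf if error == 0 else 10 * math.log10(peak ** 2 / error)


# Hypothetical 2 x 2 example: two of the four pixels differ by one grey level.
f = [[156, 157], [160, 159]]
f_rec = [[156, 156], [159, 159]]
print(mse(f, f_rec), snr_db(f, f_rec), psnr_db(f, f_rec))
```

Identical images give an MSE of zero, for which both ratios are returned as infinity rather than raising a division error.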
1.7 Outline of Dissertation
This dissertation is divided into 8 parts.
Chapter 1: Background study. This chapter discusses the development of compression
technology, the objectives and significance of this project, and some measures used to
define the quality of image compression.
Chapter 2: Block Truncation Coding. It contains the principle of the BTC algorithm, the
result analysis and a conclusion.
Chapter 3: Hopfield Neural Network based on Block Truncation Coding. This chapter
presents the history and principle of the Hopfield Neural Network, analyses the results
and gives a conclusion.
Chapter 4: Comparisons. This chapter focuses on the differences between BTC and HNNBTC.
Chapter 5: Color Image Compression. Color images have many different
models, such as RGB and YUV. This chapter uses BTC and HNNBTC to compress color
images in both the RGB model and the YUV model.
Chapter 6: Variable Block Size. Previously the image was cut into blocks of a fixed size
(4*4, 8*8 or 16*16), and sometimes the reconstructed image could not reach a good quality.
In this situation, the image can be cut into blocks of variable size. If an area contains
a lot of information, a smaller block size (4*4) is used. If the area contains unimportant
information, it is represented by a larger block size (8*8).
Chapter 7: Error control. There is no doubt that some errors will occur during
transmission.
Chapter 8: Conclusion and Future Work. The main idea is to point out some improvements.
1.8 Test Images
The images listed here are used for testing in my programs [11].
Figure 1.8 Cameraman Figure 1.9 Pirate Figure 1.10 Woman_darkhair
Figure 1.11 Lena_color Figure 1.12 Mandril_color
Chapter 2 Block Truncation Coding
2.1 Introduction
Among the various kinds of lossy compression, Block Truncation Coding (BTC) is a simple
image coding technique [12]. It has the following advantages:
(1) It is easy to implement compared with other block-based compression methods
such as transform coding [13] and vector quantization [14];
(2) High quantization;
(3) Good image quality after reconstruction;
(4) Fast computing speed, 5 times faster than DCT coding;
(5) Relatively high compression ratios.
In its original form, BTC was designed in such a manner that the reconstructed block
preserved the first and second moments of the original block [3]. In recent years, several
efforts have been made to improve the coding efficiency of the basic BTC technique.
Arce and Gallagher, Jr. [15] showed that the truncated block is well approximated by
wide-sense Markov statistics and, at the same time, improved coding efficiency by using
median filter roots. In order to get higher compression ratios, a modified BTC technique
that combined BTC with vector quantization was put forward by Udpikar and Raina [16].
Healy and Mitchell [17] used an inter-frame system to reduce the bit rate.
Compression ratios in the range of 5:1 to 6:1 have been reported [15][16][17].
Many different BTC methods have been defined using distinct kinds of quantization
and error criteria, such as full-band Absolute Moment Block Truncation Coding
(AMBTC) and Sub-band Absolute Moment Block Truncation Coding (SAMBTC). Even
though they cannot reach the highest compression ratios, their simple implementation,
speed and efficiency have attracted attention [18] [19].
2.2 Basic BTC Algorithm
Block Truncation Coding was first developed by Delp and Mitchell [12] at Purdue
University [20]. The technique uses a one-bit (two-level) nonparametric quantizer that
adapts to local regions of the image, and it preserves the local statistical
characteristics of the image well [20].
In this image compression scheme, the image is first divided into small non-overlapping
n x n blocks, which are coded individually.
Let M = n x n, and let x(1), x(2), ..., x(M) be the pixel values of a block of the
original image. The first two sample moments and the sample variance are, respectively,
\[ \bar{x} = \frac{1}{M}\sum_{i=1}^{M} x(i) \]
\[ \overline{x^2} = \frac{1}{M}\sum_{i=1}^{M} x(i)^2 \]
\[ \sigma^2 = \overline{x^2} - \bar{x}^2 \]
where x(i), i = 1, 2, ..., M, are the pixel values of the block.
Next, design the one-bit quantizer with a threshold \(x_{th}\) and two reconstruction levels a and b (the outputs):
If \(x(i) \ge x_{th}\), output = b
If \(x(i) < x_{th}\), output = a
Normally, \(\bar{x}\) is set as the threshold value \(x_{th}\); then
\[ M\bar{x} = (M - q)\,a + q\,b \]
\[ M\overline{x^2} = (M - q)\,a^2 + q\,b^2 \]
where M is n x n and q is the number of pixels that are greater than or equal to the threshold value.
Solving the equations above gives the outputs a and b:
\[ a = \bar{x} - \sigma\sqrt{\frac{q}{M - q}} \]
\[ b = \bar{x} + \sigma\sqrt{\frac{M - q}{q}} \]
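A short Python sketch of these moment-preserving levels is given below. It assumes the block is supplied as a flat list of pixel values and that the block mean is used as the threshold; the function name and the sample values are illustrative, not taken from the project's Matlab programs.

```python
import math


def btc_levels(block):
    """Return (a, b, q) for one block, using the block mean as threshold.

    a and b are chosen so that the two-level block keeps the mean and
    the second moment of the original block.
    """
    m = len(block)
    mean = sum(block) / m
    mean_sq = sum(p * p for p in block) / m
    sigma = math.sqrt(max(mean_sq - mean * mean, 0.0))
    q = sum(1 for p in block if p >= mean)
    if q == m:  # uniform block: every pixel equals the mean
        return mean, mean, q
    a = mean - sigma * math.sqrt(q / (m - q))
    b = mean + sigma * math.sqrt((m - q) / q)
    return a, b, q


# Hypothetical pixel values of a small block.
block = [10, 12, 200, 30, 45, 45, 90, 100]
a, b, q = btc_levels(block)
m = len(block)
print(a, b, q)
# The reconstructed block keeps the first two sample moments.
print(((m - q) * a + q * b) / m, sum(block) / m)
```

Since at least one pixel is always greater than or equal to the mean, q is never zero; the only special case is a uniform block, where both levels collapse to the mean.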
However, in 1985, Udpikar and Raina [21] made an improvement to the BTC technique.
Only the first-order statistical information is preserved: a is the mean of the pixels
that are less than the threshold \(x_{th}\), and b is the mean of the pixels that are
greater than or equal to the threshold. The new output levels are defined as
\[ a = \frac{1}{M - q}\sum_{x(i) < x_{th}} x(i) \]
\[ b = \frac{1}{q}\sum_{x(i) \ge x_{th}} x(i) \]
where M is n x n and q is the number of pixels that are greater than or equal to the threshold value [20].
The threshold of the quantizer and the two reconstruction levels change as the statistical
character of a block changes. In other words, encoding is a process aimed at a local region.
Furthermore, after quantization, each block is represented by an n x n mapping matrix.
This matrix consists of the pixel classifications (the bitmap), together with a
representative intensity for each class.
Finally, the receiver reconstructs the image block by calculating ‘a’ and ‘b’ and then
placing these values in accordance with the code in the bitmap [22] [23] [24] [25].
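The encoding and decoding of one block with these class-mean levels can be sketched in Python as follows. The function names are hypothetical and the block is assumed to be a flat list of pixel values with the block mean as threshold; this is an illustration rather than the project's Matlab code.

```python
def ambtc_encode(block):
    """Encode one block as (bitmap, a, b) using the block mean as threshold.

    bitmap[i] is 1 where the pixel is >= the threshold, otherwise 0.
    a is the mean of the '0' pixels and b the mean of the '1' pixels.
    """
    m = len(block)
    threshold = sum(block) / m
    bitmap = [1 if p >= threshold else 0 for p in block]
    q = sum(bitmap)
    high = [p for p, bit in zip(block, bitmap) if bit]
    low = [p for p, bit in zip(block, bitmap) if not bit]
    b = sum(high) / q
    a = sum(low) / (m - q) if low else b  # uniform block: a equals b
    return bitmap, a, b


def ambtc_decode(bitmap, a, b):
    """Rebuild the block: b where the bitmap holds 1, a where it holds 0."""
    return [b if bit else a for bit in bitmap]


# Hypothetical pixel values of a 2 x 2 block, listed row by row.
bitmap, a, b = ambtc_encode([2, 4, 8, 10])
print(bitmap, a, b)
print(ambtc_decode(bitmap, a, b))
```

Only the bitmap and the two levels need to be sent, so the decoder never sees the original pixel values, only the two class means placed according to the bitmap.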
2.3 Process
Step 1: Open Image
A grey-level image is a two-dimensional array and a color image is a three-dimensional array, so when an image is read the program can judge whether it is a grey-level or a color image. If the ‘cameraman.tif’ image is read, the program will show:
Step 2: Determine the block size.
This is a new idea in my program. A function called ‘range’ is defined. Before encoding, the system asks which block size is wanted. This makes the implementation automatic, which means that three programs can be represented by one function. Here, a 4*4 block is chosen.
Step 3: Calculate the average of each block.
The first 4*4 block of the ‘cameraman.tif’ image is the following:
X =
156 157 160 159
156 157 159 158
158 157 156 156
160 157 154 154
Now, set the average of this matrix as the threshold value.
\(\bar{x}\) = (156+157+160+159+156+157+159+158+158+157+156+156+160+157+154+154)/16 = 157.125
Step 4: Build a bitmap
‘1’ replaces the values that are greater than or equal to \(\bar{x}\); otherwise ‘0’ is used. Hence, the bitmap of this 4*4 block becomes:
bitmap =
0 0 1 1
0 0 1 1
1 0 0 0
1 0 0 0
Step 5: Work out the two reconstruction levels.
Work out the two reconstruction levels; in simpler words, calculate the average value of
the pixels marked ‘1’ and of the pixels marked ‘0’. Here a = 156 and b = 159. These values
are sent to the receiver together with the bitmap. The reconstructed block becomes
X’ =
156 156 159 159
156 156 159 159
159 156 156 156
159 156 156 156
Clearly, a lot of data has been compressed. The original block takes 16 x 8
bits to represent. With BTC, however, it only needs 16 bits (the bitmap) and two 8-bit
reconstruction levels (for ‘1’ and ‘0’).
The compression ratio is
\[ \frac{16 \times 8}{16 + 8 + 8} = \frac{128}{32} = 4:1 \]
At this point, the data rate changes from 8 bits per pixel to 2 bits per pixel.
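The same counting argument can be repeated for the other block sizes. The Python sketch below assumes, as in the 4*4 case, one bitmap bit per pixel plus two 8-bit reconstruction levels per block; the function name is illustrative.

```python
def btc_compression_ratio(n, bits_per_pixel=8, level_bits=8):
    """Theoretical BTC compression ratio for an n x n block."""
    original_bits = n * n * bits_per_pixel
    coded_bits = n * n + 2 * level_bits  # bitmap plus the two levels
    return original_bits / coded_bits


for n in (4, 8, 16):
    ratio = btc_compression_ratio(n)
    print(f"{n}*{n}: {ratio:.4f}:1, {8 / ratio:.4f} bits per pixel")
```

The ratio grows with the block size because the fixed cost of the two levels is spread over more pixels, but it can never exceed 8:1 while one bitmap bit per pixel is kept.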
For example, open “cameraman.tif”.
Figure 2.1 Original Image Figure 2.2 Bitmap
On the left is the original image. The image is then cut into 4*4 blocks; on the right is
the bitmap of the original image, which is made up of ‘1’s and ‘0’s.
After that, the bitmap is used to decode the image in order to get the reconstructed image.
The two images seem to be the same, because the images are too large to show at full
size and have been reduced here.
However, after zooming in, it is easy to find many differences between the
original and reconstructed images.
In particular, where two blocks are adjacent, the differences become more visible.
Meanwhile, edges change abruptly and no longer look smooth. In the area around the
cameraman’s head and the sky, the individual blocks are clearly visible. These contouring
artifacts are a disadvantage of BTC. I will discuss this further later.
Figure 2.5 Many Blocks
Figure 2.3 Reconstructed Image Figure 2.4 Difference between Original and Reconstructed Image
Also, if the compressed image is subtracted from the original image, the following figure is obtained.
Figure 2.6 Errors
Figure 2.6 is obtained by subtracting the reconstructed image (Figure 2.3) from the
original image (Figure 2.1). Pixels that are the same in both images become black (‘0’),
which is to say that the darker the area, the smaller the errors. Because the coat and
the cameraman’s head are black, these areas change little. The figure shows that most
errors occur between the cameraman and the background, which means that the edges are
where the pixel values fluctuate most.
2.4 Program Flowchart
The principle of the program is shown graphically in the following figure.
Figure 2.7 Programming Flowchart
2.5 Result Analysis
2.5.1 Cameraman
Figure 2.8 Original Image Figure 2.9 Histogram of Original
Figure 2.10 4*4 Block Size Figure 2.11 Histogram of 4*4 Block Size
Figure 2.12 8*8 Block Size Figure 2.13 Histogram of 8*8 Block Size
Figure 2.14 16*16 Block Size Figure 2.15 Histogram of 16*16 Block Size
Conclusion:
(1) In the four figures above, some parts of the image become blurred as the block size
increases; for example, the cameraman’s hair changes from smooth to frizzy and the
contour line of the face loses resolution. To sum up, the smaller the block size, the
better the image quality.
(2) In the histograms, the x axis is the pixel value, which ranges from 0 to 255 for
the unsigned 8-bit integer format, and the other axis is the number of pixels at each
grey level. The figures show that the pixel value distribution changes as the
block size changes. The distribution for the 16*16 block size is more uneven than that
for the 4*4 block size.
Figure 2.16 Original Image Figure 2.17 4*4 Block Size
Figure 2.18 8*8 Block Size Figure 2.19 16*16 Block Size
Conclusion:
(1) When the four images are zoomed in by the same factor, the blocks can be seen to
become larger and larger. The image becomes blurred because some information is discarded
by the BTC technique. The larger the block, the more information is lost, and this kind
of loss cannot be restored.
In brief, the following table summarizes the results for the ‘cameraman’ image.
Image     Correlation  Size (KB)  Compression ratio  Compression ratio  MSE       SNR (dB)
                                  (theoretical)      (actual)
Original  100%         257        1                  1                  -         -
4*4       99.47%       65         4:1                3.9538:1           40.8640   26.4010
8*8       98.88%       41         6.4:1              6.2682:1           86.1284   23.1629
16*16     98.13%       35         7.5294:1           7.3428:1           142.5833  20.9737
Table 2.1 Parameter for ‘Cameraman’
a. MSE
\[ MSE = \frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\left[f(x,y) - f'(x,y)\right]^2 \]
Where: f(x,y) is the original image, f’(x,y) is the reconstructed image
By this formula, the more data are discarded, the further f’(x,y) departs from f(x,y)
and the larger the MSE becomes. It is a standard for evaluating the compression: it
measures the information lost between corresponding pixels in the two images. The higher
the MSE, the worse the reconstruction in a quantitative sense.
The MSE rises when the block size rises, and the quality of the reconstructed image
decreases. This agrees with the description of the figures above.
b. SNR
In most cases, the lower the MSE, the higher the SNR, and a higher SNR corresponds to a
better quality of the reconstructed image. Of the three values 26.4010, 23.1629 and
20.9737, the SNR of the 4*4 block size is the highest, which means that 4*4 gives the
best quality in theory. In fact, the 4*4 block size image also has the best quality to
the human eye.
2.5.2 Pirate
Figure 2.20 Original Figure 2.21 Intensity Profile of Original
Note: The figure on the right shows the intensity profile along the red line in the original image on the left.
Figure 2.22 4*4 Block Size Figure 2.23 Intensity Profile of 4*4 Block Size
Figure 2.24 8*8 Block Size Figure 2.25 Intensity Profile of 8*8 Block Size
Figure 2.26 16*16 Block Size Figure 2.27 Intensity Profile of 16*16 Block Size
Conclusion:
Here, I used the Matlab function ‘improfile’. It computes the intensity values along a
line or a multiline path in an image and plots them.
In these figures the peak values are nearly the same; however, the valleys become lower
and lower as the block size changes. More importantly, the differences between
neighboring pixel values become smaller, because a larger block contains more pixels.
This is why the profiles on the right show the same light and dark levels for some
neighboring pixels: the value stays equal over a run of several pixels. An
advantage of these reconstruction levels is that the bias components of neighboring
blocks are strongly correlated [26].
Also, because the image is cut into blocks and more pixel values are replaced by the
class averages (class ‘0’ and class ‘1’), the number of valleys becomes smaller and
smaller and the compression ratio increases.
The larger the block used, the more block contours appear in the reconstructed
image.
Figure 2.28 Original Figure 2.29 4*4 Block Size
Figure 2.30 8*8 Block Size Figure 2.31 16*16 Block Size
The original image is so clear that the eyeballs can be seen looking sideways, as if the
pirate were searching for his prey. In the 4*4 image the eyeballs are still there, but not
as plain as in the original. However, when the block size reaches 8*8, the pirate’s
eyeballs can no longer be seen, and the man seems blind. The same is true of the 16*16
image. This confirms what I said before: the larger the block, the worse the quality of
the reconstructed image.
The following table sums up the results for the ‘pirate’ image.
Image     Correlation  Size (KB)  Compression ratio  Compression ratio  MSE       SNR
                                  (theoretical)      (actual)
Original  100%         257        1                  1
4*4       99.10%       65         4:1                3.9538:1           40.7506   25.5814
8*8       98.20%       41         6.4:1              6.2682:1           80.9846   22.5988
16*16     96.93%       35         7.5294:1           7.3428:1           137.2873  20.3065
Table 2.2 Parameters of ‘Pirate’
2.5.3 Woman_darkhair
Figure 2.32 Original
Figure 2.33 4*4 Block Size Figure 2.34 Errors
Figure 2.35 8*8 Block Size Figure 2.36 Errors
Figure 2.37 16*16 Block Size Figure 2.38 Errors
Conclusion:
On the left side are the compressed images; on the right side are the errors between the
original and the reconstructed image, obtained by subtracting each reconstructed image
(Figure 2.33, Figure 2.35, Figure 2.37) from the original image (Figure 2.32). If both
images have the same pixel value, the result is ‘0’ (black). The larger the difference
between the pixel values, the brighter the error pixel. The last image clearly shows the
most errors: the more information is discarded, the worse the quality.
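An error image of this kind can be sketched as below. The function name is hypothetical, and the absolute difference is an assumption of this sketch so that errors of either sign show up as bright pixels.

```python
def error_image(original, reconstructed):
    """Pixel-wise absolute difference: 0 (black) where the two images agree."""
    return [[abs(o - r) for o, r in zip(row_o, row_r)]
            for row_o, row_r in zip(original, reconstructed)]

original = [[100, 120], [140, 160]]
reconstructed = [[100, 130], [130, 160]]
print(error_image(original, reconstructed))  # equal pixels give 0, others their difference
```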
At the same time, I made a table to compare the reconstructed images.
Image     Correlation  Size (KB)  Compression ratio  Compression ratio  MSE      SNR
                                  (theoretical)      (actual)
Original  100%         257        1                  1
4*4       99.86%       65         4:1                3.9538:1           10.7125  31.5849
8*8       99.63%       41         6.4:1              6.2682:1           28.6020  27.3199
16*16     99.07%       35         7.5294:1           7.3428:1           70.4786  23.4033
Table 2.3 Parameters of ‘Woman_darkhair’
2.6 Summary
Block truncation coding is a well-known lossy image compression technique. The first step
is to divide an image into a number of fixed-size blocks. The next step is to encode each
block, which involves calculating two representative values (the class averages) and
building a bitmap. The technique is known to preserve a high PSNR, but it achieves a low
compression ratio [26].
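The encoding and decoding of a single block can be sketched as follows. This is a minimal Python illustration with hypothetical function names; it uses the block average as the threshold, as described in this report, and treats the block as a flat list of grey levels.

```python
def btc_encode_block(block):
    """Encode one block (flat list of grey levels) as (low, high, bitmap)."""
    mean = sum(block) / len(block)
    bitmap = [1 if p >= mean else 0 for p in block]
    ones = [p for p, b in zip(block, bitmap) if b == 1]
    zeros = [p for p, b in zip(block, bitmap) if b == 0]
    high = sum(ones) / len(ones)  # class '1' always contains at least the largest pixel
    low = sum(zeros) / len(zeros) if zeros else high  # flat block: class '0' is empty
    return low, high, bitmap

def btc_decode_block(low, high, bitmap):
    """Rebuild the block from the two levels and the bitmap."""
    return [high if b else low for b in bitmap]

low, high, bitmap = btc_encode_block([10, 12, 200, 205])
print(low, high, bitmap)
print(btc_decode_block(low, high, bitmap))
```

Only the two levels and one bit per pixel are stored, which is where the compression comes from.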
Comparing three grey level images by different block sizes, histograms, intensity
profiles and errors between the original and reconstructed images, we find the following
points:
(1) The larger the block used, the worse the compression quality and the higher the
compression ratio.
(2) When the block size increases, the contouring artifacts become obvious.
(3) BTC is fast to compute.
Chapter 3 Hopfield Neural Networks
3.1 A Short Introduction to Neural Networks
A neural network is based on a model of the basic cells of the brain: neurons. Back in
the 1940s, Warren McCulloch and Walter Pitts developed such a model, referred to in
these notes as the McCulloch Pitts (MCP) neuron [27]. In 1949 Donald Hebb showed that
a neural network could exhibit the ability to learn (he proposed a learning law for
neural networks). In the 1950s and 1960s, researchers developed the first artificial
neural network circuits (the most common form of realization nowadays is computer
simulation). However, in the late 1960s, Marvin Minsky and Seymour Papert published a
book which highlighted the inability of such networks to carry out even simple tasks.
This led to disillusionment in the field, and research into neural networks lapsed into
obscurity. In 1982 John Hopfield recognized that the stable states of a group of neurons
could provide a good way of building memories. This analysis was based on a definition
of ‘energy’ in a network, and a proof that the network operates by minimizing this energy
and settling into stable states. In 1986, the error backpropagation learning algorithm,
which can be widely applied to train multi-layer networks, was proposed by Rumelhart,
Hinton and Williams [28] [29].
Since then, research on neural networks has expanded and matured, and neural networks
are now used in a wide variety of application areas, such as process control, financial
forecasting, image processing and speech processing.
3.2 Basic Algorithm
Neurons in the Hopfield [30] [31] [32] Network are highly and selectively
interconnected so as to give rise to collective computational properties and create
networks with good computational efficiency. The most important point about the
Hopfield Neural Network is that the collective computational properties emerge from the
existence of an energy function of the states of the neuron outputs, namely the
computational energy [33].
It should be noted that the number of neurons is equal to the number of pixels.
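The computational energy function of the Hopfield Network has the following form [5]:
$$E = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} T_{ij} V_i V_j - \sum_{i=1}^{n} I_i V_i$$
where $V_i$, $i = 1, 2, \dots, n$, are the outputs of the network;
$T_{ij}$ is the connection strength between neuron i and neuron j;
$I_i$, $i = 1, 2, \dots, n$, is the external input to neuron i.
Each neuron is updated according to
$$u_i = \sum_{j=1}^{n} T_{ij} V_j + I_i$$
where $u_i$ is the total input to neuron i.
The following formulas show the relationship between the energy, the connection strengths
and the external inputs:
$$T_{ij} = -\frac{\partial^2 E}{\partial V_i\,\partial V_j}, \qquad I_i = -\left.\frac{\partial E}{\partial V_i}\right|_{V=0}$$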
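As a small numerical illustration of how an asynchronous update lowers the computational energy, the sketch below assumes the standard Hopfield form E = -1/2 ΣΣ T_ij V_i V_j - Σ I_i V_i with binary outputs and a simple threshold update. The function names and the two-neuron weights are made up for the example.

```python
def energy(T, I, V):
    """Computational energy E = -1/2 * sum_ij T[i][j] V[i] V[j] - sum_i I[i] V[i]."""
    n = len(V)
    quadratic = sum(T[i][j] * V[i] * V[j] for i in range(n) for j in range(n))
    linear = sum(I[i] * V[i] for i in range(n))
    return -0.5 * quadratic - linear

def update_neuron(T, I, V, i):
    """Threshold update of neuron i from its total input (assumed update rule)."""
    u = sum(T[i][j] * V[j] for j in range(len(V))) + I[i]
    if u > 0:
        V[i] = 1
    elif u < 0:
        V[i] = 0
    return V  # the output is left unchanged when u == 0

T = [[0, -2], [-2, 0]]  # symmetric weights with zero diagonal (made-up values)
I = [1, 1]
V = [1, 1]
before = energy(T, I, V)
update_neuron(T, I, V, 0)
after = energy(T, I, V)
print(before, after, V)  # the energy does not increase after the update
```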
There is feedback in the Hopfield Neural Network: the network is updated asynchronously
at discrete random times until it reaches a stable state.
An energy function E is defined so that its lowest energy state corresponds to an
effective classification of the pixels in the network [30]:
$$E = \sum_{i=1}^{n}\sum_{j=1}^{n} F_i F_j\,[x(i)-x(j)]^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} (1-F_i)(1-F_j)\,[x(i)-x(j)]^2$$
where $x(i)$ is the value of pixel i and $F_i$ is the output of neuron i, that is, the
class (‘0’ or ‘1’) of pixel i.
From this formula, it is clear that the differences between the pixel values must be
small enough, or else the network will not stabilize [34].
The image is cut into blocks for coding, and a Hopfield network with one neuron per pixel
of the block (n neurons in the formulas here) is used for each block in turn. The synaptic
interconnection strengths and the external inputs on the neurons are given by [5]
$$T_{ij} = -4\,[x(i)-x(j)]^2, \qquad I_i = 2\sum_{j=1}^{n}[x(i)-x(j)]^2$$
The network is iterated until the stable state is reached, which means that the values of
$F_i$ do not change any more.
Hence, we can get the quantiser output levels of class ‘0’ and class ‘1’, each level being
the average of the pixels in its class:
$$\bar{x}_0 = \frac{1}{n-q}\sum_{F_i=0} x(i), \qquad \bar{x}_1 = \frac{1}{q}\sum_{F_i=1} x(i)$$
where q represents the number of pixels in class ‘1’.
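The whole procedure for one block can be sketched as below. This is a hedged Python illustration, not the Matlab program of this project. It assumes connection strengths T_ij = -4[x(i)-x(j)]^2 and inputs I_i = 2 Σ_j [x(i)-x(j)]^2, starts from the ordinary BTC bitmap, and sweeps the neurons in a fixed order rather than at random times so that the result is reproducible. All names are hypothetical.

```python
def hnn_btc_block(x, max_sweeps=100):
    """Classify the pixels of one block (flat list x) and return (low, high, F)."""
    n = len(x)
    d = [[(x[i] - x[j]) ** 2 for j in range(n)] for i in range(n)]
    T = [[-4 * d[i][j] for j in range(n)] for i in range(n)]
    I = [2 * sum(d[i]) for i in range(n)]
    mean = sum(x) / n
    F = [1 if p >= mean else 0 for p in x]  # assumed starting state: the BTC bitmap
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):  # asynchronous: one neuron at a time
            u = sum(T[i][j] * F[j] for j in range(n)) + I[i]
            new = 1 if u > 0 else (0 if u < 0 else F[i])
            if new != F[i]:
                F[i] = new
                changed = True
        if not changed:  # stable state: a full sweep changed nothing
            break
    ones = [p for p, f in zip(x, F) if f == 1]
    zeros = [p for p, f in zip(x, F) if f == 0]
    high = sum(ones) / len(ones) if ones else sum(zeros) / len(zeros)
    low = sum(zeros) / len(zeros) if zeros else high
    return low, high, F

print(hnn_btc_block([10, 12, 200, 205]))
```

Because the energy is unchanged when every F_i is swapped with 1 - F_i, which class is labelled ‘1’ depends on the starting state; the reconstruction is the same either way.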
3.3 Program Flowchart
Figure 3.1 Flowchart
3.4 Result Analyse
3.4.1 Cameraman
Figure 3.2 Original Figure 3.3 Histogram of Original
Figure 3.4 4*4 Block Size Figure 3.5 Histogram of 4*4 Block Size
Figure 3.6 8*8 Block Size Figure 3.7 Histogram of 8*8 Block Size
Figure 3.8 16*16 Block Size Figure 3.9 Histogram of 16*16 Block Size
Figure 3.10 Original Figure 3.11 4*4 Block Size
Figure 3.12 8*8 Block Size Figure 3.13 16*16 Block Size
Image     Correlation  Size (KB)  Compression ratio  Compression ratio  MSE       SNR
                                  (theoretical)      (actual)
Original  100%         257        1                  1
4*4       99.86%       65         4:1                3.9538:1           38.5849   26.6502
8*8       98.85%       41         6.4:1              6.2682:1           85.9758   24.0707
16*16     97.75%       35         7.5294:1           7.3428:1           138.3457  21.1757
Table 3.1 Parameters of ‘Cameraman’
Conclusion:
From Table 3.1, it is easy to see that the actual compression ratio is quite close to the
theoretical one; however, HNNBTC requires more computation than BTC. Normally, it takes 5
minutes for the 4*4 block size and 30 minutes for the 8*8 block size. With the 16*16 block
size, the time exceeds 1 hour. Certainly, if we choose smaller images, such as 128*128,
the computation time will be less. Alternatively, we can resize the image to half its
size, which reduces the running time of the program.
3.4.2 Pirate
Figure 3.14 Original Figure 3.15 Intensity Profile of Original
Figure 3.16 4*4 Block Size Figure 3.17 Intensity Profile of 4*4 Block Size
Figure 3.18 8*8 Block Size Figure 3.19 Intensity Profile of 8*8 Block Size
Figure 3.20 16*16 Block Size Figure 3.21 Intensity Profile of 16*16 Block Size
Figure 3.22 Original Figure 3.23 4*4 Block Size
Figure 3.24 8*8 Block Size Figure 3.25 16*16 Block Size
Image     Correlation  Size (KB)  Compression ratio  Compression ratio  MSE       SNR
                                  (theoretical)      (actual)
Original  100%         257        1                  1
4*4       99.18%       65         4:1                3.9538:1           37.1992   25.9774
8*8       98.22%       41         6.4:1              6.2682:1           80.9846   22.8592
16*16     96.58%       35         7.5294:1           7.3428:1           137.2873  20.5148
Table 3.2 Parameters of ‘Pirate’
Conclusion:
Table 3.2 shows that the MSE increases and the SNR falls as the block size becomes
larger. Figure 3.22 to Figure 3.25 zoom in by the same factor on the same part of the
respective images: the contouring artifacts become more visible, and the contours of the
blocks become more obvious.
3.4.3 Woman_darkhair
Figure 3.26 4*4 Block Size Figure 3.27 Errors
Figure 3.28 8*8 Block Size Figure 3.29 Errors
Figure 3.30 16*16 Block Size Figure 3.31 Errors
Image     Correlation  Size (KB)  Compression ratio  Compression ratio  MSE      SNR
                                  (theoretical)      (actual)
Original  100%         257        1                  1                  N/A      N/A
4*4       99.87%       65         4:1                3.9538:1           10.1232  31.8307
8*8       99.63%       41         6.4:1              6.2682:1           28.1911  27.4399
16*16     98.90%       35         7.5294:1           7.3428:1           83.1571  22.6829
Table 3.3 Parameters of ‘Woman_darkhair’
3.5 Summary
As the block size increases, the correlation between the original and reconstructed images
decreases, the compression ratio and the MSE get higher, the SNR falls sharply, and the
quality becomes worse.
Computation takes more and more time as the block size gets bigger; for example, the
16*16 block size takes more than 1 hour.
Chapter 4 Comparisons
4.1 Algorithm Comparison
Most compression steps are the same in BTC and HNNBTC. The difference between BTC and
HNNBTC is how the threshold is found. BTC uses a threshold value to separate the pixel
values in each block: if a value is equal to or bigger than the threshold, the bit is set
to ‘1’, otherwise it is set to ‘0’. Normally the threshold is the average value of the
block. In HNNBTC, by contrast, the threshold comes from the stable state of the network,
and the stable state depends on the energy of the image.
Meanwhile, HNNBTC is quite slow at compressing an image. In my program, I did not think I
could use the toolbox in Matlab, and I did not know how to detect the stable state of the
Hopfield Neural Network automatically. Instead, I set a maximum number of iterations,
because once the network has reached its stable state, continuing the iteration does not
change the state. This is why my HNNBTC program costs so much time. The bigger the block
size, the longer it takes to reach the stable state; for example, the 8*8 block size takes
at least 15 minutes to encode.
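One possible way to detect the stable state automatically is to stop as soon as a full pass over the neurons changes nothing, while keeping the maximum number of iterations as a safety limit. The sketch below is a hypothetical Python illustration of that idea, not the Matlab program of this project; the small two-neuron network is made up for the example.

```python
def run_until_stable(state, sweep, max_sweeps=1000):
    """Apply `sweep` (one full asynchronous pass) until the state stops changing.

    Returns (final_state, sweeps_used, reached_stable_state).
    """
    for count in range(1, max_sweeps + 1):
        new_state = sweep(list(state))
        if new_state == state:
            return state, count, True  # a full pass changed nothing: stable
        state = new_state
    return state, max_sweeps, False    # gave up at the iteration limit

# Made-up two-neuron network used only to exercise the loop.
T = [[0, -2], [-2, 0]]
I = [1, 1]

def sweep(V):
    for i in range(len(V)):
        u = sum(T[i][j] * V[j] for j in range(len(V))) + I[i]
        if u > 0:
            V[i] = 1
        elif u < 0:
            V[i] = 0
    return V

print(run_until_stable([1, 1], sweep))
```

With this check, the number of passes adapts to each block instead of always running to the fixed maximum.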
4.2 Image Result Comparisons
Figure 4.1 to Figure 4.6 show reconstructed images produced by the different compression
techniques. The first group uses BTC, and the second group uses HNNBTC. The original
images are the same for BTC and HNNBTC. Within each group, the order from left to right
is the 4*4, 8*8 and 16*16 block size.
Figure 4.1 Group 1 Cameraman (BTC)
Figure 4.2 Group 2 Cameraman (HNNBTC)
Figure 4.3 Group 1 Pirate (BTC)
Figure 4.4 Group 2 Pirate (HNNBTC)
Figure 4.5 Woman_darkhair (BTC)
Figure 4.6 Woman_darkhair (HNNBTC)
Now, compare the two compression methods at the same 4*4 block size.
Image           Compression ratio (actual)  Correlation (%)  MSE               SNR (dB)
                BTC       HNNBTC            BTC    HNNBTC    BTC      HNNBTC   BTC      HNNBTC
Cameraman       3.9538:1  3.9538:1          99.47  99.86     40.8640  38.5849  26.4010  26.6502
Pirate          3.9538:1  3.9538:1          99.10  99.18     40.7506  37.1992  25.5814  25.9774
Woman_darkhair  3.9538:1  3.9538:1          99.86  99.87     10.7125  10.1232  31.5849  31.8307
Table 4.1 Compare BTC and HNNBTC
Table 4.1 illustrates that BTC and HNNBTC have the same compression ratio, both in theory
and in the actual values.
Table 4.2 MSE and SNR
Table 4.2 illustrates that for ‘cameraman’, ‘pirate’ and ‘woman_darkhair’ alike, the MSE
of BTC is always higher than that of HNNBTC, while for SNR it is the opposite: HNNBTC is
greater than BTC. That is to say, HNNBTC consistently gives better results than BTC.
4.3 Summary
No matter whether BTC or HNNBTC is used, the larger the block size chosen, the fewer
blocks are produced and the lower the resolution of the reconstructed image. This holds
especially for the Hopfield Neural Network, whose threshold is based on the stable state:
computing the stable state takes up most of the operation time in the encoding part.
The Hopfield Neural Network gives a better image quality; it is an optimization of Block
Truncation Coding. However, it still causes contouring artifacts, because each block has
only two reconstruction levels, so a lot of data is discarded during compression. As a
result, the block contours become visible in the reconstructed image after decoding.
Both methods share another disadvantage: the bit rate is high. For example, when the
block size is 4*4, the bit rate = (8+8+16)/16 = 2 bpp. The higher the bit rate, the more
expensive the transmission. How to reduce the bit rate is discussed in Chapter 6.
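The bit rate arithmetic generalizes to the other block sizes. The short Python sketch below (with a hypothetical function name) assumes two 8-bit reconstruction levels plus one bitmap bit per pixel for every block, and 8 bits per pixel in the original image.

```python
def btc_bit_rate(n, level_bits=8, pixel_bits=8):
    """Return (bits per pixel, theoretical compression ratio) for n*n blocks."""
    pixels = n * n
    bits_per_block = 2 * level_bits + pixels  # two levels + one bitmap bit per pixel
    bpp = bits_per_block / pixels
    ratio = pixel_bits * pixels / bits_per_block
    return bpp, ratio

for n in (4, 8, 16):
    bpp, ratio = btc_bit_rate(n)
    print(n, bpp, round(ratio, 4))
```

For 4*4, 8*8 and 16*16 blocks this reproduces the theoretical ratios 4:1, 6.4:1 and 7.5294:1 listed in the tables.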
Chapter 5 Color Image Compression
5.1 Color Image Compression Based on RGB Color Space
5.1.1 Introduction
As we know, each pixel in a color image has three spectral components: ‘R’, ‘G’ and ‘B’.
Each component takes one byte of storage; in other words, a color pixel needs 3 bytes.
So there are 2^24 = 16,777,216 different pixel colors.
Red, green and blue are called primary colors, and they can be added to produce the
secondary colors. For example, red plus blue produces magenta, green plus blue produces
cyan, and red plus green produces yellow [35]. These results are shown below.
Figure 5.1 RGB
Meanwhile, the RGB model is based on a Cartesian coordinate system. Figure 5.2 shows
the color subspace of interest. The primary colors are at three corners, and the secondary
colors are at three other corners.
The RGB model is used for display on screens.
Figure 5.2 RGB Cube Figure 5.3 RGB 24‐bit Color Cube
5.1.2 Flowchart
Figure 5.4 Flowchart
Open the ‘Lena_512_Color’, and cut the image into 4*4 blocks.
Figure 5.5 Lena
Encoding, and get the bitmap.
Figure 5.6 Bitmap __ R G B
Decoding
Figure 5.7 Decoding __ R G B
Reconstructed Image
Combine the three matrices ‘R’, ‘G’ and ‘B’.
Figure 5.8 Reconstructed Figure 5.9 Differences Figure 5.5 Original
Figure 5.9 shows the differences between the original and reconstructed image.
5.1.3 Result Analyse
The following images are divided into two groups. The first group used Block Truncation
Coding to compress the image called ‘lena_color’ (Figure 1.11) with different block sizes.
The second group used the Hopfield Neural Network based on Block Truncation Coding.
First Group using BTC:
Figure 5.10 4*4 Block Size Figure 5.11 8*8 Block Size Figure 5.12 16*16 Block Size
Compression Ratio:
The original image is 770KB. After compression, Figure 5.10 is 193KB, Figure 5.11 is
121KB and Figure 5.12 is 103KB. The compression ratios are therefore:
770 / 193 = 3.9896 for the 4*4 block size,
770 / 121 = 6.3636 for the 8*8 block size,
770 / 103 = 7.4757 for the 16*16 block size.
Second Group using HNNBTC
Figure 5.13 4*4 Block Size Figure 5.14 8*8 Block Size Figure 5.15 16*16 Block Size
Figure 5.16 BTC
Figure 5.17 HNNBTC
Image ‘lena’  Compression ratio (actual)  MSE                 SNR
              BTC       HNNBTC            BTC       HNNBTC    BTC      HNNBTC
4*4           3.9896:1  3.9896:1          31.7350   29.2244   67.9618  69.9618
8*8           6.3636:1  6.3636:1          64.5222   62.7686   58.7167  59.6671
16*16         7.4757:1  7.4757:1          116.6808  113.9560  50.9980  51.6106
Table 5.1 Compare BTC & HNNBTC in RGB Color Space
From the data in Table 5.1, for the 4*4, 8*8 and 16*16 block sizes alike, the MSE of
HNNBTC is always lower than that of BTC, and the SNR of HNNBTC is greater than that of
BTC. Again, HNNBTC gives a better reconstructed image quality than BTC when compressing
a color image in RGB color space.
5.2 Color Image Compression Based on YUV Color Space
5.2.1 YUV Color Space
The YUV color space is commonly used in the transmission of television signals [36]. ‘Y’
represents the black-and-white (luminance) component; ‘U’ and ‘V’ are the color
difference signals. RGB is translated into YUV by the formulas below:
Y = 0.299R + 0.587G + 0.114B
U = -0.147R - 0.289G + 0.436B = 0.492(B - Y)
V = 0.615R - 0.515G - 0.100B = 0.877(R - Y)
To change the YUV color space back into the RGB color space:
R = Y + 1.140V
G = Y - 0.395U - 0.581V
B = Y + 2.032U
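The conversion can be sketched in Python as follows. The forward formulas are the ones given above; the inverse is obtained here by undoing the scale factors 0.492 and 0.877, which is an assumption of this sketch, so that converting there and back returns the original pixel. The function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Forward transform: luminance Y and scaled color differences U, V."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Inverse transform, derived by undoing the scale factors (assumption)."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

y, u, v = rgb_to_yuv(200, 100, 50)
print(y, u, v)
print(yuv_to_rgb(y, u, v))  # close to (200, 100, 50) up to rounding error
```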
5.2.2 Flowchart
Figure 5.18 Flowchart
Open the ‘Lena_512_Color’, and cut the image into 4*4 blocks, meanwhile, change RGB
Color Space into YUV Color Space.
Figure 5.19 Original Image Figure 5.20 YUV Color Image
As I mentioned before, screens generally display color images using RGB, and so does
Matlab. This is why the YUV color image on the right looks the way it does.
Encode three channels individually.
Figure 5.21 Bitmap_ Y U V
Decoding
Figure 5.22 Encoding_Y U V
‘Y’ represents the luminance; ‘U’ and ‘V’ are the color differences.
Combine three channels together, and change YUV back to RGB.
Figure 5.23 Reconstructed YUV Color Image Figure 5.24 Reconstructed Image
5.2.3 Result Analyse
The following images are divided into two groups. The first group used Block Truncation
Coding to compress the image called ‘lena_color’ (Figure 1.11) with different block sizes.
The second group used the Hopfield Neural Network based on Block Truncation Coding.
5.2.3.1 Lena_color
First Group using BTC:
Figure 5.25 4*4 Block Figure 5.26 8*8 Block Figure 5.27 16*16 Block
Second Group using HNNBTC:
Figure 5.28 4*4 Block Size Figure 5.29 8*8 Block Size Figure 5.30 16*16 Block Size
The first group using BTC:
Figure 5.31 4*4 Block Size Figure 5.32 8*8 Block Size Figure 5.33 16*16 Block Size
Second group using HNNBTC:
Figure 5.34 4*4 Block Size Figure 5.35 8*8 Block Size Figure 5.36 16*16 Block Size
Image ‘lena’  Compression ratio (actual)  MSE                 SNR
              BTC       HNNBTC            BTC       HNNBTC    BTC      HNNBTC
4*4           3.9896:1  3.9896:1          36.1131   32.9077   66.2780  67.4891
8*8           6.3636:1  6.3636:1          72.1813   71.5945   57.2552  58.8914
16*16         7.4757:1  7.4757:1          130.2377  127.7450  49.5659  50.0693
Table 5.2 Compare BTC & HNNBTC in YUV Color Space
Table 5.2 shows that HNNBTC again offers a better image quality than BTC, as seen by
comparing the MSE and SNR. More importantly, HNNBTC has the same compression ratio as BTC.
5.2.3.2 Madril_color
Group 1 using BTC
Figure 5.37 4*4 Block Size Figure 5.38 8*8 Block Size Figure 5.39 16*16 Block Size
Group 2 using HNNBTC
Figure 5.40 4*4 Block Size Figure 5.41 8*8 Block Size Figure 5.42 16*16 Block Size
The YUV model is widely used for digital video. In this color space, luminance (brightness
or intensity) information is stored as a single component (Y). Chrominance (color)
information is stored as two color difference components (U and V). The experimental
results show that HNNBTC performs better than BTC.
5.2.4 A Way to Achieve More Compression in YUV Color Space
Generally speaking, ‘U’ and ‘V’ are subsampled by a factor of two to four in the spatial
dimensions.
Because the human visual system is much less sensitive to ‘U’ and ‘V’ than to the
luminance ‘Y’, we can reduce the ‘U’ and ‘V’ channels to half size and keep the ‘Y’
channel unchanged, so that the compression ratio increases. The most important channel is
‘Y’, which represents the black-and-white (luminance) component; ‘U’ and ‘V’ are just the
color differences.
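Reducing a chrominance channel to half size can be sketched as below. The Python functions are hypothetical; averaging each 2*2 neighbourhood and restoring the size by pixel replication are assumptions of this sketch, and the channel is assumed to have even dimensions.

```python
def downsample2(channel):
    """Halve a channel in both dimensions by averaging each 2*2 neighbourhood."""
    h, w = len(channel), len(channel[0])
    return [[(channel[r][c] + channel[r][c + 1] +
              channel[r + 1][c] + channel[r + 1][c + 1]) / 4
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

def upsample2(channel):
    """Restore the original size by repeating every sample 2*2 times."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def relative_data(chroma_scale=0.5):
    """Fraction of samples kept when Y is full size and U, V are scaled per dimension."""
    return (1.0 + 2 * chroma_scale ** 2) / 3

print(downsample2([[1, 3], [5, 7]]))
print(upsample2([[4.0]]))
print(relative_data())  # Y + U/4 + V/4 out of three full channels
```

Halving ‘U’ and ‘V’ in both dimensions keeps half of the samples overall, so the compression ratio is expected to roughly double.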
The bitmap (encoding), decoding and reconstructed results are shown next. The original
images of all groups are cut into 4*4 blocks.
Bitmap
The first group uses BTC to compress the color image based on the YUV model, with the ‘U’
and ‘V’ channels reduced to half size and the ‘Y’ channel kept at the original size.
The second group uses BTC to compress the color image based on the YUV model, with all
channels kept at the original size.
The third group is based on the RGB model, with the three matrices ‘R’, ‘G’ and ‘B’
unchanged.
Group 1:
Figure 5.43 Bitmap_YUV’
Group 2:
Figure 5.44 Bitmap_Y U V
Group 3:
Figure 5.45 Bitmap_R G B
Decoding
Group 1:
Figure 5.46 Decoding_Y U V’
Group 2:
Figure 5.47 Decoding_Y U V
Group 3:
Figure 5.48 Decoding_R G B
Reconstructed
Figure 5.49 Group 1 Group 2 Group 3
Errors
Figure 5.50 Group 1 Group 2 Group 3
Obviously, the contour of ‘lena’ is clearer with the RGB model than with YUV, both in the
bitmap and in the decoded channels. The reconstructed image compressed with the RGB model
has the better quality. Comparing the three error images, it is easy to see that more
color data is discarded with the YUV model than with the RGB model, which lowers the
quality of the reconstructed image.
After compression, the size of the reconstructed image is 97KB for group 1, 193KB for
group 2 and 193KB for group 3. The original image is 770KB. It is obvious that when the
compression ratio increases, the quality becomes worse.
Model  Compression Ratio  MSE      SNR
YUV’   7.9381:1           45.7508  63.1960
YUV    3.9896:1           36.1131  66.2780
RGB    3.9896:1           31.7350  67.9618
Table 5.3 Comparison
Note: YUV’ means group 1, in which the ‘U’ and ‘V’ channels are reduced to half size.
Table 5.3 shows that the compression ratio becomes larger if the ‘U’ and ‘V’ channels are
sampled at only half size.
5.3 Color Image Compression Based on YIQ Color Space
There is another color space called YIQ, which is defined by the National Television
Systems Committee (NTSC) [36]. It is widely used in television in the United States,
and its color coordinates are derived from the YUV format. A main advantage of this format
is that grayscale information is separated from color data, so the same signal can
be used for both color and black-and-white sets.
In this space, a color pixel consists of three components. ‘Y’ represents the
luminance, ‘I’ stands for in-phase, and ‘Q’ stands for quadrature-phase. The ‘Y’ component
contains the grayscale information; the other two components make up the chrominance.
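The relationship between YIQ and RGB is given by the standard NTSC transform:
Y = 0.299R + 0.587G + 0.114B
I = 0.596R - 0.274G - 0.322B
Q = 0.211R - 0.523G + 0.312B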
Step 1: Open a RGB image, translate into YIQ color space.
Figure 5.51 RGB Image Figure 5.52 YIQ Image
Figure 5.53 Y Channel Figures 5.54 I Channel Figure 5.55 Q Channel
Step 2: Segment the image into 4*4 blocks and calculate the average (quantized) pixel
value of each block. If a pixel value is greater than or equal to the average, set the bit
to ‘1’; otherwise, set it to ‘0’. This gives the bitmap.
Figure 5.56 Bitmap_Y I Q
Step 3: Decode using the reconstruction levels (the class averages) and the bitmap.
Figure 5.57 Decoding_Y I Q
Step 4: Combine three channels in order to get the reconstructed YIQ image, and then
translate back into RGB image.
Figure 5.58 Reconstructed YIQ Image Figure 5.59 Reconstructed Image
Result Analyze:
The following images are compressed from the same image with the same block size, but
with different methods. Figure 5.60 used BTC, and Figure 5.62 zooms in on the left eye of
Figure 5.60. Figure 5.61 was compressed by HNNBTC, and Figure 5.63 zooms in on Figure 5.61.
Figure 5.60 BTC Figure 5.61 HNNBTC
Figure 5.62 BTC Figure 5.63 HNNBTC
Method  Block Size  MSE      SNR
BTC     4*4         36.5491  66.1217
HNNBTC  4*4         34.7460  66.7808
Table 5.4 Compare BTC & HNNBTC in YIQ Color Space
Conclusion: The smaller the MSE, the larger the SNR. BTC has a higher MSE (36.5491) than
HNNBTC (34.7460), which means a worse reconstruction quality. The same can be seen by
comparing the SNR: 66.7808 (HNNBTC) is higher than 66.1217 (BTC). Color image compression
based on the YIQ color space therefore shows again that HNNBTC has a better performance
than BTC.
5.4 Color Image Compression Based on HSV Color Space
Color images can be compressed in different color spaces, not only in RGB, YUV and YIQ,
but also in HSV space (hue, saturation, value) [4]. This color space was introduced by
Alvy Ray Smith in 1978. It is often used for selecting colors (e.g. of paints or inks)
from a color wheel or palette, because it corresponds better to how people
experience color than the RGB color space does.
The following figure illustrates the HSV color space [36].
Figure 5.64 HSV Color Space
As hue varies from 0 to 1.0, the corresponding colors run from red through yellow, green,
cyan, blue and magenta back to red, so that red actually appears at both 0 and 1.0. As
saturation varies from 0 to 1.0, the corresponding colors (hues) vary from unsaturated
(shades of gray) to fully saturated (no white component). As value, or brightness, varies
from 0 to 1.0, the corresponding colors become increasingly brighter.
Saturation can be regarded as the purity of a color. Value is roughly equivalent to
brightness.
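These ranges can be checked with a short Python sketch that uses the standard library module `colorsys` rather than the Matlab conversion used for this project; the wrapper name is hypothetical. Note that `colorsys` reports pure red at hue 0 only, since hue 1.0 wraps around to 0.

```python
import colorsys

def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB values to (hue, saturation, value), each in the range 0 to 1."""
    return colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)

print(rgb8_to_hsv(255, 0, 0))      # pure red: hue 0, fully saturated, full value
print(rgb8_to_hsv(0, 255, 0))      # pure green: hue one third of the way round
print(rgb8_to_hsv(128, 128, 128))  # a shade of gray: saturation 0
```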
Figure 5.65 RGB to HIS Conversion
Figure 5.66 RGB Image Figure 5.67 HSV Image
Figure 5.68 HSV_H S V
Result Analyze:
The following images are compressed from the same image with the same block size, but
with different methods. Figure 5.69 used BTC, and Figure 5.71 zooms in on the left eye of
Figure 5.69. Figure 5.70 was compressed by HNNBTC, and Figure 5.72 zooms in on Figure 5.70.
Figure 5.69 BTC Figure 5.70 HNNBTC
Figure 5.71 BTC Figure 5.72 HNNBTC
Method  Block Size  MSE      SNR
BTC     4*4         33.8781  67.1104
HNNBTC  4*4         31.0753  68.2355
Table 5.5 Compare BTC & HNNBTC in HSV Color Space
Conclusion: For MSE, 31.0753 < 33.8781, and for SNR, 68.2355 > 67.1104, so color image
compression using HNNBTC in HSV color space gives a better quality than using BTC. This
shows again that HNNBTC has a better performance than BTC.
5.5 Conclusion & Comparisons
Figure 5.73 RGB Figure 5.74 YUV
Figure 5.75 YIQ Figure 5.76 HSV
Image           Color Space  Block Size  MSE               SNR
                                         BTC      HNNBTC   BTC      HNNBTC
Lena_color_512  RGB          4*4 Block   31.7350  29.2744  67.9618  69.9618
Lena_color_512  YUV          4*4 Block   36.1131  32.9077  66.2780  67.4891
Lena_color_512  YIQ          4*4 Block   36.5491  34.7460  66.1217  66.7808
Lena_color_512  HSV          4*4 Block   33.8781  31.0753  67.1104  68.2355
Table 5.6 Compare Color Image Compressions in Different Color Space
Table 5.7 MSE & SNR in Image Compression Using Different Color Space
Table 5.7 compares the same image under the different compression techniques in the
different color spaces. In RGB, YUV, YIQ and HSV color space alike, it is easy to see
that HNNBTC has a better performance than BTC. This shows again that HNNBTC is an
optimization of BTC.
However, the image quality of BTC and HNNBTC is quite close in the YIQ and HSV color
spaces. In YIQ, the SNR of BTC is 66.1217, and for HNNBTC it is 66.7808. There is not
much difference between the two techniques.
Meanwhile, the SNR values for HNNBTC are 69.9618 (RGB), 67.4891 (YUV), 66.7808 (YIQ) and
68.2355 (HSV). Comparing these data, when the same technique is used to compress the same
color image in different color spaces, RGB gives a higher reconstructed image quality
than the other color spaces.
Chapter 6 Variable Block Size
6.1 Principle
As discussed before, even though HNNBTC can find an optimum threshold that gives a better
quality than BTC, the bit rate is still the same, so the cost of transmission remains
high. Using a variable block size instead of a fixed block size will reduce the bit rate
and solve this problem [37].
At the same time, because HNNBTC uses only two reconstruction levels to reconstruct
each block, some details in the image cannot be recovered. Suppose area A has much
more information than area B and both use the same block size; this will cause
contouring artifacts, especially in area A. Two reconstruction levels are not enough for
area A.
In this situation, if we want a better quality of reconstructed image, it is better to
use a variable block size instead of a fixed one. When an area has more information, as
area A does, use a smaller block size, such as a 4*4 block. Otherwise, use the normal
block size.
8x8 block: two representative values, 64-bit bitmap
8x8 block cut into four 4x4 blocks: two representative values and a 16-bit bitmap each, multiplied by 4
The code for each block should include a block size marker (1 bit), which tells the
decoder how the block was coded: whether it was cut into smaller n*n blocks, and hence
what the value of n is. The advantage of the block size marker is that it distinguishes
the different block sizes during decoding.
The variable block size approach exploits local image content for improved compression.
6.2 Programming Flow chart
Figure 6.1 Flowchart
Here a function for the standard deviation is used. STD is the standard deviation of the
whole image; STD’ is the standard deviation of an 8*8 block.
BSM represents the block size marker.
If the block STD is less than the STD of the whole image, the block is not changed: it
keeps its original size, and the two class averages are calculated for it. If the block
STD is equal to or greater than the STD of the whole image, the block is divided into
four 4*4 blocks (in this case). The same method is then used to get the bitmap and the
two class averages of each 4*4 block.
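The split decision can be sketched as follows. This Python illustration uses hypothetical names and the population standard deviation, which is an assumption (Matlab's `std` normalizes by N-1 by default). It also counts the bits each choice would cost, assuming a 1-bit marker and 8-bit class averages.

```python
import math

def std(values):
    """Population standard deviation (assumption of this sketch)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def block_size_markers(image, block=8):
    """Return the BSM grid: 1 = split the block into four sub-blocks, 0 = keep it."""
    global_std = std([p for row in image for p in row])
    markers = []
    for r in range(0, len(image), block):
        row_marks = []
        for c in range(0, len(image[0]), block):
            pixels = [image[r + i][c + j] for i in range(block) for j in range(block)]
            row_marks.append(1 if std(pixels) >= global_std else 0)
        markers.append(row_marks)
    return markers

def block_bits(marker, block=8, level_bits=8):
    """Bits for one block: marker + class averages + bitmap(s)."""
    if marker == 0:
        return 1 + 2 * level_bits + block * block
    return 1 + 4 * (2 * level_bits + (block // 2) ** 2)

# A flat 8*8 block next to a busy checkerboard 8*8 block.
image = [[100] * 8 + [0 if (r + c) % 2 == 0 else 255 for c in range(8)]
         for r in range(8)]
print(block_size_markers(image))
print(block_bits(0), block_bits(1))
```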
6.3 Practical Works
Here, I used a 128*128 grey level image called ‘cameraman’.
Figure 6.2 Original Figure 6.3 Bitmap
In my program, the first step is to cut the image into a set of 8*8 blocks. If the standard deviation of a block is greater than the standard deviation of the whole image, the block is divided into four 4*4 blocks, and a block size marker (BSM) is set at the same time.
The averages of the two classes for the different block sizes, together with the matrix of BSM, showed that my encoding part is right. (These data are only part of the image.)
Figure 6.4 Average of two classes about 8*8 & 4*4 Block size
Decoding should be easy; however, all the data are lost in the transmission. I need more time to find out what is wrong in my program. This becomes one part of my future work.
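For reference, one way the decoding could work is sketched below in Python. It is not the program of this project: the record format is hypothetical. Each block is assumed to arrive in raster order as (BSM, payload), where the payload is (low, high, bitmap) for an unsplit block and a list of four such triples (top-left, top-right, bottom-left, bottom-right) for a split block.

```python
def decode_block(low, high, bitmap, size):
    """Rebuild one size*size tile from its two levels and its row-major bitmap."""
    return [[high if bitmap[r * size + c] else low for c in range(size)]
            for r in range(size)]

def decode_variable(records, blocks_per_row, block=8):
    """Decode a list of (bsm, payload) records into an image (list of rows)."""
    half = block // 2
    rows = (len(records) // blocks_per_row) * block
    image = [[0] * (blocks_per_row * block) for _ in range(rows)]
    for index, (bsm, payload) in enumerate(records):
        top = (index // blocks_per_row) * block
        left = (index % blocks_per_row) * block
        if bsm == 0:
            parts = [(0, 0, block, payload)]  # one full-size block
        else:
            offsets = [(0, 0), (0, half), (half, 0), (half, half)]
            parts = [(dr, dc, half, sub) for (dr, dc), sub in zip(offsets, payload)]
        for dr, dc, size, (low, high, bitmap) in parts:
            tile = decode_block(low, high, bitmap, size)
            for r in range(size):
                for c in range(size):
                    image[top + dr + r][left + dc + c] = tile[r][c]
    return image

# Tiny example with 4*4 blocks (2*2 sub-blocks) to keep the data readable.
records = [
    (0, (10, 20, [0] * 8 + [1] * 8)),
    (1, [(1, 2, [1, 0, 0, 0]), (3, 4, [0, 0, 0, 0]),
         (5, 6, [1, 1, 1, 1]), (7, 8, [0, 1, 1, 0])]),
]
for row in decode_variable(records, blocks_per_row=2, block=4):
    print(row)
```

The block size marker is what lets the decoder know how many levels and bitmap bits to read for each block.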
Chapter 7 Conclusion and Future Work
7.1 Conclusion
This project aims to implement and develop an image compression technology using the
Hopfield Neural Network. The first basic aim is to understand the principle of image
compression, and then to use Matlab to implement the compression.
The process of doing this project can be divided into 8 objectives.
1. Investigations on the BTC
Investigations on the Block Truncation Coding are the first objective of this
project work as it is the basic algorithm. As presented in Chapter 2, BTC is simple
but effective and have a satisfying result in reconstructed image quality.
2. Investigations on the Hopfield Neuron Network
As the project title is ‘Image Compression Using Hopfield Neuron Network’,
Hopfield Neuron Network plays an important role here. It has a better performance
than BTC in compression.
3. Investigations of the Matlab Software
To implement the project procedures, Matlab software is suggested to use. As the
result of this, the Matlab software system is investigated. One of the advantages for
this function toolbox it has. There are many image compression and neuron network
algorithms, especially color space, in this software.
87
4. Design a Matlab Program
When finished investigate the principle of BTC and HNNBTC, two Matlab
programs are written to perform the grey level images compression, BTC and
HNNBTC respectively. And each program has 4*4, 8*8, 16*16 block size, in other
words, I used a function called ‘range’, we can choose any one of these three block
size. All the results are implemented by Matlab.
5. Improvement of the Project in grey level images
Grey level images are easy to implement using different methods BTC and
HNNBTC. Meanwhile, MSE and SNR, compression ratio are the standard that use
to define a ‘good’ quality of the compression. Detail sees in Chapter 2 and Chapter
3, 4.
6. Do color images compression in different color space
No matter in RGB, YUV, or YIQ, even HSV color space, I proved that HNNBTC
has a better performance than BTC. There are presented in the Chapter 5.
7. Improvement more compression in YUV color space as asked
As the ‘Y’ represent the luminance, ‘U’ and ‘V’ are the color difference, in more
simple, more information are in Y channel, it can be more redundant in U and V
channel. Under my Project Supervisor’s help, I succeed make the compression ratio
into 7.9381:1, and the original ratio is 3.9896:1. More detail is in Chapter 5, 5.2.5.
8. Encoding in variable block size
It is the most difficult part, when I finish the encoding, trying to decoding, all the
data are missed in the transmission. The deadline is on the way, so as the exam, I
don't have enough time to find out what I made mistake in the decoding.
7.2 Future Work
Based on the progress of this project, the future work described below would form a
useful contribution to the further development of the system.
1. Finish the decoding part for variable block size. This should also include
experiments on compressing color images.
2. Error control coding. Channel errors will occur during transmission, so error
control codes that detect and correct them become very important. A (62, 56)
Hamming code could be used, as it is capable of correcting one error in each
62-bit codeword carrying 56 data bits.
3. Use fewer bits to represent each pixel, for example 7 bits per pixel instead
of 8 bits per pixel. The compression ratio would then increase at the cost of
reduced brightness resolution.
4. Combine this technique not only with error control but also with a hardware
implementation.
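The error control idea in item 2 can be sketched as a single-error-correcting Hamming code. The construction below is an assumption made for illustration: it places even-parity bits at positions 1, 2, 4, 8, 16 and 32, which for 56 data bits gives a 62-bit codeword (the (63, 57) Hamming code shortened by one bit). Whether the report intends exactly this layout is not stated, and the function names are hypothetical.

```python
def hamming_encode(data_bits):
    """Return a codeword with parity bits at power-of-two positions."""
    k = len(data_bits)
    r = 0
    while (1 << r) < k + r + 1:  # smallest r with 2**r >= k + r + 1
        r += 1
    n = k + r
    code = [0] * (n + 1)  # index 0 is unused so positions are 1-based
    bits = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):  # not a power of two: data position
            code[pos] = next(bits)
    # XOR of the positions of all set data bits.
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos
    # Set parity bits so that the XOR over the whole codeword is zero.
    for i in range(r):
        code[1 << i] = (syndrome >> i) & 1
    return code[1:]


def hamming_decode(code_bits):
    """Correct at most one bit error; return (data bits, syndrome)."""
    code = [0] + list(code_bits)
    syndrome = 0
    for pos in range(1, len(code)):
        if code[pos]:
            syndrome ^= pos
    if 0 < syndrome < len(code):
        code[syndrome] ^= 1  # the syndrome is the position in error
    data = [code[pos] for pos in range(1, len(code)) if pos & (pos - 1)]
    return data, syndrome
```

A 56-bit input yields a 62-bit codeword, and flipping any single bit of that codeword is repaired by `hamming_decode`. Two or more errors in one codeword are beyond what this code can correct.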
References
[1] Martin, Lik-kwan Shark, “Handout of Digital Image Processing”, 2008.
[2] R. J. Clarke, “Digital Compression of Still Images and Video”, 1995, pp. 7-17.
[3] S. Cavalieri, A. Di Stefano and O. Mirabella, “Optimal path determination in a
graph by Hopfield neural network”, Neural Networks 7, 1994, pp. 397-404.
[4] Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing”, 2002.
[5] Claude E. Shannon, “A Mathematical Theory of Communication”, Bell System
Technical Journal, 1948, vol. 27, pp. 379-423, 623-656.
[6] T. X. Chen, J. H. Xie and J. M. Huang, “Ensure Structure of Tree Video Coding
Benefits Technologies”, Conference on Technology and Management, 2000, pp. 387-393.
[7] Martin Roy Varley and Mike Peak, “Handout of Artificial Neural Networks”, 2009.
[8] G. Qiu, M. R. Varley and T. J. Terrell, “Improved Block Truncation Coding using
Hopfield Neural Network”, Electronics Letters, October 1991 (0013-5194), vol. 27, no. 21,
pp. 1924-1926.
[9] J. J. Y. Huang and P. M. Schultheiss, “Block quantization of correlated Gaussian
random variables”, IEEE Trans. Comm., Sep. 1963, vol. 11, no. 9, pp. 289-296.
[10] K. K. Ma, “Put Absolute Moment Block Truncation Coding in Perspective”,
IEEE Trans. Commun., 1997, vol. 45, no. 3, pp. 284-286.
[11] Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing”, Second
Edition, October 2007.
[12] E. J. Delp and O. R. Mitchell, “Image Compression Using Block Truncation Coding”,
IEEE Trans. Commun., 1979, vol. COM-27, no. 9, pp. 1335-1342.
[13] R. M. Gray, “Vector quantization”, IEEE ASSP Magazine, April 1984, vol. 1,
pp. 4-29.
[14] R. J. Clarke, “Transform coding of images”, Academic Press, London, 1985.
[15] G. R. Arce and N. C. Gallagher, Jr., “BTC image coding using median filter roots”,
IEEE Trans. Commun., June 1983, vol. COM-31, no. 6, pp. 784-793.
[16] V. R. Udpikar and J. P. Raina, “BTC image coding using vector quantization”,
IEEE Trans. Commun., March 1987, vol. COM-35, pp. 352-356.
[17] D. J. Healy and O. R. Mitchell, “Digital video bandwidth compression using BTC”,
IEEE Trans. Commun., June 1983, vol. COM-31, pp. 784-792.
[18] K. K. Ma, “Sub-band Coding of Digital Video”, School of Electrical and Electronic
Engineering, Nanyang Technological University, June 2000, pp. 1-3.
[19] M. D. Lema and O. R. Mitchell, “Absolute Moment Block Truncation Coding and its
Application to Colour Images”, IEEE Trans. Commun., Oct. 1984, vol. 32,
pp. 1148-1157.
[20] R. J. Clarke, “Digital Compression of Still Images and Video”, 1995, pp. 7-17.
[21] V. Udpikar and J. P. Raina, “Modified algorithm for block truncation coding of
monochrome images”, Electronics Letters, 1985, vol. 21, no. 20, pp. 900-902.
[22] Khalid Kamali, “Fractal Video Compression”, Faculty of Engineering and
Surveying, University of Southern Queensland, Oct. 2005, pp. 2-15.
[23] M. Ghanbari, “Video Coding: An Introduction to Standard Codecs”, The Institution
of Electrical Engineers, 1999.
[24] Y. C. Hu, “Improved moment preserving block truncation coding for image
compression”, Electron. Lett., Sep. 2003, vol. 39, no. 19.
[25] K. Somasundaram and I. Kaspar Raj, “An Image Compression Scheme based on
Predictive and Interpolative Absolute Moment Block Truncation Coding”, GVIP Journal,
Dec. 2006, vol. 6, issue 4, pp. 33-37.
[26] Bibhas Chandra Dhara and Bhabatosh Chanda, “Block truncation coding using pattern
fitting”, 2004.
[27] R. P. Lippmann, “An introduction to computing with neural nets”, IEEE ASSP Mag.,
1987, pp. 4-22.
[28] L. Bedini and A. Tonazzini, “Neural network use in maximum entropy image
restoration”, Image and Vision Computing, 1990, pp. 108-114.
[29] M. W. Roth, “Neural-network technology and its applications”, Heuristics, 1989,
pp. 46-62.
[30] J. J. Hopfield, “Neural networks and physical systems with emergent collective
computational abilities”, Proc. Natl. Acad. Sci. USA, 1982, vol. 79, pp. 2554-2558.
[31] J. J. Hopfield, “Neurons with graded response have collective computational
properties like those of two-state neurons”, Proc. Natl. Acad. Sci. USA, 1984, vol. 81,
pp. 3088-3092.
[32] J. J. Hopfield and D. W. Tank, “Neural computation of decisions in optimization
problems”, Biol. Cybern., 1985, vol. 52, pp. 141-152.
[33] G. Qiu, “An Investigation of Neural Networks for Image Processing Applications”,
PhD thesis, University of Central Lancashire, 1993.
[34] G. Qiu, M. R. Varley and T. J. Terrell, “Variable Bit Rate Block Truncation Coding
for Image Compression using Hopfield Neural Networks”, Proc. 3rd International
Conference on Artificial Neural Networks, Brighton, May 1993, pp. 233-237 (IEE
Conference Publication No. 372).
[35] K.-K. Ma and S. A. Rajala, “Subband Absolute Moment Block Truncation Coding”,
Optical Engineering, Special Issue on Visual Communications and Image Processing,
1996, vol. 35, no. 1, pp. 213-231.
[36] Matlab Help.
[37] M. R. Varley and X. Mo, “Error-resilient Image Coding using Hopfield Neural
Network Block Truncation Coding Scheme”, Proc. IEE Colloquium on Data Compression:
Methods and Implementations, Savoy Place, London, November 1999, pp. 7/1-7/7
(0963-3308).
Appendix A. Statement of Work (SOW)
University of Central Lancashire
Department of Technology
Image Compression using Hopfield Neural Networks
B.Eng. (Hons.) Digital Communication
Issue 1, Nov 2008
Zongbin Zheng
1. Aim
The aim of this project is to understand what image compression is, what the
Hopfield neural network is, and how to apply artificial intelligence to data
processing. As it is a year-long project, a good project schedule is essential.
Like most projects, it should be split into several parts so that each problem can
be solved step by step, project management can be put into practice, and the risks
of the project can be outlined.
2. Background
Image compression is one kind of information technology application. It involves the
collection, storage, processing, transmission and use of information. The book
‘Digital Image Processing’ [1] explains that image compression treats the pixels of
an image as data and reduces the redundant data in order to represent the image with
the intended efficiency. Image compression is either lossy or lossless: lossy means
that the reconstructed image differs from the original because some data has been
lost, while lossless means that the reconstructed image is unchanged, as required in
medical applications, for example.
The Hopfield neural network is a feedback network. It has a single layer of neurons
with binary outputs, and it settles into a stable state when the neurons are updated
asynchronously. It has also been applied in many research directions, such as feature
detectors, hand design and domain-specific tasks. This project uses the Hopfield
neural network model for image compression.
3. Work Breakdown Structure
The work flow covers how the project proceeds, the project structure, and the framework in detail.
Statement of works
Gantt chart
Group meeting
Image compression
  Learning what image compression is and its function
  Choosing the coding method for programming
  Using MATLAB 6.5/C++ to program the image compression
Hopfield neural network
  Learning what the Hopfield neural network is and its function
  Using MATLAB 7.0/C++ to program
  Connecting Excel with MATLAB for the neural networks
Developing different coding methods for the image compression
Comparing the efficiency of every coding method
Reporting
  Interim report preparation
  Final report preparation
Viva
4. Dependencies
The software needed includes MATLAB 6.5 and Visual C++ 6.0.
5. Risk Management
Analysing information from the Internet. Handling so much material is important, because some of it is worthless and some is valuable.
Recording the process. Because this is a project, a lot of data must be kept and cannot be lost. It is better to save each day's data in a separate document so that it can be found easily.
Efficiency. The efficiency of programming will affect the state of the project. There may be problems with the images, so that the results cannot be displayed. The most difficult part is the design and flowchart of the software.
Feedback/development. It takes more time to think about which other methods can carry out the function, and which is better. At the same time, more ways to optimize the project should be found.
6. Deliverables
The deliverables are the following two items:
Item                                  Due date
1. Interim report                     28 Nov 2008
2. Final report (2 copies and CD)     27 Apr 2009
Appendix B Gantt chart