a_low-cost_vlsi_implementation_for_efficient_removal_of_impulse_noise-qr9

9
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 3, MARCH 2010 473 A Low-Cost VLSI Implementation for Efficient Removal of Impulse Noise Pei-Yin Chen, Member, IEEE, Chih-Yuan Lien, Student Member, IEEE, and Hsu-Ming Chuang Abstract—Image and video signals might be corrupted by im- pulse noise in the process of signal acquisition and transmission. In this paper, an efficient VLSI implementation for removing im- pulse noise is presented. Our extensive experimental results show that the proposed technique preserves the edge features and ob- tains excellent performances in terms of quantitative evaluation and visual quality. The design requires only low computational complexity and two line memory buffers. Its hardware cost is quite low. Compared with previous VLSI implementations, our design achieves better image quality with less hardware cost. Synthesis results show that the proposed design yields a processing rate of about 167 M samples/second by using TSMC 0.18 m technology. Index Terms—Image denoising, impulse noise, pipeline architec- ture, VLSI. I. INTRODUCTION I N SUCH applications as printing skills, medical imaging, scanning techniques, image segmentation, and face recog- nition, images are often corrupted by noise in the process of image acquisition and transmission. Hence, an efficient denoising technique is very important for the image processing applications [1]. Recently, many image denoising methods have been proposed to carry out the impulse noise suppression [2]–[17]. Some of them employ the standard median filter [2] or its modifications [3], [4] to implement denoising process. However, these approaches might blur the image since both noisy and noise-free pixels are modified. To avoid the damage on noise-free pixels, an efficient switching strategy has been proposed in the literature [5]–[17]. In general, the switching median filter [5] consists of two steps: 1) impulse detection and 2) noise filtering. It locates the noisy pixels with an impulse detector, and then filters them rather than the whole pixels of an image to avoid the damage on noise-free pixels. Generally, the denoising methods for impulse noise suppression can be classified into two categories: lower-complexity techniques [6]–[13] and higher-complexity techniques [14]–[17]. The former uses a fixed-size local window and requires a few line buffers. Furthermore, its computational complexity is low and can be comparable to Manuscript received March 14, 2008; revised September 14, 2008 and De- cember 02, 2008. First published May 29, 2009; current version published Feb- ruary 24, 2010. This work was supported in part by the National Science Council under Grant 96-2221-E-006-027-MY3 and Grant97-2622-E-006-012-CC3 and by the use of Shared Facilities supported by the Program of Top 100 Universi- ties Advancement, Ministry of Education, Taiwan. The authors are with the Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan (e-mail: py- [email protected]; [email protected]; [email protected]). Digital Object Identifier 10.1109/TVLSI.2008.2012263 conventional median filter or its modification [2]–[4]. The latter yields visually pleasing images by enlarging local window size adaptively [15], [16] or doing iterations [14]–[17]. In this paper, we focus only on the lower-complexity denoising techniques because of its simplicity and easy implementation with the VLSI circuit. In [6], Zhang and Karim proposed a new impulse detector (NID) for switching median filter. NID used the minimum ab- solute value of four convolutions which are obtained by using 1-D Laplacian operators to detect noisy pixels. A method named as differential rank impulse detector (DRID) is presented in [7]. The impulse detector of DRID is based on a comparison of signal samples within a narrow rank window by both rank and absolute value. In [8], Luo proposed a method which can efficiently remove the impulse noise (ERIN) based on simple fuzzy impulse detection technique. An alpha-trimmed mean- based method (ATMBM) was presented in [9]. It used the alpha- trimmed mean in impulse detection and replaced the noisy pixel value by a linear combination of its original value and the me- dian of its local window. In [10], a decision-based algorithm (DBA) is proposed to remove the corrupted pixel by the me- dian or by its neighboring pixel value according the proposed decisions. For real-time embedded applications, the VLSI implementa- tion of switching median filter for impulse noise removal is nec- essary and should be considered. For customers, cost is usually the most important issue while choosing consumer electronic products. We hope to focus on low-cost denoising implemen- tation in this paper. The cost of VLSI implementation depends mainly on the required memory and computational complexity. Hence, less memory and few operations are necessary for a low-cost denoising implementation. Based on these two fac- tors, we propose a simple edge-preserved denoising technique (SEPD) and its VLSI implementation for removing fixed-value impulse noise. The storage space needed for SEPD is two line buffers rather than a full frame buffer. Only simple arithmetic operations, such as addition and subtraction, are used in SEPD. We proposed a useful impulse noise detector to detect the noisy pixel and employ an effective design to locate the edge of it. The experimental results demonstrate that SEPD can obtain better performances in terms of both quantitative evaluation and visual quality than other state-of-the-art lower-complexity impulse de- noising methods [6]–[13]. Furthermore, the VLSI implementa- tion of our method also outperforms previous hardware circuits [11]–[13], [19], [20] in terms of quantitative evaluation, visual quality, and hardware cost. The rest of this paper is organized as follows. In Section II, the proposed SEPD is introduced. The VLSI implementation of 1063-8210/$26.00 © 2009 IEEE

Upload: monisha-orisis

Post on 03-Apr-2015

124 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 3, MARCH 2010 473

A Low-Cost VLSI Implementation for EfficientRemoval of Impulse Noise

Pei-Yin Chen, Member, IEEE, Chih-Yuan Lien, Student Member, IEEE, and Hsu-Ming Chuang

Abstract—Image and video signals might be corrupted by im-pulse noise in the process of signal acquisition and transmission.In this paper, an efficient VLSI implementation for removing im-pulse noise is presented. Our extensive experimental results showthat the proposed technique preserves the edge features and ob-tains excellent performances in terms of quantitative evaluationand visual quality. The design requires only low computationalcomplexity and two line memory buffers. Its hardware cost is quitelow. Compared with previous VLSI implementations, our designachieves better image quality with less hardware cost. Synthesisresults show that the proposed design yields a processing rate ofabout 167 M samples/second by using TSMC 0.18 m technology.

Index Terms—Image denoising, impulse noise, pipeline architec-ture, VLSI.

I. INTRODUCTION

IN SUCH applications as printing skills, medical imaging,

scanning techniques, image segmentation, and face recog-

nition, images are often corrupted by noise in the process

of image acquisition and transmission. Hence, an efficient

denoising technique is very important for the image processing

applications [1]. Recently, many image denoising methods

have been proposed to carry out the impulse noise suppression

[2]–[17]. Some of them employ the standard median filter [2]

or its modifications [3], [4] to implement denoising process.

However, these approaches might blur the image since both

noisy and noise-free pixels are modified. To avoid the damage

on noise-free pixels, an efficient switching strategy has been

proposed in the literature [5]–[17].

In general, the switching median filter [5] consists of two

steps: 1) impulse detection and 2) noise filtering. It locates the

noisy pixels with an impulse detector, and then filters them

rather than the whole pixels of an image to avoid the damage

on noise-free pixels. Generally, the denoising methods for

impulse noise suppression can be classified into two categories:

lower-complexity techniques [6]–[13] and higher-complexity

techniques [14]–[17]. The former uses a fixed-size local

window and requires a few line buffers. Furthermore, its

computational complexity is low and can be comparable to

Manuscript received March 14, 2008; revised September 14, 2008 and De-cember 02, 2008. First published May 29, 2009; current version published Feb-ruary 24, 2010. This work was supported in part by the National Science Councilunder Grant 96-2221-E-006-027-MY3 and Grant97-2622-E-006-012-CC3 andby the use of Shared Facilities supported by the Program of Top 100 Universi-ties Advancement, Ministry of Education, Taiwan.

The authors are with the Department of Computer Science and InformationEngineering, National Cheng Kung University, Tainan 701, Taiwan (e-mail: [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/TVLSI.2008.2012263

conventional median filter or its modification [2]–[4]. The latter

yields visually pleasing images by enlarging local window size

adaptively [15], [16] or doing iterations [14]–[17]. In this paper,

we focus only on the lower-complexity denoising techniques

because of its simplicity and easy implementation with the

VLSI circuit.

In [6], Zhang and Karim proposed a new impulse detector

(NID) for switching median filter. NID used the minimum ab-

solute value of four convolutions which are obtained by using

1-D Laplacian operators to detect noisy pixels. A method named

as differential rank impulse detector (DRID) is presented in

[7]. The impulse detector of DRID is based on a comparison

of signal samples within a narrow rank window by both rank

and absolute value. In [8], Luo proposed a method which can

efficiently remove the impulse noise (ERIN) based on simple

fuzzy impulse detection technique. An alpha-trimmed mean-

based method (ATMBM) was presented in [9]. It used the alpha-

trimmed mean in impulse detection and replaced the noisy pixel

value by a linear combination of its original value and the me-

dian of its local window. In [10], a decision-based algorithm

(DBA) is proposed to remove the corrupted pixel by the me-

dian or by its neighboring pixel value according the proposed

decisions.

For real-time embedded applications, the VLSI implementa-

tion of switching median filter for impulse noise removal is nec-

essary and should be considered. For customers, cost is usually

the most important issue while choosing consumer electronic

products. We hope to focus on low-cost denoising implemen-

tation in this paper. The cost of VLSI implementation depends

mainly on the required memory and computational complexity.

Hence, less memory and few operations are necessary for a

low-cost denoising implementation. Based on these two fac-

tors, we propose a simple edge-preserved denoising technique

(SEPD) and its VLSI implementation for removing fixed-value

impulse noise. The storage space needed for SEPD is two line

buffers rather than a full frame buffer. Only simple arithmetic

operations, such as addition and subtraction, are used in SEPD.

We proposed a useful impulse noise detector to detect the noisy

pixel and employ an effective design to locate the edge of it. The

experimental results demonstrate that SEPD can obtain better

performances in terms of both quantitative evaluation and visual

quality than other state-of-the-art lower-complexity impulse de-

noising methods [6]–[13]. Furthermore, the VLSI implementa-

tion of our method also outperforms previous hardware circuits

[11]–[13], [19], [20] in terms of quantitative evaluation, visual

quality, and hardware cost.

The rest of this paper is organized as follows. In Section II,

the proposed SEPD is introduced. The VLSI implementation of

1063-8210/$26.00 © 2009 IEEE

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 2: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

474 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 3, MARCH 2010

Fig. 1. 3 � 3 mask centered on � .

SEPD is described briefly in Section III. In Section IV, the im-

plementation of reduced SEPD is introduced. The implementa-

tion results and comparison are provided in Section V. Conclu-

sions are presented in Section VI.

II. PROPOSED SEPD

Assume that the current pixel to be denoised is located at

coordinate and denoted as , and its luminance values

before and after the denoising process are represented as

and , respectively. If is corrupted by the fixed-value im-

pulse noise, its luminance value will jump to be the minimum

or maximum value in gray scale. Here, we adopt a 3 3 mask

centering on for image denoising. In the current , we

know that the three denoised values at coordinates ,

and are determined at the previous de-

noising process, and the six pixels at coordinates , ,

, , and are not

denoised yet, as shown in Fig. 1. A pipelined hardware archi-

tecture is adopted in the design, so we assume that the denoised

value of is still in the pipeline and not available. Using

the 3 3 values in , SEPD will determine whether is a

noisy pixel or not. If positive, SEPD locates a directional edge

existing in and uses it to determine the reconstructed value

; otherwise, .

SEPD is composed of three components: extreme data

detector, edge-oriented noise filter and impulse arbiter. The

extreme data detector detects the minimum and maximum

luminance values in , and determines whether the lumi-

nance values of and its five neighboring pixels are equal

to the extreme data. By observing the spatial correlation, the

edge-oriented noise filter pinpoints a directional edge and uses

it to generate the estimated value of current pixel. Finally, the

impulse arbiter brings out the proper result. Fig. 2 shows the

pseudo code of SEPD. The three components of SEPD are

described in detail in the following subsections.

A. Extreme Data Detector

The extreme data detector detects the minimum and max-

imum luminance values ( and ) in

those processed masks from the first one to the current

one in the image. If a pixel is corrupted by the fixed-value

impulse noise, its luminance value will jump to be the min-

imum or maximum value in gray scale. If is not equal to

, we conclude that is a noise-free

pixel and the following steps for denoising are skipped. If

Fig. 2. Pseudo code of our scaling method.

is equal to or , we set to 1, check

whether its five neighboring pixels are equal to the extreme

data, and store the binary compared results into , as shown in

Fig. 2.

B. Edge-Oriented Noise Filter

To locate the edge existed in the current , a simple edge-

catching technique which can be realized easily with VLSI cir-

cuit is adopted. To decide the edge, we consider 12 directional

differences, from to , as shown in Fig. 3. Only those

composed of noise-free pixels are taken into account to avoid

possible misdetection. If a bit in is equal to 1, it means that

the pixel related to the binary flag is suspected to be a noisy

pixel. Directions passing through the suspected pixels are dis-

carded to reduce misdetection. In each condition, at most four

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 3: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

CHEN et al.: LOW-COST VLSI IMPLEMENTATION FOR EFFICIENT REMOVAL OF IMPULSE NOISE 475

Fig. 3. Twelve directional differences of SEPD.

directions are chosen for low-cost hardware implementation. If

there appear over four directions, only four of them are chose ac-

cording to the variation in angle. Fig. 4 shows the mapping table

between and the chosen directions adopted in the design.

If , , , and are all

suspected to be noisy pixels “ ” , no edge

can be processed, so (the estimated value of )

is equal to the weighted average of luminance values

of three previously denoised pixels and calculated as

. In other conditions

except when “ ” the edge filter calculates the

directional differences of the chosen directions and locates the

smallest one among them, as shown in Fig. 2. The

smallest directional difference implies that it has the strongest

spatial relation with , and probably there exists an edge

in its direction. Hence, the mean of luminance values of the

two pixels which possess the smallest directional difference is

treated as .

For example, if is equal to “10011,” it means that ,

and are suspected to be noisy values. Therefore,

, and are discarded because they contain

those suspected pixels (see Fig. 3). The four chosen directional

differences are , , and (see Fig. 4). Finally, is

equal to the mean of luminance values of the two pixels which

possess the smallest directional difference among , ,

and .

C. Impulse Arbiter

Since the value of a pixel corrupted by the fixed-value im-

pulse noise will jump to be the minimum/maximum value in

gray scale, we can conclude that if is corrupted, is equal

to or . However, the converse is not true.

If is equal to or , may be cor-

rupted or just in the region with the highest or lowest luminance.

In other words, a pixel whose value is or

might be identified as a noisy pixel even if it is not corrupted. To

Fig. 4. Thirty-two possible values of � and their corresponding directions inSEPD.

overcome this drawback, we add another condition to reduce the

possibility of misdetection. If is a noise-free pixel and the

current mask has high spatial correlation, should be close

to and is small. That is to say, might be a

noise-free pixel but the pixel value is or

if is small. We measure and compare it

with a threshold to determine whether is corrupted or not.

The threshold, denoted as , is a predefined value. Obviously,

the threshold affects the performance of the proposed method. A

more appropriate threshold can achieve a better detection result.

However, it is not easy to derive an optimal threshold through

analytic formulation. According to our experimental results, we

set the threshold as 20. If is judged as a corrupted pixel,

the reconstructed luminance value is equal to ; other-

wise, .

III. VLSI IMPLEMENTATION OF SEPD

SEPD has low computational complexity and requires only

two line buffers, so its cost of VLSI implementation is low. For

better timing performance, we adopt the pipelined architecture

which can produce an output at every clock cycle. In our imple-

mentation, the SRAM used to store the image luminance values

is generated with the 0.18 m TSMC/Artisan memory compiler

[21]. According to the simulation, we found that the access time

for SRAM is about 6 ns. Since the operation of SRAM access

belongs to the first pipeline stage of our design, we divide the

remaining denoising steps into 6 pipeline stages evenly to keep

the propagation delay of each pipeline stage around 6 ns.

Fig. 5 shows block diagram of the 7-stage pipeline architec-

ture for SEPD. The architecture consists of five main blocks:

line buffer, register bank, extreme data detector, edge-oriented

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 4: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

476 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 3, MARCH 2010

Fig. 5. Block diagram of VLSI architecture for SEPD.

Fig. 6. Architecture of register bank in SEPD.

noise filter and impulse arbiter. Each of them is described briefly

in the following subsections.

A. Line Buffer

SEPD adopts a 3 3 mask, so three scanning lines are

needed. If are processed, three pixels from ,

and , are needed to perform the denoising process. With

the help of four crossover multiplexers (see Fig. 5), we realize

three scanning lines with two line buffers. As shown in Fig. 5,

Line Buffer-odd and Line Buffer-even are designed to store the

pixels at odd and even rows, respectively.

To reduce cost and power consumption, the line buffer is im-

plemented with a dual-port SRAM (one port for reading out data

and the other port for writing back data concurrently) instead of

a series of shifter registers. If the size of an image is ,

the size required for one line buffer is bytes in which 3

represents the number of pixels stored in the register bank.

B. Register Bank

The register bank (RB), consisting of 9 registers, is used to

store the 3 3 pixel values of the current mask . Fig. 6 shows

its architecture where each 3 registers are connected serially in

a chain to provide three pixel values of a row in , and Reg4

keeps the luminance value of the current pixel to be de-

noised. Obviously, the denoising process for does not start

until enterers from the input device. The nine values

stored in RB are then used simultaneously by subsequent ex-

treme data detector and noise filter for denoising.

Once the denoising process for is completed, the recon-

structed pixel value generated by the arbiter is outputted

and written into the line buffer storing to replace .

Fig. 7. Two examples of the interconnections between two line buffers and RB.

When the denoising process shifts from to , only 3

new values are needed to be read

into RB (Reg2, Reg5 and Reg8, respectively) and other 6 pixel

values are shifted to each one’s proper register. At the same

time, the previous value in Reg8 (the previous input value from

the input device, ) is written back to the line buffer

storing for subsequent denoising process.

The selection signals of the four multiplexers are all set to 1

or 0 for denoising the odd or the even rows, respectively. Two

examples are shown in Fig. 7 to illustrate the interconnections

between the two line buffers and RB. Assume that we denoise

, and set all four selection signals to 0, those samples of

and are stored in Line Buffer-odd and Buffer-even,

respectively. The samples of are inputted from the input

device, as shown in Fig. 7(a). The previous value in Reg8 and

the denoised results generated by the arbiter are written back to

Line Buffer-odd and Line Buffer-even, respectively. After the

denoising process of has been completed, Line Buffer-odd

is now full with the whole samples of , while those de-

noised samples of are all stored in Line Buffer-even. To

denoise , we set all selections signals to 1. Thus the pre-

vious value in Reg8 and the denoised results generated by the ar-

biter are written back to Line Buffer-even and Line Buffer-odd,

respectively, as shown in Fig. 7(b). After the denoising process

of has been completed, Line Buffer-even is now full with

the whole samples of , while those denoised samples of

are stored in Line Buffer-odd.

C. Extreme Data Detector

Fig. 8 shows the 3-stage pipeline architecture of the extreme

data detector in which P represents the pipeline register and EC

is the equality comparator that will output logic 1 if both two

input values are identical. The 2-stage min-max tree module is

used to find and . Two columns of EC

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 5: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

CHEN et al.: LOW-COST VLSI IMPLEMENTATION FOR EFFICIENT REMOVAL OF IMPULSE NOISE 477

Fig. 8. Architecture of register bank in SEPD.

Fig. 9. Two-stage pipeline architecture of edge-oriented noise filter.

units are used to determine whether the lower six pixels in

are equal to or , respectively. rep-

resents the eight neighboring pixels’ values of in . The

six OR gates are employed to generate the binary comparison

results and . When , it means is a noise-free

pixel and the following operations can be skipped. In this condi-

tion, we disable those registers storing and to avoid

unnecessary power consumption.

D. Edge-Oriented Noise Filter

Fig. 9 shows the 2-stage pipeline architecture of the edge-ori-

ented noise filter in which the unit is used to output the

absolute value of difference of two inputs. Using from the de-

tector, the mapping module implements the table shown in Fig. 4

to locate the four directions which are composed of noise-free

pixels. Four directional differences are calculated with the four

units. Then the smallest one is determined by using the

Fig. 10. Architecture of impulse arbiter.

TABLE IPART OF THE SCHEDULING OF THE SEPD CHIP

minimum tree unit. After that, the mean of luminance values of

the two pixels which possess the smallest directional difference

can be obtained. When “ ” the final multi-

plexer will output . When

“ ” the multiplexer will output the mean of the two

pixel values which possess .

E. Impulse Arbiter

Fig. 10 shows the architecture of the impulse arbiter in which

comparator CMP is used to output logic 1 if the upper input

value is greater than the lower one . The final

multiplexer is used to output generated by the noise filter

when is corrupted, or when is noise-free.

In the design, one clock cycle is needed to fetch a value from

one line buffer and then load it into RB. Three clock cycles are

needed for the extreme data detector. To complete their func-

tions, the edge-oriented noise filter and impulse arbiter require

two and one clock cycles, respectively. Totally, seven clock cy-

cles are needed to perform the denoising processing of a pixel.

Table I shows part of the scheduling of these operations. If the

width of an image is , the output latency is clock cy-

cles in which 7 means the number of pipeline stage.

IV. IMPLEMENTATION OF REDUCED SEPD

In SEPD, we consider 12 directional differences to decide

the proper edge. When more edges are considered, more com-

plex computations are required. To further reduce the cost of

implementation, we modify SEPD and propose another design,

named as reduced SEPD (RSEPD). Only three directional dif-

ferences, , and as shown in Fig. 11, are considered in

RSEPD. As demonstrated in Section V, RSEPD offers slightly

poorer image quality but requires much lower cost than SEPD.

Fig. 12 shows the pseudo code of RSEPD.

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 6: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

478 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 3, MARCH 2010

Fig. 11. Three directional differences in RSEPD.

Fig. 12. Pseudo code of RSEPD method.

The VLSI architecture of RSEPD is also composed of five

main blocks. RB, line buffer and impulse arbiter of RSPED

are identical to those of SEPD; however, there are some differ-

ence between the extreme detector and noise filter and those of

SEPD:

1) In the extreme data detector of RSEPD, only one neigh-

boring pixel is considered for edge preservation. As

shown in Fig. 13, the extreme data detector of RSEPD con-

tains four EC units where two EC units are used to deter-

mine and another two are used to determine .

2) In the noise filter of RSEPD, three directional differences

are considered. The mapping module used in SEPD be-

comes unnecessary and is removed. As shown in Fig. 14,

Fig. 13. Extreme data detector in RSEPD.

Fig. 14. Noise filter in RSEPD.

TABLE IIPART OF THE SCHEDULING OF THE RSEPD CHIP

we can apply three units to calculate three differ-

ences, and then determine the smallest one with the min-

imum tree unit.

3) In RSEPD, both the extreme data detector and noise filter

need three clock cycles to complete their functions. They

work in parallel because there exists no data dependency

between them. As shown in Fig. 10, the impulse arbiter

requires one clock cycle to complete its job. Totally,

five clock cycles are needed for RSEPD to perform the

denoising processing of a pixel. Table II shows part of

the scheduling of these operations. The output latency

of RSEPD is clock cycles in which 5 means the

number of pipeline stage.

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 7: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

CHEN et al.: LOW-COST VLSI IMPLEMENTATION FOR EFFICIENT REMOVAL OF IMPULSE NOISE 479

Fig. 15. Comparisons of PSNR for the reference images. (a) Lena; (b) boat; (c)couple; (d) peppers; (e) airplane; (f) goldhill.

V. IMPLEMENTATION RESULTS AND COMPARISONS

To verify the characteristics and performances of various

denoising algorithms, a variety of simulations are carried

out on the six well-known 512 512 8-bit gray-scale test

images: Lena, Boat, Couple, Peppers, Airplane, and Goldhill.

In the simulations, images are corrupted by impulse noise

(salt-and-pepper noise), where “salt” and “pepper” noise are

with equal probability. Furthermore, a wide range of noise ra-

tios varied from 10% to 90% with increments of 10% is tested.

Totally, eight recent lower-complexity denoising methods (NID

[6], DRID [7], ERIN [8], ATMBM [9], DBA [10], RVNP [11],

AMF [12], NAVF [13]), three higher-complexity denoising

Fig. 16. Results of different methods in restoring 60% corrupted image “Lena”.(a) Noise-free image; (b) noisy image; (c) NID [6]; (d) DRID [7]; (e) ERIN [8];(f) ATMBM [9]; (g) DBA [10]; (h) RVNP [11]; (i) AMF [12]; (j) NAVF [13];(k) MND [15]; (l) BDND [16]; (m) GPF [17]; (n) SEPD; (o) RSEPD.

methods (MND [15], BDND [16], GPF [17]), and our methods

(SEPD, and RSEPD) are compared in terms of objective testing

(quantitative evaluation) and subjective testing (visual quality)

where the parameters or thresholds of these methods are set as

suggested.

We employ the peak signal-to-noise ratio (PSNR) to illustrate

the quantitative quality of the reconstructed images for various

methods. As shown in Fig. 15, it can be observed that the per-

formances of SEPD are always the best, even the noise ratio is

as high as 90%. To explore the visual quality, we show the re-

constructed images of different denoising methods in restoring

60% corrupted image “Lena” in Fig. 16. Obviously, NID, ERIN,

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 8: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

480 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 18, NO. 3, MARCH 2010

TABLE IIICOMPARISONS OF RESTORATION RESULTS FOR NOISE-FREE “LENA” IMAGE

Hint: mis-detected pixels are those noise-free pixels which are detected as

the noisy pixels by the detection approach.

AMF and GPF produce noticeable black and white dots. DRID,

ATMBM, DBA and NAVF are not good enough with regard to

edge preservation. RVNP brings out blur reconstructed image.

Our SEPD and RSEPD produce visually pleasing images as

the higher-complexity MND and BDND do. Moreover, we also

verify these methods on the noise-free “Lena” image, and the

result is shown in Table III. It can be observed that our SEPD

and RSEPD still perform well.

The proposed VLSI architectures of SEPD and RSEPD were

implemented by using Verilog HDL. We used SYNOPSYS

Design Vision to synthesize the designs with TSMC’s 0.18

m cell library. The layouts for the designs were generated

with SYNOPSYS Astro (for auto placement and routing),

and verified by MENTOR GRAPHIC Calibre (for DRC and

LVS checks), respectively. Nanosim was used for post-layout

transistor-level simulation. Finally, SYNOPSYS PrimePower

was employed to measure the total power consumption. The

synthesis results show that the SEPD chip contains 6352 gate

counts and the RSEPD chip contains 3719 gate counts. Both

of them work in a clock period of 6 ns and can achieve a pro-

cessing rate of about 167 mega pixels per second. Furthermore,

SEPD and RSEPD are also implemented on the Altera Cyclone

II EP2C8F256C6 FPGA board for verification. The operating

clock frequencies of SPED and RSPED are 112.79 MHz

(with 1249 logic elements) and 105.95 MHz (with 706 logic

elements), respectively. Since our design just requires simple

operations (fixed bit-width adding, subtracting, or comparing),

the denoised results of SEPD with software program and with

the VLSI circuit are identical, as shown in Fig. 17.

To explore the advantage of low cost for our design, we com-

pared our RSEPD with three recent denoising chips [11]–[13].

They were realized with different VLSI implementations, so it

is very difficult to compare our chip with them directly. Hence,

Fig. 17. Results of SEPD with software program and VLSI circuit at differentnoise ratio. (a) “Peppers” by software; (b) “peppers” by circuit.

TABLE IVFEATURES OF FIVE DENOISING IMPLEMENTATIONS

we have implemented our design by using three different

technologies, respectively. Table IV shows the detailed com-

parisons for various implementations. As demonstrated, our

RSEPD requires the lowest hardware cost and works faster than

RVNP [11], AMF [12] and NAVF [13]. Furthermore, RSEPD

obtains better image quality than those methods [11]–[13] as

shown in Figs. 15 and 16. In [11], the author claims that their

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.

Page 9: A_Low-Cost_VLSI_Implementation_for_Efficient_Removal_of_Impulse_Noise-QR9

CHEN et al.: LOW-COST VLSI IMPLEMENTATION FOR EFFICIENT REMOVAL OF IMPULSE NOISE 481

chip achieves better filtering quality with lower circuit com-

plexity than other chips [19], [20]. Therefore, we can conclude

that RSEPD outperforms other chips [11]–[13], [19], [20] in

terms of quantitative evaluation, visual quality and hardware

cost.

VI. CONCLUSION

In this paper, an efficient VLSI implementation for impulse

noise removal is presented. The extensive experimental results

demonstrate that our design achieves excellent performance

in terms of quantitative evaluation and visual quality, even

the noise ratio is as high as 90%. For real-time applications, a

7-stage pipeline architecture for SEPD and a 5-stage pipeline

architecture for RSEPD are also developed and implemented.

As the outcome demonstrated, RSEPD outperforms other

chips [11]–[13], [19], [20] with the lowest hardware cost. The

architectures work with monochromatic images, but they can

be extended for working with RGB color images and videos.

REFERENCES

[1] W. K. Pratt, Digital Image Processing. New York: Wiley-Inter-science, 1991.

[2] T. Nodes and N. Gallagher, “Median filters: Some modifications andtheir properties,” IEEE Trans. Acoust., Speech, Signal Process., vol.ASSP-30, no. 5, pp. 739–746, Oct. 1982.

[3] S.-J. Ko and Y.-H. Lee, “Center weighted median filters and their ap-plications to image enhancement,” IEEE Trans. Circuits Syst., vol. 38,no. 9, pp. 984–993, Sep. 1991.

[4] H. Hwang and R. Haddad, “Adaptive median filters: New algorithmsand results,” IEEE Trans. Image Process., vol. 4, no. 4, pp. 499–502,Apr. 1995.

[5] T. Sun and Y. Neuvo, “Detail-preserving median based filters in imageprocessing,” Pattern Recog. Lett., vol. 15, no. 4, pp. 341–347, Apr.1994.

[6] S. Zhang and M. A. Karim, “A new impulse detector for switchingmedian filter,” IEEE Signal Process. Lett., vol. 9, no. 11, pp. 360–363,Nov. 2002.

[7] I. Aizenberg and C. Butakoff, “Effective impulse detector based onrank-order criteria,” IEEE Signal Process. Lett., vol. 11, no. 3, pp.363–366, Mar. 2004.

[8] W. Luo, “Efficient removal of impulse noise from digital images,”IEEE Trans. Consum. Electron., vol. 52, no. 2, pp. 523–527, May2006.

[9] W. Luo, “An efficient detail-preserving approach for removing im-pulse noise in images,” IEEE Signal Process. Lett., vol. 13, no. 7, pp.413–416, Jul. 2006.

[10] K. S. Srinivasan and D. Ebenezer, “A new fast and efficient decision-based algorithm for removal of high-density impulse noises,” IEEE

Signal Process. Lett., vol. 14, no. 3, pp. 189–192, Mar. 2007.[11] S.-C. Hsia, “Parallel VLSI design for a real-time video-impulse

noise-reduction processor,” IEEE Trans. Very Large Scale Integr.

(VLSI) Syst., vol. 11, no. 4, pp. 651–658, Aug. 2003.[12] I. Andreadis and G. Louverdis, “Real-time adaptive image impulse

noise suppression,” IEEE Trans. Instrum. Meas., vol. 53, no. 3, pp.798–806, Jun. 2004.

[13] V. Fischer, R. Lukac, and K. Martin, “Cost-effective video filtering so-lution for real-time vision systems,” EURASIP J. Appl. Signal Process.,vol. 2005, no. 13, pp. 2026–2042, Jan. 2005.

[14] Z. Wang and D. Zhang, “Progressive switching median filter for theremoval of impulse noise from highly corrupted images,” IEEE Trans.

Circuits Syst. II, Analog Digit. Signal Process., vol. 46, no. 1, pp.78–80, Jan. 1999.

[15] R. H. Chan, C.-W. Ho, and M. Nikolova, “Salt-and-pepper noise re-moval by median-type noise detectors and detail-preserving regular-ization,” IEEE Trans. Image Process., vol. 14, no. 10, pp. 1479–1485,Oct. 2005.

[16] P.-E. Ng and K.-K. Ma, “A switching median filter with boundarydiscriminative noise detection for extremely corrupted images,” IEEE

Trans. Image Process., vol. 15, no. 6, pp. 1506–1516, Jun. 2006.[17] N. I. Petrovic and V. Crnojevic, “Universal impulse noise filter based

on genetic programming,” IEEE Trans. Image Process., vol. 17, no. 7,pp. 1109–1120, Jul. 2008.

[18] R. H. Chan, C. Hu, and M. Nikolova, “An iterative procedure for re-moving random-valued impulse noise,” IEEE Signal Process. Lett., vol.11, no. 12, pp. 921–924, Dec. 2004.

[19] C.-Y. Lee, P.-W. Hsieh, and J.-M. Tsai, “High-speed median filter de-signs using shiftable content-addressable memory,” IEEE Trans. Cir-

cuits Syst. Video Technol., vol. 4, no. 6, pp. 544–549, Dec. 1994.[20] C.-T. Chen, L.-G. Chen, and J.-H. Hsiao, “VLSI implementation of a

selective median filter,” IEEE Trans. Consum. Electron., vol. 42, no. 1,pp. 33–42, Feb. 1996.

[21] Artisan Components Inc., USA, “0.18 �m TSMC/Artisan memorycompiler,” 2003. [Online]. Available: http://www.artisan.com

Pei-Yin Chen received the B.S. and Ph.D. degreesin electrical engineering from National ChengKung University, Tainan, Taiwan, in 1986, and1999, respectively, and the M.S. degree in electricalengineering from Penn Sate University, UniversityPark, in 1990.

He is currently an Associate Professor with theDepartment of Computer Science and InformationEngineering, National Cheng Kung University. Hisresearch interests include VLSI chip design, videocompression, fuzzy logic control, and gray prediction.

Chih-Yuan Lien received the B.S. and M.S. degreesin computer science and information engineeringfrom National Taiwan University, Taipei, Taiwan,in 1996 and 1998, respectively. He is currentlypursuing the Ph.D. degree in computer science andinformation engineering from National Cheng KungUniversity, Tainan, Taiwan.

His research interests include image processing,VLSI chip design, and video coding system.

Hsu-Ming Chuang received the B.S. degree in com-puter science from National Tsing-Hua University,Hsinchu, Taiwan, in 2005, and the M.S. degree incomputer science and information engineering fromNational Cheng Kung University, Tainan, Taiwan, in2007.

His research interests include VLSI chip design,data compression, and image processing.

Authorized licensed use limited to: IEEE Xplore. Downloaded on May 13,2010 at 11:42:31 UTC from IEEE Xplore. Restrictions apply.