fir filter design based on successive approximation of vectors

16
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014 3833 FIR Filter Design Based on Successive Approximation of Vectors Eduardo A. B. da Silva, Senior Member, IEEE, Lisandro Lovisolo, Member, IEEE, Alessandro J. S. Dutra, Member, IEEE, and Paulo S. R. Diniz, Fellow, IEEE Abstract—We present a novel method for the design of nite im- pulse response (FIR) lters with discrete coefcients that belong in the sum of powers-of-two (POT) space. The importance of this class of lters cannot be overstated, given the ever-increasing number of applications for which a specic hardware implementation is needed. Filters that have coefcients that belong to such a class are also referred to as multiplierless lters, given that the oper- ations performed by the lter can all be implemented by using appropriately designed shifts of the input data, making them a perfect choice whenever implementation simplicity and processing speed are the ultimate goal. To produce such a design, we employ a vector successive approximation technique successfully used in data compression that has a very low computational complexity, the Matching Pursuits Generalized BitPlanes algorithm (MPGBP). We derive optimality conditions for the approximation dictionary. We compare lters obtained with the proposed method with those derived in previous works. Based on this comparative analysis, we show that this new and powerful way of producing the lters’ co- efcients is also among the simplest available in the literature. Index Terms—Signal processing, lter design, successive approx- imation methods, VQ. I. INTRODUCTION I T has long been a subject of interest to design FIR lters whose coefcients belong to a constrained class of num- bers, in particular the sums of powers-of-two (SOPOT) class [1]–[4], as opposed to using the high-precision oating-point coefcients generated by conventional methods. This approach is particularly useful for hardware implementations [2], [5], [6], commonly referred to as multiplier-less design [1], [3], [5], [7], [8], in which all coefcients are generated from a combination of shifts and adds/subtracts, thereby eliminating the hardware cost associated with a general multiplier. In addition, such de- sign strategy together with a generic hardware implementation allows to adapt and program lters on the y [5]. In the search for such useful representations, several methods have been developed [1], [3], [5]–[14]. For example, integer programming has been used to produce solutions to min- imization problems using discrete constraints in the lter Manuscript received July 24, 2013; revised March 19, 2014; accepted April 29, 2014. Date of publication May 16, 2014; date of current version July 10, 2014. The associate editor coordinating the review of this manuscript and ap- proving it for publication was Prof. Anthony Man-Cho So. E. A. B. da Silva, A. J. S. Dutra, and P. S. R. Diniz are with the Universidade Federal do Rio de Janeiro, Rio de Janeiro 29145–970, Brazil. L. Lovisolo is with the Universidade do Estado do Rio de Janeiro, Rio de Janeiro, RJ, 21945-970, Brazil (e-mail: [email protected]). This paper has supplementary downloadable multimedia material available at http://ieeexplore.ieee.org provided by the authors. An implementation of the proposed SOPOT lter design is provided. Color versions of one or more of the gures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identier 10.1109/TSP.2014.2324992 coefcient values for both the weighted minimax and weighted least-squares objective functions [11], [12]. In either case, the lter designs obtained are optimal for a given word length, but the design processes are extremely complex in computational terms. More recently, a method using linear programming to optimize the coefcients directly in the sub-expression space was proposed [9] that yields a considerable reduction in the number of required sums to attain a given specication. How- ever, SOPOT lters impose additional restrictions on the design of quantized coefcients lters [15]–[18], an important aspect is how to allocate the available limited hardware non-uniformly among coefcients in order to obtain the desired performance [14]. Due to the complex nature of the SOPOT-Filter coef- cients optimization other common optimization approaches have also been tried, as for example Genetic Algorithms [8], [13]. A common drawback of these approaches rests on the design time [14] that impairs the practical use of SOPOT pro- grammable lters. A different optimization approach is used in [7], in which an innite-precision prototype lter is designed to exceed the prescribed specications, thereby producing a margin of error for the coefcient quantization. A non-linear optimization is then performed to determine the nal lter representation. In [19] following some relaxation on the lter specications an heuristic for designing lters is presented that obtain SOPOT lters. In particular [20] emphasizes the problem of obtaining long lters using the available methods, proposing an approach based on the design of extrapolated impulse responses for FIR lters. In this paper, we propose a design method also based on the approximation of an innite precision prototype lter, designed to match or exceed the prescribed specications. We employ a technique, initially developed for data compression applica- tions, that decomposes vectors in generalized bitplanes using successive approximations. This technique allows us to dene the representation dictionary in such a way that the design com- putational complexity is kept to a minimum. A striking feature of the new proposed design method is its simple implementa- tion such that no sophisticate optimization or exhaustive search programs are required. The paper is organized as follows: Section II presents a brief review of successive approximation vector representation tech- niques, including the Matching Pursuits Generalized Bit-Plane (MPGBP) algorithm [21], which will serve as the basis for our proposed approximation method. It also presents some novel theoretical results specic to the variation of MPGPB used in this paper. The proposed method for the successive approximation design of SOPOT FIR lters is then described in Section III. Section IV presents a theoretical analysis of the performance of the proposed MPGBP variation in the context 1053-587X © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Upload: ufrj

Post on 12-Nov-2023

0 views

Category:

Documents


0 download

TRANSCRIPT

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014 3833

FIR Filter Design Based on SuccessiveApproximation of Vectors

Eduardo A. B. da Silva, Senior Member, IEEE, Lisandro Lovisolo, Member, IEEE,Alessandro J. S. Dutra, Member, IEEE, and Paulo S. R. Diniz, Fellow, IEEE

Abstract—We present a novel method for the design of finite im-pulse response (FIR) filters with discrete coefficients that belong inthe sum of powers-of-two (POT) space. The importance of this classof filters cannot be overstated, given the ever-increasing numberof applications for which a specific hardware implementation isneeded. Filters that have coefficients that belong to such a classare also referred to as multiplierless filters, given that the oper-ations performed by the filter can all be implemented by usingappropriately designed shifts of the input data, making them aperfect choice whenever implementation simplicity and processingspeed are the ultimate goal. To produce such a design, we employa vector successive approximation technique successfully used indata compression that has a very low computational complexity,theMatching Pursuits Generalized BitPlanes algorithm (MPGBP).We derive optimality conditions for the approximation dictionary.We compare filters obtained with the proposed method with thosederived in previous works. Based on this comparative analysis, weshow that this new and powerful way of producing the filters’ co-efficients is also among the simplest available in the literature.

Index Terms—Signal processing, filter design, successive approx-imation methods, VQ.

I. INTRODUCTION

I T has long been a subject of interest to design FIR filterswhose coefficients belong to a constrained class of num-

bers, in particular the sums of powers-of-two (SOPOT) class[1]–[4], as opposed to using the high-precision floating-pointcoefficients generated by conventional methods. This approachis particularly useful for hardware implementations [2], [5], [6],commonly referred to as multiplier-less design [1], [3], [5], [7],[8], in which all coefficients are generated from a combinationof shifts and adds/subtracts, thereby eliminating the hardwarecost associated with a general multiplier. In addition, such de-sign strategy together with a generic hardware implementationallows to adapt and program filters on the fly [5].In the search for such useful representations, several methods

have been developed [1], [3], [5]–[14]. For example, integerprogramming has been used to produce solutions to min-imization problems using discrete constraints in the filter

Manuscript received July 24, 2013; revised March 19, 2014; accepted April29, 2014. Date of publication May 16, 2014; date of current version July 10,2014. The associate editor coordinating the review of this manuscript and ap-proving it for publication was Prof. Anthony Man-Cho So.E. A. B. da Silva, A. J. S. Dutra, and P. S. R. Diniz are with the Universidade

Federal do Rio de Janeiro, Rio de Janeiro 29145–970, Brazil.L. Lovisolo is with the Universidade do Estado do Rio de Janeiro, Rio de

Janeiro, RJ, 21945-970, Brazil (e-mail: [email protected]).This paper has supplementary downloadable multimedia material available

at http://ieeexplore.ieee.org provided by the authors. An implementation of theproposed SOPOT filter design is provided.Color versions of one or more of the figures in this paper are available online

at http://ieeexplore.ieee.org.Digital Object Identifier 10.1109/TSP.2014.2324992

coefficient values for both the weighted minimax and weightedleast-squares objective functions [11], [12]. In either case, thefilter designs obtained are optimal for a given word length, butthe design processes are extremely complex in computationalterms. More recently, a method using linear programming tooptimize the coefficients directly in the sub-expression spacewas proposed [9] that yields a considerable reduction in thenumber of required sums to attain a given specification. How-ever, SOPOT filters impose additional restrictions on the designof quantized coefficients filters [15]–[18], an important aspectis how to allocate the available limited hardware non-uniformlyamong coefficients in order to obtain the desired performance[14]. Due to the complex nature of the SOPOT-Filter coef-ficients optimization other common optimization approacheshave also been tried, as for example Genetic Algorithms [8],[13]. A common drawback of these approaches rests on thedesign time [14] that impairs the practical use of SOPOT pro-grammable filters. A different optimization approach is used in[7], in which an infinite-precision prototype filter is designedto exceed the prescribed specifications, thereby producing amargin of error for the coefficient quantization. A non-linearoptimization is then performed to determine the final filterrepresentation. In [19] following some relaxation on the filterspecifications an heuristic for designing filters is presentedthat obtain SOPOT filters. In particular [20] emphasizes theproblem of obtaining long filters using the available methods,proposing an approach based on the design of extrapolatedimpulse responses for FIR filters.In this paper, we propose a design method also based on the

approximation of an infinite precision prototype filter, designedto match or exceed the prescribed specifications. We employa technique, initially developed for data compression applica-tions, that decomposes vectors in generalized bitplanes usingsuccessive approximations. This technique allows us to definethe representation dictionary in such a way that the design com-putational complexity is kept to a minimum. A striking featureof the new proposed design method is its simple implementa-tion such that no sophisticate optimization or exhaustive searchprograms are required.The paper is organized as follows: Section II presents a brief

review of successive approximation vector representation tech-niques, including theMatching Pursuits Generalized Bit-Plane(MPGBP) algorithm [21], which will serve as the basis forour proposed approximation method. It also presents somenovel theoretical results specific to the variation of MPGPBused in this paper. The proposed method for the successiveapproximation design of SOPOT FIR filters is then describedin Section III. Section IV presents a theoretical analysis of theperformance of the proposed MPGBP variation in the context

1053-587X © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

3834 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

of SOPOT FIR filter design, used to establish some designconstraints. Section V describes in detail the design algorithmalong with an analysis of its complexity, a relevant discussionwhen greedy algorithms as the one employed here are used forpractical purposes. Examples and performance comparisonsagainst other existing methods are discussed in Section VII. Ourconclusions and future directions are detailed in Section VIII.Appendix A contains the convergence proof for the algorithmpresented in Section III.

II. SUCCESSIVE APPROXIMATION OF VECTORS

Several methods for the approximation of vectors have beensuccessfully proposed for use in image and video coding fields[22]. The existence of particular constraining factors in thoseapplications, such as the ever present trade-off on permissibleencoding rate vs reproduction quality [23], have led us to conjec-ture that the use of those algorithms for approximating the coef-ficients of FIR filters, where the main constraint lies on whetheror not the filter frequency response satisfies the specifications[24], might enjoy similar success.We shall start this section with some brief definitions to es-

tablish the notation that will be used in the remaining of thearticle. We then proceed to review the Matching Pursuits algo-rithm [25] and, in Section II.C, we show how it can be modifiedinto a more structured decomposition method with the additionof the generalized bit-planes concept.

A. Notation

For a vector represents its -th component. If andare -dimensional complex-valued vectors, the inner productof and will be denoted and has the usual definition

(1)

where is the complex conjugate of ([26]). Thenorm of is then defined as

(2)

We describe an -th order FIR filter both in terms of its im-pulse response

(3)

and its vector notation

(4)

so that when the filter coefficients are real-valued scalars,is an -dimensional column vector, i.e., .

B. The Matching Pursuits Algorithm

The Matching Pursuits (MP) algorithm presents a suboptimalsolution to the NP-hard problem of finding the best expansionfor a function given a redundant, overcomplete, dictionary ofwaveforms or codebook [25], [27]–[29]. Instead of searchingover all the possible combinations for the lowest possible dis-tortion, the MP method builds an expansion based on an iter-ative search. In each iteration, one searches for the dictionaryword (element or atom) that is closest to the current represen-tation residue, in terms of an appropriately defined distortion

Fig. 1. MP -th decomposition step.

measure—in the first iteration the residue is the function itself.In this article, we consider a simpler, particular instance of theexpansion problem, that of approximating an -dimensionalvector given a large, redundant dictionary, that expands .Formally, if a vector is to be represented using a dictio-nary , theMPmethodproceeds as described in Algorithm 1.

Algorithm 1: The Matching Pursuits (MP) Algorithm [25]

1. Start with2. Repeat until a stop criterion is met:

a) Find the closest codeword, i.e., findsuch that

b) Choose

c) Letd) Increment

3. Stop.

At each step , including the initial one, the approximationresidue will be denoted and the expansion rule produces

(5)

Throughout this work one employs a subscripted symbol as into denote the value of selected in iteration . Although

subscripted symbols are also used for indexing elements in a set,as in the dictionary definition, their appropriate interpretationwill be clear from the context.The expansion rule in (5) provides the representation

(6)

In (5), is orthogonal to , as depicted in Fig. 1. There-fore, we can write

(7)

Equation (7) can be further extended to include all termsof the decomposition up to the -th step, yielding

. For more details, the reader isreferred to [25]. We present a modified version of the MP algo-rithm which is used as the main engine in the impulse responseapproximation method presented in this work.

C. Matching Pursuits With Generalized Bit-Planes

Each step of the approximation process in the MP algorithmproduces two pieces of information, namely the index of the

DA SILVA et al.: FIR FILTER DESIGN BASED ON SUCCESSIVE APPROXIMATION OF VECTORS 3835

closest dictionary (codebook) codeword and the length of theprojection of the residue onto that codeword. That is, usingdecomposition steps the MP algorithm provides the -termsignal representation:

(8)

where and indexes the dictio-nary codeword selected at the -th decomposition step. If oneexpands to , one can guaranteethat for . There are some works that dealwith MP approximation ratio [29], [30] and convergence [31].However, in practical applications, the length information,

given by , needs to be quantized, and the above citedapproximation ratio and convergence analyses need to be ex-tended. In [21] a version of the MP algorithm has been proposedsuch that this length is quantized as an integer power of aparameter , that is,

(9)

The value of above is chosen such that is closerto than it is to or . Applying the nearestneighbor optimal quantization rule [23], this is equivalent to

(10)

(11)

(12)

where is the smallest integer larger than or equal to . Thecomplete algorithm is described in Algorithm 2 below. Notethat, in order to avoid error accumulation, the residue in stepis computed using the quantized value of the projection (step

2c in Algorithm 2). We refer to this algorithm as the MatchingPursuits with Generalized Bit-Planes (MPGBP) algorithm. Thisname comes from the fact that we refer to the set of suchthat as the generalized bitplane .

Algorithm 2: The MPGBP Algorithm

1. Given , start with2. Repeat until a stop criterion is met:

a) Choose such that

b) Choose

c) Replace

d) Increment .3. Stop.

The MPGBP algorithm produces, after steps, an approxi-mation for given by

(13)

where . As pointed out in the commentary after (8),since must be such that if then necessarily

. Also, each has unit-norm. The convergence prop-erties of the MPGBP algorithm are given by Theorem 1.Theorem 1 (MPGBP Convergence): Given a dictionary

such that if then necessarily and also ,then for and any input vector the MPGBP Algorithm(Algorithm 2) converges, and the error incurred in the approxi-mation is bounded by

(14)

(15)

(16)

is the largest angle between any vector from the con-sidered signal space and its closest codeword [21], [32], [33].The proof of Theorem 1 is presented in Appendix A.A suboptimal version of the MPGBP has been presented

without proof in [21] for video coding. In that setting, the valueof was given by

(17)

III. FIR FILTER REPRESENTATION USING SOPOTS

We now turn to the presentation of the complete filter de-sign method, which includes the prototype filter design, its ap-proximation and the representation of its coefficients as sums ofpowers-of-two.

A. FIR Filter Approximation

Given a set of frequency response specifications, corresponding to the passband and stop-band

edges and their allowed undulations respectively, we startby designing an order prototype filter , whichwill be represented as . Ourproposal is to approximate the coefficients of using sumsof powers-of-two through the MPGPB algorithm. If we use

, then (13) becomes, for coordinate of

(18)

We can see from the above equation that if one restricts thecoordinates of the codewords to be either , 1 or0, (18) gives a binary representation of the coordinates of .Therefore, in order to approximate the coefficients of using theMPGBP algorithm, we should define the approximation dictio-nary

(19)

3836 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

in which the codewords , and each codeword hascomponents equal to and components equal to 0,i.e., they are permutations of the form

(20)

An example of such a codeword, for and is thevector .At first, one could argue that such a dictionary cannot be used

with the MPGPB algorithm because its codewords have norm. However, if in (12) and step 2b of Algorithm

2, one replaces by ,

then Algorithm 2 l still generate an approximation of as in(13) as this replacement is equivalent to using unit-norm atoms.However, this replacement can be dropped in implementation.This is so, because all codewords defined according to (20) forgiven and , have the same norm, not affecting the largestcorrelated codeword criterion of MP. Therefore, each of the co-ordinates of ends up being approximated as in(18).The vector to be approximated using Algorithm 2 is initial-

ized with the filter prototype . Since and the norm ofthe code-vectors is , the computation of in step 2.b of Al-gorithm 2 becomes

(21)

Due to the particular structure of the codewords (20), the deter-mination of the codeword closest to the residual (step2a of Algorithm 2) does not require an exhaustive search as inordinary vector quantization algorithms [23]. It can be accom-plished in an efficient way, as described next.Since the coordinates of the code-vectors are either , 1

or 0, the inner product is equivalent to adding thecoordinates of corresponding to the coordinates of equalto 1 and subtracting the coordinates of corresponding to thecoordinates of equal to . If one wants the largest innerproduct, two conditions must be satisfied:i) The corresponding coordinates of and shouldhave the same sign.

ii) The coordinates of largest magnitude should correspondto either 1 or .

With these conditions in mind, a fast algorithm for finding theclosest vector to a vector can be as described in Algo-rithm 3 below.

Algorithm 3

Fast Algorithm for finding the closest codeword to in withcodewords having coordinates and coordinates0 (permutations of (20)):1. Sort the absolute values of the coordinates of the currentrepresentation of in decreasing order of magnitude andstore the indexes of the largest ones.

2. The approximation codeword is obtained by setting thosecoordinates whose indexes were stored in step 1 to

if the coordinate is positive and to if the coordinate isnegative. The remaining coordinates are set to zero.

TABLE IEXAMPLE OF USE OF THE MPGBP ALGORITHM

TABLE IICOMMON-TERMS REDUCTION EXAMPLES

Example 1: Vector approximation using the MPGBP algo-rithm.—Suppose the vector is to beapproximated using the proposed method using a codebookwith . In this case, , which implies that thecodebook is comprised of codewords which are permutationsof . Initially, we set . The steps taken bythe algorithm are presented in Table I. From Table I, is ap-proximated as

(22)

We see that, in this particular example, the method produces anerror-free representation of the input vector in just 3 steps, usingat most one adder per filter coefficient (e.g.,

).

B. Common-Terms Reduction Step

Since , a representation using sums ofpowers-of-two is in general not unique. In filter design, we aimat the representation corresponding to the smallest complexity,and therefore, with the smallest number of terms. However,the MPGBP algorithm may generate a representation for filtercoefficients that does not correspond to the smallest numberof powers-of-two, that is, it contains terms that may be com-bined into a different power-of-two, as illustrated in Table II.Therefore, in order to generate a representation with the min-imum number of terms, we must carry out some post-processingto the output of the MPGBP algorithm. We refer to it as theCommon-terms Reduction Step.This post-processing operation starts from the representation

of the coefficient after the th iteration, , andfinds the minimal representation according to Algorithm 4. Thisalgorithm finds a minimal representation for numbers that canbe expressed exactly as a finite sum of powers-of-two, the onesstored in in the algorithm. This is the case for , sinceit is output by the MPGBP algorithm.

DA SILVA et al.: FIR FILTER DESIGN BASED ON SUCCESSIVE APPROXIMATION OF VECTORS 3837

Algorithm 4: Common-Terms Reduction Algorithm

For each , in :

1. Set2. Repeat until

a)b) Evaluatec) Let

The search for the best approximation for a given filter speci-fication may include testing for prototypes of different ordersand dictionaries having different values of , the number ofnon-zero codeword components. The implementation chosenmay be the one that yields the lowest implementation com-plexity in terms of quantity of SOPOTs used by the filter, whilesatisfying the given specifications.The design process may be stopped at any time. However,

usual stop criteria include the approximation error and/or thenumber of additions/subtractions used in the approximation. Itis also worth noticing that the number of POTs used in the ap-proximation of distinct components is not fixed, a constraintusually found in other optimization methods. One should notethat sometimes a higher order filter may end up with a lowercomplexity representation considering the number of adders re-sulting from the design process. In the examples given in thispaper, the filter prototypes are designed using the Parks-Mc-Clellan algorithm [24]. In addition, a global output multiplieris determined to ensure that the average passband gain is keptat 0 dB to allow for straightforward comparisons related to theobtained filter selectivity.

IV. PERFORMANCE BOUNDS AND DESIGN ISSUES

When approximating an impulse response using (18)through the MPGBP algorithm, the main objective is tominimize the number of sum of powers-of-two. This can beaccomplished by minimizing the number of steps necessaryto achieve a given approximation error , that is,

(23)

From Theorem 1, (14) and (15), the approximation error aftersteps is bounded by , where, for ,

(24)

Therefore, in order to minimize one must have a codebookwith the smallest possible value of . In this section wefind for the codebooks having codewords as in (20) anddiscuss design issues.

A. Computation of

We saw in Section III.A that the MPGBP algorithm will usea codebook whose codewords are permutations of the form in(20), i.e.,

(25)

In this section we determine the minimum attainable valuefor given and —the number of non-zero terms and

dimension . Since the codebook contains all the permutationsof the form described by (20), we can suppose, without any lossof generality, that the vector making an angle ofwith the codeword closest to it can be represented as

(26)

(27)

that is, the vector components are sorted in decreasing orderand they are all non-negative. This is so because (note that forthe sake of the computation, we can restrict to be on ahyper-sphere, that is, to have a predetermined norm [33]):i) For any permutation of in (26) the closest codewordcan be obtained by applying the same permutation to ,with the value of inner product, and therefore of ,remaining the same.

ii) If we change the sign of some of the components of ,then the closest codeword can be obtained from bychanging their coordinates to whenever the corre-sponding coordinates are negative.

iii) The values of the coordinates of corresponding to thecoordinates of that are zero do not affect their innerproduct. In addition, the largest inner product occurs ifthe coordinates of corresponding to the non-zero co-ordinates of are the largest ones.

iv) If was such that

where , such that, then and (and

thus, the angle) would remain unaltered.To approximate the vector as defined in (26) using code-

words from a codebook , with elements as in (20), that havea fixed norm , we consider that has unit norm without anyloss of generality and, therefore, we may write

(28)

This assumption leads to the following expression for(assuming that both and belong to ):

(29)

since is defined as in (26) and (the codeword in the code-book) as in (20).To verify the behavior of , i.e., to look for points in which

attains its maximum or minimum value, subject to theconstraint imposed by (28), we define the Lagrangian functional

as:

(30)

3838 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

and equate its partial derivatives with relation to both andto zero. By doing that, we obtain:

(31)

(32)

Combining the results of (31) and (32), we have that

(33)

Since , we obtain , which isclearly a maximum.However, according to the Kuhn-Tucker conditions [34], in

order to guarantee the minimization of the objective functionwe have also to look for solutions lying on the boundary of theconvex region delimited by for any , or forany , or . Considering (27), this region is delimited bythe cases

(34)

(35)...

...

(36)

(37)

where . The above conditions can be restated as, and

with and . Note that impliesthat . Therefore, the -dimensional vector maythen be written as

(38)

(39)

Since we have assumed that , the above equations imply

(40)

(41)

and therefore(42)

(43)

From (39) and (41), the above value occurs for

(44)

Once again, to find the minimum value of forand , the Lagrangian functional

becomes given by

(45)

and taking derivatives with respect to and we obtain:

(46)

(47)

From (46) and (47), the unit-norm condition (40)leads to

(48)

From the above equation we obtain the following values for thecomponents and :

(49)

(50)

for . Applying these at (42), one obtains

(51)

Its derivative with respect to is:

(52)

The above equation implies that the derivative in (52) is zero forall if , being negative for all values of if . If

, from (51), is

(53)

which is a maximum value for .

DA SILVA et al.: FIR FILTER DESIGN BASED ON SUCCESSIVE APPROXIMATION OF VECTORS 3839

For , since the derivative in of is negative,then its minimum value will occur for the maximum value of ,that is , since . In this case (52) becomes

(54)

From (38), (49) and (50), the above value occurs for

(55)

To find the minimum we still have to test the boundarycondition . In this case, we have that

(56)

(57)

and the Lagrangian functional to be minimized is

(58)

and taking derivatives with respect to we obtain:

(59)

and substituting in (56) we have that

(60)

Therefore, from (57), we have that

(61)

Since then in the above equation decreases withand as , then its minimum value occurs for ,

that is,

(62)

From (38) and (60), the above value occurs for .Then, from (43), (54) and (62), the minimum value of

is given by

(63)

Since , then one has

(64)

Therefore, is given by

(65)

B. Codebook Choice

The codebook used to perform the approximation of an im-pulse response is as described in (20). Then, each step ofthe MPGBP algorithm adds powers-of-two to the filter com-plexity (in fact this is just a lower bound because if we use thecommon-term reduction step in Subsection III-B the complexitytends to be lower). Therefore, if we perform steps of theMPGBP algorithm, the number of powers-of-two used to im-plement the filter is bounded by .If we want the approximation error after steps

to be bounded by , then, from (14), should be such that

, and thus . Usingthe value of from (24) this relation becomes

(66)

and thus an upper bound on number of power-of-two becomes

(67)

Above, denotes the smallest integer larger than or equal to. Substituting from (65) we have

(68)

The expression in (68) is a decreasing function of forand an increasing function of for . Therefore,

its minimum value of occurs for (in fact, eitherthe smallest integer larger than it or the largest integer smallerthan it).Fig. 2 shows a plot of as a function of for .

It can be seen that, although the minimum is at ,the derivative for is much smaller than the derivativeone for . This behavior suggests that the number ofpowers-of-two generated by the algorithm is not to be very sen-sitive to the value of provided that .This result tells us that, in the absence of the common-term

reduction step (Subsection III-B), the optimal -dimensionalcodebook should be made up of unit-norm codewords that arepermutations of

(69)

3840 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

Fig. 2. Number of powers-of-two necessary for a given approximation error infunction of for the dimension of equal to 100.

where denotes the largest integer smaller than or equal to. However, its performance for values of smaller thanshould be similar.

It is important to notice that, besides the common-term re-duction, there is another factor that contributes to the numberof powers-of-two to be smaller than the figures given by (68).For deducing (68) it was assumed that the angle between theresidual and the codebook vector closest to it is equal toin every iteration, a worst case scenario. Therefore, in practice,the actual number of powers-of-two employed in a filter designtends to be smaller than the one predicted by (68).

V. COMPLEXITY FIGURES

A non-ignored problem of greedy algorithms is their com-plexity. A priori, if the dictionary has no structure, in aMatchingPursuit iteration one has to perform a full-search in order to findthe codeword closest to a given input vector. In this case, theresidue has to be projected into all the dictionary elements tosearch for the most similar one. Assuming that the signal spacedimension is and that the dictionary has elements,inner products of dimension are necessary, leading tomultiplications and additions for computing all theinner products and comparisons for sorting theinner products values (in average).Due to this complexity, several strategies have been applied

for reducing the computational cost, in general at the expenseof memory. For example, in [25] a fast MP algorithm is dis-cussed. In it, instead of computing the inner products betweenthe residue and the dictionary elements at each MP iteration,inner products are computed just in the first MP iteration. Dueto the orthogonality between the residue and the selected code-word in each MP iteration, in the subsequent iterations the innerproducts of the residue and the codewords are updated usingthe inner products of the selected codeword and the other dic-tionary elements. In other works, based on the same reasoning(orthogonality between selected codeword and the residue) dic-tionaries with pre-defined structures favoring complexity reduc-tion have been employed. However, in our case, MPGBP usesin-loop quantization [32]. The coefficients are quantized to theclosest power of , which destroys the orthogonality betweenthe residue and the codeword. This prevents the use of the above

mentioned fast algorithms, which tends to contribute to an in-creased complexity.In our help comes the fact that, as seen in Section IV in the

SOPOT FIR design algorithm here presented, the dictionary el-ements employed have elements equal to andzeros. This reduces the problem of computing the inner prod-ucts to sorting the residue coordinates in function of the absolutevalues and adding the largest ones. This strategy automati-cally indicates the elements of the -dimensional codewordthat must be , the sign depending on the actual signs of theresidue coordinate values. Therefore, in average, each MPGBPiteration has a complexity of comparisons for sortingthe residue coordinates. This leads to a fast, efficient algorithm,suitable for hardware implementation.

Algorithm 5

Process to find a set of filters satisfying a prescribedspecification. Each candidate uses a bit-depth for coefficientsrepresentations of at most . defines the range oftested filter lengths. It ends up providing a set of possible filterimplementations of different lengths obtained for different’s, having different numbers of POTs and bit-depths. From the

generated designs one can choose a specific implementation.1. Compute the smallest filter length that allowsto design a Parks-McClellan filter satisfying the givenspecification.2. For to2.1 For to2.1.1 ;2.1.2 Set with the maximum number of MPGBPiterations2.1.3 Parks-McClellan design according to thespecifications and length2.1.4While , apply the MPGBP step toObs: This is composed of steps (2.(a)–2.(c)) in Algorithm2, that have fast implementation using algorithm III. Inaddition, since is either symmetric or antisymmetric,the residue length is “half” the filter length.2.1.4.1 If satisfies the given specification Then

Obs: This can be done by computing andcomparing it against the prescribed specifications2.1.4.1.1 Compute the bit-depth of and the

number of POTs used in ;Obs: The quantity of non-null bits (POTs) in the coefs.

of is , where for

simplicity denotes the -th bit of coefficient

.

2.1.4.1.2 Save as valid SOPOT filter;2.1.4.2 ;

The process for obtaining several SOPOT filters satisfying adesired specification is presented inAlgorithm 5. One starts withthe desired filter specifications, and using a Parks-McClellanfilter design approach one finds the minimum filter lengthrequired to satisfy the prescribed specifications. Given onetries filter designs with lengths within the range and

DA SILVA et al.: FIR FILTER DESIGN BASED ON SUCCESSIVE APPROXIMATION OF VECTORS 3841

, where defines a range of filterlengths to be tried. The designed filters are either symmetric oranti-symmetric. In this work we are interested not only in gen-erating filters but also in working out the effect of (number ofnon-null elements in the codewords of the dictionary). In orderto do so we set tests for ranging from 1 to ,in accordance with what has been previously discussed. Forthese ranges of and ( is the tested filter length and

specifies the dictionary used in the approximationprocess) one designs filters through the Parks-McClellan opti-mization and finds their SOPOTs approximations by means ofMPGPBwith dictionaries having ’s and zeros1.Each approximation is evaluated with respect to the prescribedspecifications. One ends with a set of filters that match theprescribed filter specifications. From this set of filters one maychose a particular design.For an MPGBP residue of length , using the dictio-

naries discussed here, the complexity of the required searchfor the best matching atom in an MPGBP iteration is onaverage operations for sorting the residue coeffi-cients—simple operations not requiring any multiplicationactually. The residue update requires operations, updatingthe largest coefficients of the residue. Let us assume thatat each MPGBP step the filter response is evaluated using anFFT of length , therefore step 2.1.4.1 in Algorithm 5)requires (comparisonsof the obtained response to the prescribed filter specification)operations. Counting the number of POTs used in requiresat most sums as explicitly exemplified of step 2.1.4.1.1in Algorithm 5, is the largest bit-depth allowed. Therefore,assuming at most MPGBP steps, and that the filters areeither symmetric or anti-symmetric, the number of operationsin Algorithm 5 is bounded by

(70)

VI. ALGORITHM BEHAVIOR

The proposed algorithm is capable of finding a good sum ofpowers-of-two representation of a filter prototype in fast way.Being so, the algorithm can be setup for searching for the bestfilter design by varying the following parameters:1) The filter length ;2) Dictionary specification/design, that is the quantity of non-null coordinates in a given dimension (which equatesthe filter length).;

3) The smallest power of two allowed—this impacts on thefilter input/output word-length/bit-depth;

4) The total quantity of adders employed in the obtained filter;5) The total quantity of MP iterations allowed, that restrictsthe complexity of the design process.

We present some results considering these different aspects.A low-pass filter is considered {whose} passband and stop-bandedges are located at and and themaximum allowed ripples are , i.e.,

1Note that the dictionaries are defined considering half the filter length or halfthe filter length plus 1 as dimension, due to their symmetric or anti-symmetricnature.

Fig. 3. Success of the filter design for the low-pass filter specifications and al-gorithm parameters range in Section VI—black satisfactory design obtained,white unsatisfactory design obtained.

Fig. 4. Iterations employed in the MPGBP design for the filter specificationsand algorithm parameters range in Section VI.

Fig. 5. Adders required for the filters designed usingMPGBP for thefilter spec-ifications and algorithm parameters range in Section VI.

dB. The filter prototype has been designed using the Parks-Mc-Clellan algorithm [24]. For these specifications, the minimumorder leading to a design attaining the specifications is deter-mined to be . Different filter are designed with dif-ferent dictionaries varying the length from 173 up to 250, withmaximum bit-depth of 32 bits, allowing at most 1000 adders,and at most 400 MP iterations. The dictionaries are composedof atoms having out of coefficients equal to , for dif-ferent values of , spanning from 6 to 24.Fig. 3 shows for which values of and a satisfactory

filter design is achieved for the specifications and value rangesdiscussed above. The black region indicates a successful filterdesign, i.e., the algorithm obtains a filter design satisfying thefilter specifications within the restrictions in parameter ranges.The white region indicates unsuccessful filter design. As canbe observed there is neighborhood between a region where nosatisfactory filter is obtained to the one in which satisfactoryfilters (i.e., that satisfy the specifications) are always obtained.The values of greater than 217 have been omitted from Fig. 3since they always lead to a successful design.To better understand the algorithm behavior we measure

some of the parameters of the designed filters in a region where

3842 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

Fig. 6. Smallest power-of-two required for the filters designed for the specifi-cations and MPGBP algorithm parameters range in Section VI. The higher themagnitude of the smallest power-of-two, the larger the word length necessaryfor implementing the filter.

Fig. 7. Magnitude responses for the prototype and the SA-designed FIR filters( and ) in Example 1. (Passband detail shown separately).

a satisfactory design is always obtained . Fig. 4shows how the quantity of MP iterations employed in thedesign vary with and . This figure shows that the value ofthat gives the smallest number of iterations is around .

This is in agreement with (65) and (66), from which one candeduce that the smallest number of iterations is obtained for

. We can notice that, given , less iterations tend tobe employed when is close to and smaller than . Fig. 5shows the variation of the number of adders (powers-of-two)of the designed filters with and . This corroborates theresult in presented in (68) and Fig. 2. It says that, for a givenapproximation error, the value of that minimizes the neces-sary number of powers-of-two, if no common-terms reductionis employed, is , but smaller values of yieldsimilar performance. Again, it is important to notice that thisresult might not correspond exactly to what is observed inthe experiment due to two main reasons. The first one is that,in our results, the common-term reduction step, as describedin Subsection III-B, is indeed employed. In addition, (68)supposes that the angle between the residual to be coded andthe closest codeword is in every iteration; this is a worstcase, and thus provides only a lower bound on performance.However, as can be seen from Fig. 5, (68) gives a good indi-cation of the algorithm’s performance. Fig. 6 concludes thisexperiment by evaluating the word length necessary for thefilter implementation. It shows the smallest power-of-two usedin the designed filter, since the smaller the magnitude of thesmallest power-of-two, the longer the word length.

Fig. 8. Magnitude responses for the prototype and the SA-designed FIR filters( and ) in Example 1.(Passband detail shown separately).

VII. DESIGN EXAMPLES

In this section we compare the performance of the proposedalgorithm against that of other methods in the literature for thedesign of SOPOTs filters. The filter prototypes for our design2

are obtained using the Parks-McClellan algorithm [24].1) Example 1—Low Pass Filter [4], [7], [14]: Consider

the specifications in [7], Example 1, also in [14] and [4]. Alow-pass filter is desired in which the passband and stop-bandedges are located at and and the max-imum allowed passband and stop-band ripples are

, corresponding to a 50 dB stop-band attenuation. Forthe Parks-McClellan prototype design, the minimum requiredorder is . Since the filters are symmetric it suf-fices to apply the MPGBP method at half of its coefficients.We employ the design method for orders up to andtherefore a search space of at most coefficients, andvalues of , searching for the minimal implemen-tation complexity (in terms of number of sum of powers-of-two/adders).The best results presented in [7] indicate that the specifica-

tions can be attained by using an order filter, with37 adders. They argue that the number of adders can be low-ered by using a common sub-expression elimination post-designmethod. In our results we do not consider the application of thissub-expression elimination method since, although it might re-duce the number of adders, it does not help to grasp the proposedmethod behavior. As can be seen from Table III, the proposedmethod consistently obtains designs not only comparable butwith lower complexities to the one in [7], using a minimum of28 adders for a filter of length 33, while for a filter of length29, 34 adders are required. The amplitude responses for twoMPGBP-designed filters, along with their passband details, canbe seen in Figs. 7 and 8. The impulse response of the particularfilter obtained with and is presented in Table IV.In order to compare some characteristics of the designed fil-

ters we employ the Normalized Peak Ripple (NPR) as in [14].For the specifications considered, in [14] the NPR areand dB for filters of lengths 33 and 35 respectivelyusing an average of 2 adders per coefficient. As can be seenin the last two columns in Table III the NPR for the filters ob-tained using the proposed approach are comparable to those in[14] using less Adders/Coeff (quantity of adders divided by the

2Software implementing the examples presented is available at: http://www.prosaico.uerj.br/MPGBP_filter_approx.tgz

DA SILVA et al.: FIR FILTER DESIGN BASED ON SUCCESSIVE APPROXIMATION OF VECTORS 3843

TABLE IIIIMPLEMENTATION PARAMETERS FOR SPECIFICATIONS OF EXAMPLE 1

TABLE IVCOEFFICIENTS FOR THE SA-DESIGNED FIR FILTER OF EXAMPLE 1.

Fig. 9. Normalized Peak Ripple (dB) in function of the quantity of adders percoefficient (Adders/Coeff) for the filter with specifications of Example 1, con-sidering filter lengths in .

quantity of coefficients of the SOPOT approximation). In orderto further investigate the NPR of the filters designed using theproposed approach, we present in Fig. 9 a graph of the best at-tainable NPR as a function of Adders/Coeff for filters designedaccordingly to the specifications in this example. For obtainingthese results we consider several approximations of the filters oflength for a given dictionary and a varying number of decom-position steps not looking for the smallest complexity (quantityof adders). This graph can be compared to the one in Fig. 2 of[14], there NPR starts around dB for an average numberof Adders/Coeff of 1, it is easy to see that our results present amuch better performance in terms of this metric.The proposed algorithm was implemented in Matlab, for the

48 filters ( and ) filterapproximations the Average CPU time/Filter Approximation is0.076274 (secs) running on a KUbuntu 64 bits OperatingSystem on a Intel(R) Core(TM) i7-3537U CPU 2.00 GHz. Ifone considers just the design approximations that satisfy thefilter requirements this time drops to 0.058603, whatconsiders in 44 of the 48 cases. It should be pointed thatthese times consider the evaluations by means of FFT

Fig. 10. Histogram of computing times ((at each MPGBP step) for obtaining filters satis-

fying the specifications in Example 1. The histogram on the top consider suc-cessful and unsuccessful approximation cases, while the graph on the bottomconsiders just the satisfactory ones. These are obtained running theMatlab algo-rithm on a KUbuntu 64 bits Operating System on a Intel(R) Core(TM) i7-3537UCPU 2.00 GHz.

computations (of length in this example) and theParks-McClellan prototypes generation. Fig. 10 tries tosummarize the required computing time by presenting thehistogram of the times elapsed in the filter approximationsconsidered. The graph at the top presents the histogram ofthe design times

for the 48 filters, while the graph at the bottompresents the same metric for the filters ( for varyingquantities of MPGPB decomposition steps) satisfying thefilter specifications.2) Example 2—Low Pass Filter [19], [35]: This example

employs the specification considered in [19], [35]. A low-passfilter is desired in whose passband and stop-band edges are lo-cated at and and the maximumallowed passband and stop-band ripples are dB and

dB. For the Parks-McClellan prototype design, theminimum required order is . We investigate designsusing the proposed approach for with dic-tionaries defined by , since .Our shortest filter satisfying the specifications is obtained for

using 84 adders and the one using the least number ofadders has length 51 using 73 adders. In [19] this filter is designusing 59 taps and 71 adders while in [35] a 60 tap filter is ob-tained using 87 adders. A sample of the parameters of satisfac-tory filters designed is presented in Table V ordered in functionof the quantity of adders employed in the filter implementation.The coefficients of the smaller complexity filter are presentedin Table VI. The frequency response of this filter is presented inFig. 11 and the coefficients of the original filter and the approxi-mation obtained, together with the approximation error, are pre-sented in Fig. 12.3) Example 3—Band Pass Filter: In order to evaluate the

proposed approach for non low-pass filters, we propose the de-sign of a band-pass filter whose specifications are:— passband edges: ,— stop-band edges:— and the maximum allowed passband and stop-band ripplesare , corresponding to a 60 dBstop-band attenuation.

3844 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

Fig. 11. Magnitude responses for the prototype and the SA-designed FIR filters( and ) in Example 2. (Passband detail shown separately).

Fig. 12. Prototype and the SA-designed FIR filters samples ( and) in Example 2. Approximation error (differences between samples of

the original and the approximated filter) is shown at the bottom.

TABLE VIMPLEMENTATION PARAMETERS FOR SPECIFICATIONS OF EXAMPLE 2

For the prototype design, the minimum required order leadingto a design attaining the specifications is determined to be

. We employ the design method for values ofand order up to , searching for the

minimum implementation complexity in terms of number ofsum of powers-of-two/adders.The implementation with lowest complexity is obtained by

setting and designing an order filter. For thisdesign, whose coefficients are given in Table VIII, the number ofrequired adders is 66, with the use of 12 different power-of-twofactors. The amplitude response for the resulting filter, alongwith its passband details, can be seen in Fig. 13.

TABLE VICOEFFICIENTS FOR THE SA-DESIGNED FIR FILTER OF EXAMPLE 2.

TABLE VIIIMPLEMENTATION PARAMETERS FOR SPECIFICATIONS OF EXAMPLE 3

Fig. 13. Magnitude responses for the prototype and the SA-designed FIR filters( and ) in Example 3. (Passband detail shown separately).

TABLE VIIICOEFFICIENTS FOR THE SA-DESIGNED FIR FILTER OF EXAMPLE 3.

DA SILVA et al.: FIR FILTER DESIGN BASED ON SUCCESSIVE APPROXIMATION OF VECTORS 3845

Fig. 14. Histogram of computing times ((at each MPGBP step) for obtaining filters satis-

fying the specifications in Example 3. The histogram on top consider successfuland unsuccessful approximation cases, while the graph at the bottom considersjust the satisfactory ones. These are obtained running the Matlab algorithm ona KUbuntu 64 bits Operating System on a Intel(R) Core(TM) i7-3537U CPU2.00 GHz.

Fig. 14 presents the times required for computing the filters inthis example. As the filters are longer than the ones in Example1 the computing times increase. In this case from the 110 de-signs tried, 100 provide an implementation satisfying the filterspecifications.

VIII. CONCLUSION

We have presented a novel method for the design of Sumof Powers of Two (SOPOT) FIR digital filters. The pre-sented method attempts to minimize the number of sums ofpowers-of-two employed in the filter implementation. It isbased on the decomposition of their impulse response in gener-alized bit-planes. The decomposition in generalized bitplanesemploys successive approximation of the impulse responseusing vectors from a dictionary. The dictionary used for filterdesign has only vectors with coefficients equal toor 0, and its elements are weighted by powers-of-two whenapproximating an impulse response. The decomposition ismade using the Matching Pursuits in Generalized Bitplanes(MPGBP) algorithm, that has been initially developed for videocompression applications. A novel variation of the MPGBPalgorithm has been proposed, and a Theorem giving bounds forits performance has been proved. This theorem states that theapproximation performance of the dictionaries in the MPGBPalgorithm is determined by , the largest angle betweenany vector in and its closest vector in the dictionary. In thispaper we also give an analytical expression for the value of

for the codebooks used, leading to the choice of optimalcodebooks for a given filter order. Unlike its counterparts,that resort to costly optimization techniques, the proposedalgorithm has extremely low computational complexity. Yet,experimental results show that the designs generated giveresults similar to the best ones in the literature. In addition,the proposed approach is much faster to design the rathersophisticate filters and requires no sophisticate setup of initialparameters. As seen, it can be employed to efficiently approx-imate different filters responses satisfying different passbandand stop-band specifications of any order, therefore one can

employ it to design filters matching different implementationcriteria as quantity of SOPOTs, bit-depth, length.

APPENDIX APROOF OF MPGBP CONVERGENCE

Theorem 1 (MPGBP Convergence): Given a dictionarysuch that if then necessarily and also ,then for and any input vector the MPGBP Algorithm(Algorithm 2) converges, and the error incurred in the approxi-mation is bounded by

(71)

(72)

(73)

Proof: If we can prove that such that the residualsat steps and follow the constraint

(74)

(75)

Since and , the above equation wouldimply (71). Therefore, in order to prove Theorem 1 it suffices toprove (74) and to show that the minimum value of for which(74) is valid is given by (72).The argument for this proof is based on Fig. 15, where we

have the following correspondences:a.1) The vector to be approximated at step , is given

by .a.2) The direction of the unit-norm vector , used

to approximate , is the one of .a.3) The coefficient of , that is, the magnitude of theapproximation of the residue at step , is given by thelength of .

a.4) The residue is given by .a.5) The position of the points and isbased on the fact that .

From the above correspondences, we see that in (74)above is equal to . Therefore, the loci of points such that

is constant is the circle of Appolonius [36] for which theratio of the distance to the focus to the distance to the focusis equal to . This circle is illustrated in Fig. 15 as the dashedcircumference passing through point .In order to carry out the proof of Theorem 1, we will use

some properties of Appolonian circles that are described withreference to Fig. 16, that shows circles of Appolonius of fociand and several ratios of the distance to to the distanceto . The following properties should be noted [36]:

b.1) For the circle of Appolonius has infinite radius,and is the bisector of segment .

3846 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

Fig. 15. Proof of Theorem 1.

b.2) For the circle of Appolonius is identical to point. As increases up to 1, its radius increases but it remains

contained in half-plane to the right of the bisector of .b.3) For the circle of Appolonius in contained in thehalf-plane to the left of the bisector of . As increases,its radius decreases, and for it shrinks to point .b.4) The circles of Appolonius for different values ofnever intersect.

If the point in Fig. 15 is the projection of onto , wehave that is equal to (note that this value is al-ways positive because is such that both and belong to, for every ). Therefore, if is approximated with coeffi-

cient , then, from (10), we have that

(76)

Referring to Fig. 15, this is equivalent to saying that is closerto than it is to both and . Equation (76) implies thatpoint is contained in the region between the lines and

. In addition, since the angle between and is smallerthan , we have that is contained within the quadrilateral

.If we define as the intersection of the bisector of segment, and segment , as , we have that. Since

(77)

and is inside therefore belongs to the half-plane tothe right of the bisector of segment . Thus, it belongs to anAppolonian circle of foci and that is in the half-plane to theright of the bisector of . From Fig. 16 this is equivalent to

Fig. 16. Circles of Appolonius of foci and and several ratios of thedistance to to the distance to .

saying that . Using (74) and (75) this provesthe convergence of the MPGBP algorithm, since it shows thatthe residue magnitude goes to zero as the number of stepsgoes to infinity.Now, in order to complete the proof, we have to show that

in (72) is an upper limit for the . We can do this based on twoproperties demonstrated above:

c.1) belongs to a circle of Appolonius to the right ofthe bisector of . This implies that the larger is, thelarger is the radius of the circle of Appolonius it belongs to(see Fig. 16).c.2) is contained in the quadrilateral .

Property c.2 implies that the Appolonius circle of largest radiussatisfying property c.1 will necessarily pass through one of thevertexes of . Therefore, an upper limit for will bethe maximum among the for the vertexes and .

DA SILVA et al.: FIR FILTER DESIGN BASED ON SUCCESSIVE APPROXIMATION OF VECTORS 3847

(78)

(79)

(80)

• (81)

(82)

(83)

Since and , we have that

(84)

Then, from (78), (81), (82) and (83),

(85)

This completes the proof.It is important to note that this proof can be trivially adapted

to the case where . Referring to Fig. 15, wejust need to change the common factor in the lengths of

and by . The rest of theproof, as well as (71) and (72), remain unchanged.

REFERENCES[1] H. Samueli, “An improved search algorithm for the design of multipli-

erless fir filters with powers-of-two coefficients,” IEEE Trans. CircuitsSyst., vol. 36, pp. 1044–1047, Jul. 1989.

[2] J. Evans, “Efficient fir filter architectures suitable for FPGA implemen-tation,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.,vol. 41, pp. 490–493, Jul. 1994.

[3] M. Macleod and A. Dempster, “Multiplierless fir filter design algo-rithms,” IEEE Signal Process. Lett., vol. 12, pp. 186–189, Mar. 2005.

[4] Y. C. Lim, R. Yang, D. Li, and J. Song, “Signed power-of-two termallocation scheme for the design of digital filters,” IEEE Trans. CircuitsSyst. II, Analog Digit. Signal Process., vol. 46, pp. 577–584,May 1999.

[5] K. Yeung and S. Chan, “Multiplier-less fir digital filters using pro-grammable sum-of-power-of-two (SOPOT) coefficients,” in Proc.IEEE Int. Conf. Field-Programmable Technol. (FPT), Dec. 2002, pp.78–84.

[6] Y. J. Yu, T. Saramaki, and Y. C. Lim, “An iterative method for opti-mizing fir filters synthesized using the two-stage frequency-responsemasking technique,” in Proc. 2003 Int. Symp. Circuits Syst. (ISCAS),May 2003, vol. 3, pp. III-874–III-877, vol. 3.

[7] J. Yli-Kaakinen and T. Saramaki, “A systematic algorithm for the de-sign of multiplierless FIR filters,” in Proc. IEEE Int. Symp. CircuitsSyst. (ISCAS), 2001, vol. 2, pp. 185–188, 6–9, vol. 2.

[8] R. Cemes and D. Ait-Boudaoud, “Genetic approach to design of multi-plierless fir filters,” Electron. Lett., vol. 29, pp. 2090–2091, Nov. 1993.

[9] Y. J. Yu and Y. C. Lim, “Design of linear phase FIR filters in subex-pression space using mixed integer linear programming,” IEEE Trans.Circuits Syst. I, Reg. Papers, vol. 54, pp. 2330–2338, Oct. 2007.

[10] J. Yli-Kaakinen and T. Saramaki, “A systematic algorithm for the de-sign of lattice wave digital filters with short-coefficient wordlength,”IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, pp. 1838–1851, Aug.2007.

[11] Y. C. Lim, S. Parker, and A. Constantinides, “Finite word length FIRfilter design using integer programming over a discrete coefficientspace,” IEEE Trans. Acoust., Speech, Signal Process., vol. 30, pp.661–664, Aug. 1982.

[12] Y. C. Lim and S. Parker, “FIR filter design over a discretepowers-of-two coefficient space,” IEEE Trans. Acoust., Speech,Signal Process., vol. 31, pp. 583–591, Jun. 1983.

[13] P. Gentili, F. Piazza, and A. Uncini, “Efficient genetic algorithm designfor power-of-two fir filters,” in Proc. Int. Conf. Acoust., Speech, SignalProcess. (ICASSP), May 1995, vol. 2, pp. 1268–1271, vol. 2.

[14] D. Li, Y. C. Lim, Y. Lian, and J. Song, “A polynomial-time algorithmfor designing fir filters with power-of-two coefficients,” IEEE Trans.Signal Process., vol. 50, pp. 1935–1941, Aug. 2002.

[15] F. Brglez, “Digital filter design with short word-length coefficients,”IEEE Trans. Circuits Syst., vol. 25, no. 12, pp. 1044–1050, 1978.

[16] D. M. Kodek, “Design of optimal finite wordlength fir digital filtersusing integer programming techniques,” IEEE Trans. Acoust., Speech,Signal Process., vol. 28, no. 3, pp. 304–308, 1980.

[17] D. Kodek and K. Steiglitz, “Filter-length word-length tradeoffs in firdigital filter design,” IEEE Trans. Acoust., Speech, Signal Process., vol.28, no. 6, pp. 739–744, 1980.

[18] D.M.Kodek, “Performance limit of finite wordlength fir digital filters,”IEEE Trans. Signal Process., vol. 53, no. 7, pp. 2462–2469, 2005.

[19] J. Skaf and S. P. Boyd, “Filter design with low complexity coeffi-cients,” IEEE Trans. Signal Process., vol. 56, no. 7, pp. 3162–3169,2008.

[20] Y. Cao, K. Wang, W. Pei, Y. Liu, and Y. Zhang, “Design of high-orderextrapolated impulse response fir filters with signed powers-of-two co-efficients,” Circuits, Syst,, Signal Process., vol. 30, no. 5, pp. 963–985,2011.

[21] R. Caetano, E. A. B. da Silva, and A. G. Ciancio, “Video coding usinggreedy decompositions on generalised bit-planes,” Electron. Lett., vol.38, pp. 507–508, 2002.

[22] S. Mallat, A Wavelet Tour of Signal Processing, Second Edition(Wavelet Analysis & Its Applications), 2nd ed. New York, NY, USA:Academic, 1999.

[23] K. Sayood, Introduction to Data Compression, 2nd ed. San Mateo,CA, USA: Morgan Kaufmann, 2000.

[24] P. S. R. Diniz, E. A. B. da Silva, and S. L. Netto, Digital SignalProcessing: System Analysis and Design, 2nd ed. Cambridge, U.K.:Cambridge Univ. Press, 2010.

[25] S. Mallat and Z. Zhang, “Matching pursuits with time-frequency dic-tionaries,” IEEE Trans. Signal Process., vol. 41, pp. 3397–3415, Dec.1993.

[26] W. Rudin, Real and Complex Analysis. New York, NY, USA: Mc-Graw-Hill, May 1986.

[27] F. Bergeaud and S. Mallat, “Matching pursuit of images,” in Proc.IEEE-SP Int. Symp. Time-Frequency Time-Scale Analysis, Oct. 1994,pp. 330–333.

[28] G. M. Davis, S. G. Mallat, and Z. Zhang, “Adaptive time-frequencydecompositions,” Opt. Eng., vol. 33, no. 7, pp. 2183–2191, 1994.

[29] G. Davis, S. Mallat, and M. Avellaneda, “Adaptive greedy approxima-tions,” Construct. Approx., vol. 13, no. 1, pp. 57–98, 1997.

[30] R. A. DeVore and V. N. Temlyakov, “Some remarks on greedy algo-rithms,” Adv. Comput. Math., vol. 5, no. 1, pp. 173–187, 1996.

3848 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 15, AUGUST 1, 2014

[31] R. Gribonval and P. Vandergheynst, “On the exponential convergenceof matching pursuits in quasi-incoherent dictionaries,” IEEE Trans. Inf.Theory, vol. 52, no. 1, pp. 255–261, 2006.

[32] L. Lovisolo, E. A. da Silva, and P. S. Diniz, “On the statisticsof matching pursuit angles,” Signal Process., vol. 90, no. 12, pp.3164–3184, 2010.

[33] L. Lovisolo and E. A. B. da Silva, “Uniform distribution of points on ahyper-sphere with applications to vector bit-plane encoding,” in Proc.Inst, Electr. Eng.—Vision, Image, Signal Process., 2001, vol. 148, pp.187–193.

[34] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, Nonlinear Program-ming: Theory and Algorithms, 2nd ed. New York, NY, USA: Wiley,1993.

[35] F. Xu, C. H. Chang, and C. C. Jong, “Design of low-complexity fir fil-ters based on signed-powers-of-two coefficients with reusable commonsubexpressions,” IEEE Trans. Comput.-Aided Design Integr. CircuitsSyst., vol. 26, no. 10, pp. 1898–1907, 2007.

[36] C. S. Ogilvy, Excursions in Geometry. London, U.K.: Oxford Uni-versity Press, 1969.

Eduardo A. B. da Silva (M’95–SM’05) was born inRio de Janeiro, Brazil. He received the ElectronicsEngineering degree from the Instituto Militar de En-genharia (IME), Brazil, in 1984, the M.Sc. degree inelectrical engineering from Universidade Federal doRio de Janeiro (COPPE/UFRJ) in 1990, and the Ph.D.degree in electronics from the University of Essex,England, in 1995.In 1987 and 1988, he was with the Department

of Electrical Engineering at Instituto Militar de En-genharia, Rio de Janeiro, Brazil. Since 1989, he has

been with the Department of Electronics Engineering (the undergraduate depart-ment), UFRJ. He has also been with the Department of Electrical Engineering(the graduate studies department), COPPE/UFRJ, since 1996. His teaching andresearch interests lie in the fields of digital signal and image and video pro-cessing. In these fields, he has published over 200 peer reviewed papers. Hewon the British Telecom Postgraduate Publication Prize in 1995, for his paperon aliasing cancellation in subband coding. He is also co-author of the bookDig-ital Signal Processing—System Analysis and Design, published by CambridgeUniversity Press, in 2002, which has also been translated to the Portuguese andChinese languages, whose second edition has been published in 2010.He has served as associate editor of the IEEE TRANSACTIONS ON CIRCUITS

AND SYSTEMS—PART I, in 2002, 2003, 2008 and 2009, of the IEEETRANSACTIONS ON CIRCUITS AND SYSTEMS—PART II in 2006 and 2007, andof Multidimensional, Systems and Signal Processing, Springer, since 2006. Hehas been a Distinguished Lecturer of the IEEE Circuits and Systems Societyin 2003 and 2004. He was Technical Program Co-Chair of ISCAS 2011. Heis a senior member of the IEEE, of the Brazilian Telecommunication Society,and also a member of the Brazilian Society of Television Engineering. Hisresearch interests lie in the fields of signal and image processing, signalcompression, digital TV and pattern recognition, together with its applicationsto telecommunications and the oil and gas industry.

Lisandro Lovisolo was born in Neuquen, Argentina.He received the Electronics Engineering degreefrom the Universidade Federal do Rio de Janeiro, in1999, the M.Sc. degree in electrical engineering in2001, and the D.Sc. degree in electrical engineeringboth from Universidade Federal do Rio de Janeiro(COPPE/UFRJ). Since 2003 he has been with theDepartment of Electronics and Communications En-gineering (the undergraduate department), UERJ. Hehas also been with the graduate studies in ElectronicsProgram, since 2008. His research interests lie in

the fields of digital signal and image processing, communications, machinelearning, and computer science.

Alessandro J. S. Dutra (M’05) received the Elec-tronic Engineering degree from the Instituto Militarde Engenharia (IME), Rio de Janeiro, Brazil, in1993, the M.Sc. degree in electrical engineeringfrom Universidade Federal do Rio de Janeiro(COPPE/UFRJ) in 1999, and the Ph.D. degree inelectrical engineering from Rensselaer PolytechnicInstitute (RPI), Troy, NY, in 2010.He was a Post-Doctoral Researcher in the Depart-

ment of Electrical Engineering, COPPE/UFRJ. Hewas part of the team that worked in the development

of the Brazilian Digital Television System. Since 2012, he has been with GEGlobal Research, Brazil Technology Center, where he is a member of theSmart Systems research group. His research interests lie in the fields of datacompression, digital signal and image processing, and machine learning forsignal processing.

Paulo S. R. Diniz (F’00) was born in Niterói, Brazil.He received the electronics eng. degree (Cum Laude)from the Federal University of Rio de Janeiro (UFRJ)in 1978, the M.Sc. degree from COPPE/UFRJ in1981, and the Ph.D. from Concordia University,Montreal, P.Q., Canada, in 1984, all in electricalengineering.Since 1979 he has been with the Department of

Electronic Engineering (the undergraduate depart-ment) UFRJ. He has also been with the Programof Electrical Engineering (the graduate studies

department), COPPE/UFRJ, since 1984, where he is presently a Professor. Heserved as Undergraduate Course Coordinator and as Chairman of the GraduateDepartment. He has received the Rio de Janeiro State Scientist award, from theGovernor of Rio de Janeiro state.From January 1991 to July 1992, he was a visiting Research Associate in

the Department of Electrical and Computer Engineering of University of Vic-toria, Victoria, BC, Canada. He also held a Docent position at Helsinki Univer-sity of Technology. From January 2002 to June 2002, he was a Melchor ChairProfessor in the Department of Electrical Engineering of University of NotreDame, Notre Dame, IN, USA. His teaching and research interests are in analogand digital signal processing, adaptive signal processing, digital communica-tions, wireless communications, multirate systems, stochastic processes, andelectronic circuits. He has published several refereed papers in some of theseareas and wrote the text books Adaptive Filtering: Algorithms and PracticalImplementation, 4th ed., Springer, NY, 2013, and Digital Signal Processing:System Analysis and Design, 2nd ed., Cambridge University Press, Cambridge,U.K., 2010 (with E. A. B. da Silva and S. L. Netto), and the monograph BlockTransceivers: OFDM and Beyond, Morgan & Claypool, New York, NY, 2012(W. A. Martins, and M. V. S. Lima).He has served as the Technical Program Chair of the 1995 MWSCAS held

in Rio de Janeiro, Brazil. He was the General co-Chair of the IEEE ISCAS2011, and Technical Program Co-Chair of the IEEE SPAWC 2008. He has alsoserved Vice President for region 9 of the IEEE Circuits and Systems Societyand as Chairman of the DSP technical committee of the same Society. He isalso a Fellow of IEEE. He has served as associate editor for the followingjournals: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND

DIGITAL SIGNAL PROCESSING from 1996 to 1999, IEEE TRANSACTIONS ONSIGNAL PROCESSING from 1999 to 2002, and the Circuits, Systems and SignalProcessing Journal from 1998 to 2002. He was a distinguished lecturer of theIEEE Circuits and Systems Society for the year 2000 to 2001. In 2004 he servedas distinguished lecturer of the IEEE Signal Processing Society and receivedthe 2004 Education Award of the IEEE Circuits and Systems Society. He alsoholds some best-paper awards from conferences and from an IEEE journal. Hehas also received the 2014 Charles Desoer Technical Achievement Award fromthe IEEE Circuits and Systems Society.