8/3/2019 FIR Filter Using Distributed Arithmetic v3

FIR Filter Using Distributed Arithmetic


Introduction

Distributed Arithmetic (DA) is a different approach to implementing digital filters. The basic idea is to replace all multiplications and additions with a table and a shifter-accumulator. DA relies on the fact that the filter coefficients are known, so multiplying c[n]x[n] becomes a multiplication by a constant. This is an important difference and a prerequisite for a DA design.

Sysgen has a built-in DA token, which we will not use to implement our design, because we will learn how to integrate VHDL code into System Generator by using black boxes and co-simulation tokens.

Finally, we will download the DA design to the VirtexII Pro board and use it to run

hardware verification.

Distributed Arithmetic

Distributed Arithmetic (DA) can be used to compute a sum of products. Many DSP algorithms, such as convolution and correlation, are formulated in a sum of products (SOP) fashion. Consider the following sum of products:

\[ y = \langle c, x \rangle = \sum_{n=0}^{N-1} c[n]\,x[n] = c[0]x[0] + c[1]x[1] + \cdots + c[N-1]x[N-1] \]

Further assume that the coefficients c[n] are known values and that the variable x[n] can be represented by

\[ x[n] = \sum_{b=0}^{B-1} x_b[n]\,2^b \quad \text{with } x_b[n] \in \{0, 1\}, \]

where x_b[n] represents the b-th bit position of the number's binary representation. The SOP can then be written as:

\[ y = \langle c, x \rangle = \sum_{n=0}^{N-1} c[n] \sum_{b=0}^{B-1} x_b[n]\,2^b \]

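Expanding the summations yields:

\[
\begin{aligned}
y = \langle c, x \rangle ={} & c[0]\left( x_{B-1}[0]\,2^{B-1} + x_{B-2}[0]\,2^{B-2} + \cdots + x_0[0]\,2^0 \right) \\
 & + c[1]\left( x_{B-1}[1]\,2^{B-1} + x_{B-2}[1]\,2^{B-2} + \cdots + x_0[1]\,2^0 \right) \\
 & + \cdots \\
 & + c[N-1]\left( x_{B-1}[N-1]\,2^{B-1} + x_{B-2}[N-1]\,2^{B-2} + \cdots + x_0[N-1]\,2^0 \right)
\end{aligned}
\]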

Redistributing the terms we have:

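\[
\begin{aligned}
y = \langle c, x \rangle ={} & \left( c[0]x_{B-1}[0] + c[1]x_{B-1}[1] + \cdots + c[N-1]x_{B-1}[N-1] \right) 2^{B-1} \\
 & + \left( c[0]x_{B-2}[0] + c[1]x_{B-2}[1] + \cdots + c[N-1]x_{B-2}[N-1] \right) 2^{B-2} \\
 & + \cdots \\
 & + \left( c[0]x_0[0] + c[1]x_0[1] + \cdots + c[N-1]x_0[N-1] \right) 2^0
\end{aligned}
\]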

In more compact form:

\[ y = \langle c, x \rangle = \sum_{b=0}^{B-1} 2^b \sum_{n=0}^{N-1} c[n]\,x_b[n] \]
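As a small numerical check of the compact form, here is a worked case with made-up values (they are illustrative and not taken from the lab): N = 2, B = 3, c = [3, 5], x[0] = 5 = 101 in binary, x[1] = 6 = 110 in binary.

\[
\begin{aligned}
b = 0: &\quad c[0]x_0[0] + c[1]x_0[1] = 3 \cdot 1 + 5 \cdot 0 = 3 \\
b = 1: &\quad c[0]x_1[0] + c[1]x_1[1] = 3 \cdot 0 + 5 \cdot 1 = 5 \\
b = 2: &\quad c[0]x_2[0] + c[1]x_2[1] = 3 \cdot 1 + 5 \cdot 1 = 8 \\
y &= 3 \cdot 2^0 + 5 \cdot 2^1 + 8 \cdot 2^2 = 45 = 3 \cdot 5 + 5 \cdot 6
\end{aligned}
\]

Each inner sum is one of only four possible values (0, 3, 5 or 8), which is what makes a table lookup possible.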

The key is to realize that the second summation can be mapped to a Look-Up Table (LUT). The coefficients c[n] are known and the x_b[n] values are either 1 or 0, so each inner SOP is just a combination of the c[n] values, for which a truth table can be constructed. Suppose we have:

\[ \left( c[0]x_{B-2}[0] + c[1]x_{B-2}[1] + \cdots + c[N-1]x_{B-2}[N-1] \right) 2^{B-2} \]

Each x_{B-2} digit belongs to a different x[n] variable; nevertheless, we can form an N-bit word that can take 2^N values. For example, with N = 7 one of the possible outcomes is:

\[ \left( c[0] \cdot 0 + c[1] \cdot 1 + c[2] \cdot 1 + c[3] \cdot 0 + c[4] \cdot 1 + c[5] \cdot 0 + c[6] \cdot 0 \right) 2^{B-2} = \left( c[1] + c[2] + c[4] \right) 2^{B-2} \]

Multiplication by a power of 2 is no more than a bit shift, so all we need to do is slice and concatenate the bits of the different x[n] to form the address of a table, which can be built in advance because the c[n] are all known.
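The table-plus-shift idea can be sketched in a few lines of Python. This is a minimal software model for unsigned inputs, not the lab's VHDL; the function names `build_da_lut` and `da_sop_unsigned` and the sample values are made up for illustration.

```python
def build_da_lut(coeffs):
    """Precompute the sum of c[n] for every N-bit address.

    Bit n of the address selects whether c[n] is included in the sum.
    """
    n_taps = len(coeffs)
    return [
        sum(c for n, c in enumerate(coeffs) if (address >> n) & 1)
        for address in range(1 << n_taps)
    ]


def da_sop_unsigned(coeffs, samples, n_bits):
    """Sum of products computed with one table lookup and one shift per bit."""
    lut = build_da_lut(coeffs)
    acc = 0
    for b in range(n_bits):
        # Slice bit b out of every sample and concatenate into an N-bit address.
        address = 0
        for n, x in enumerate(samples):
            address |= ((x >> b) & 1) << n
        # Multiplying by 2^b is a left shift of the table output.
        acc += lut[address] << b
    return acc


# Hypothetical 7-tap example with 8-bit unsigned samples.
coeffs = [3, -5, 7, 2, -1, 4, 6]
samples = [12, 200, 5, 77, 255, 0, 131]
direct = sum(c * x for c, x in zip(coeffs, samples))
print(da_sop_unsigned(coeffs, samples, 8) == direct)
```

The loop performs no multiplications: each of the B iterations costs one lookup, one shift and one addition, at the price of a table with 2^N entries.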

What is left is to show how to deal with signed implementations of DA. A minor modification needs to be introduced when working with signed two's complement numbers. In two's complement, the MSB determines the sign of the number, so it carries a negative weight. We therefore use the following B-bit representation:

\[ x[n] = -2^{B-1}\,x_{B-1}[n] + \sum_{b=0}^{B-2} x_b[n]\,2^b \]

Then the output y is given by:

\[ y = -2^{B-1} \sum_{n=0}^{N-1} c[n]\,x_{B-1}[n] + \sum_{b=0}^{B-2} 2^b \sum_{n=0}^{N-1} c[n]\,x_b[n] \]
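The signed case differs from the unsigned one only in the last step: the table output addressed by the sign bits is subtracted instead of added, which is the role of the +/- input of the accumulator. A minimal Python sketch follows, assuming every sample fits in B-bit two's complement; the name `da_sop_signed` and the test values are illustrative.

```python
def da_sop_signed(coeffs, samples, n_bits):
    """Distributed-arithmetic sum of products for two's complement samples."""
    # Table of partial sums: bit n of the address selects c[n].
    lut = [
        sum(c for n, c in enumerate(coeffs) if (addr >> n) & 1)
        for addr in range(1 << len(coeffs))
    ]
    mask = (1 << n_bits) - 1
    acc = 0
    for b in range(n_bits):
        addr = 0
        for n, x in enumerate(samples):
            # (x & mask) is the B-bit two's complement pattern of x.
            addr |= (((x & mask) >> b) & 1) << n
        term = lut[addr] << b
        if b == n_bits - 1:
            acc -= term  # the MSB carries weight -2^(B-1)
        else:
            acc += term
    return acc


# Hypothetical 3-tap example with 4-bit signed samples in [-8, 7].
coeffs = [3, -5, 7]
samples = [-8, 7, -1]
print(da_sop_signed(coeffs, samples, 4) == sum(c * x for c, x in zip(coeffs, samples)))
```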

Finally, a block diagram for the DA implementation of a FIR filter is shown in figure 1.


[Figure 1: block diagram with three stages. Bit shift register: one row per input sample x[0], x[1], ..., x[N-1], each holding bits x_0 ... x_{B-1}. Arithmetic table: a LUT addressed by one bit from each row. Scaling accumulator: a +/- adder and a register.]

Fig. 1 DA Block Diagram

SysGen Implementation

We will use VHDL to implement all major parts of the design. According to fig. 1, we

need:

Register

SOP Table

Pre-Adder (LUT Adder)

Scaling Accumulator

Download lab2.zip and uncompress it in C:\DSP_Spring07\Lab2. The following files should appear:

dsp_fir.mdl, register_1to7.mdl, filter_lut_a.mdl, filter_lut_b.mdl, lut_adder.mdl

regne_config.m, filter_lut_a_config.m, filter_lut_b_config.m, lut_adder_config.m

regne.vhd, filter_lut_a.vhd, filter_lut_b.vhd, lut_adder.vhd, scaling_accumulator.vhd

Black Boxes for HDL Co-Simulation

System Generator libraries provide high- and low-level functions for building systems. However, there may be instances when you need to build blocks using HDL modules. These HDL modules need to be simulated along with other SysGen blocks. The black box block provides an interface between the Simulink model and the HDL source code.

An HDL component associated with a black box must adhere to the following System Generator requirements and conventions:


The entity name must not collide with any other entity name in the design.

Bidirectional ports are not allowed on the top-level black box entity.

For Verilog black boxes, the module and port names must be lower case and must follow standard VHDL naming conventions.

Any port that is not a clock or clock enable must be of type std_logic_vector.

Any port that is a clock or clock enable must be of type std_logic.

Clocks and clock enables must appear as pairs.

Each clock name must contain the substring CLK, and each clock enable name must contain the substring CE.

A black box must describe its interface through a MATLAB M-function. The configuration M-function is generated automatically by System Generator, and some editing is needed to specify the characteristics of the black box entity.

The M-function:

Names the top-level entity of the HDL component that should be associated with the black box.

Selects the language, i.e. VHDL or Verilog.

Describes the ports, including type, direction, bit width, binary point position, name, and sample rate.

Defines any generics required by the black box HDL.

Specifies the black box HDL and other files that are associated with the block.

Defines the clocks and clock enables for the block.

Declares whether the HDL has any combinational feed-through paths.

Let's proceed to create a black box for our Scaling Accumulator VHDL code.

1. Set your working directory to C:\DSP_Spring07\Lab2, open Simulink, create a new model, and name it scaling_accumulator.mdl.

2. Add the ModelSim block from Xilinx Blockset → Tools.

3. Add the Black Box block from Xilinx Blockset → Basic Elements.

4. The Configuration Wizard detects HDL files and opens a new window. Select the scaling_accumulator.vhd file, which contains the entity description (figure 2).


Fig. 2 Select the VHDL file that contains the black box description

5. Click the OK box of the Wizard Notice. The configuration M-file will open.

6. Configure the input ports by editing the commented parts of the configuration M-file.

Replace the comments in:

if (this_block.inputTypesKnown)
  % do input type checking, dynamic output type and generic setup in this code block.

  % (!) Port 'LUT0' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
  % (!) Port 'ALUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
  % (!) Port 'BLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
  % (!) Port 'CLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
  % (!) Port 'DLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
  % (!) Port 'LUT5' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
  % (!) Port 'LUT6' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
  % (!) Port 'LUT7' appeared to have dynamic type in the HDL -- please add type checking as appropriate;

  % (!) Port 'Filter_out' appeared to have dynamic type in the HDL
  % --- you must add an appropriate type setting for this port
end  % if(inputTypesKnown)


With the following piece of code:

if (this_block.inputTypesKnown)
  % do input type checking, dynamic output type and generic setup in this code block.

  % Each LUT input is passed to the HDL as a std_logic_vector and must be 23 bits wide.
  this_block.port('LUT0').useHDLVector(true);
  if (this_block.port('LUT0').width ~= 23)
    this_block.setError('Input data type for "LUT0" must have width of 23.');
  end

  this_block.port('ALUT').useHDLVector(true);
  if (this_block.port('ALUT').width ~= 23)
    this_block.setError('Input data type for "ALUT" must have width of 23.');
  end

  this_block.port('BLUT').useHDLVector(true);
  if (this_block.port('BLUT').width ~= 23)
    this_block.setError('Input data type for "BLUT" must have width of 23.');
  end

  this_block.port('CLUT').useHDLVector(true);
  if (this_block.port('CLUT').width ~= 23)
    this_block.setError('Input data type for "CLUT" must have width of 23.');
  end

  this_block.port('DLUT').useHDLVector(true);
  if (this_block.port('DLUT').width ~= 23)
    this_block.setError('Input data type for "DLUT" must have width of 23.');
  end

  this_block.port('LUT5').useHDLVector(true);
  if (this_block.port('LUT5').width ~= 23)
    this_block.setError('Input data type for "LUT5" must have width of 23.');
  end

end  % if(inputTypesKnown)

7. To configure the output port, we need to specify the output bit width, binary point

position, signed or unsigned data, and generic values. Add the following code after the

previous block:

% (!) Port 'Filter_out' appeared to have dynamic type in the HDL
Filter_out_port = this_block.port('Filter_out');
input_bitwidth = this_block.port('LUT0').width;

% Set up the fixed parameters of the filter
% Calculate the width of the output based on worst case values for data
% and coefficients
output_bitwidth = input_bitwidth + 7;

% Set the output data type
Filter_out_port.makeSigned;
Filter_out_port.width = output_bitwidth;
Filter_out_port.binpt = 25;

% (!) Customize the following generic settings as appropriate. If any settings depend
% on input types, make the settings in the "inputTypesKnown" code block.
this_block.addGeneric('Nb_in', this_block.port('LUT0').width);
this_block.addGeneric('Nb_out', this_block.port('Filter_out').width);


% --- you must add an appropriate type setting for this port

8. Finally, delete the following lines:

% (!) Customize the following generic settings as appropriate. If any settings depend
% on input types, make the settings in the "inputTypesKnown" code block.
this_block.addGeneric('Nb_out','integer','30');
this_block.addGeneric('Nb_in','integer','23');

9. Save and close the configuration M-file. In the Simulink model (mdl file), click on the name Black Box and change it to Scaling Accumulator.

10. Open the dsp_fir.mdl file. Copy and paste the newly created Scaling Accumulator into it. Open the register_1to7.mdl file and copy the Register block to dsp_fir.mdl.

11. Save and close the dsp_fir.mdl, register_1to7.mdl, and scaling_accumulator.mdl files. Close and relaunch MATLAB and Simulink to verify that the Scaling Accumulator block is under the DSP Spring 07 Library.

12. On the MATLAB menu, click on File → Set Path and add the folder C:\DSP_Spring07\Lab2. Click on Save and Close.

Note: You may receive a warning message if you do not have write permission to update the MATLAB installation directory. Click Yes to save the file in your working directory (in this case, lab2). If you close MATLAB you will need to set the path again.

Now, when you load Simulink you should be able to see five new blocks under the DSP Spring 07 Library: Register, LUT A, LUT B, LUT Adder, and Scaling Accumulator.


Fig.3.a Scaling Accumulator block and (3.b) dsp_fir.mdl Simulink model.

DA FIR Design Implementation and Code Generation

The FIR filter to be implemented in DA is the sixth-order FIR filter from the previous laboratory. Figure 4 presents a more detailed block diagram of the DA implementation. Notice:

a) The input of the filter is an N-bit binary number formed from the b-th bit position of all N input numbers. Therefore, we need an entity that performs the slicing and concatenation of the binary positions to form the new input number.

b) The reason for two LUTs in the design is that LUTs are 4-input blocks and our filter has 7 coefficients, so one LUT will have a 4-bit-wide input and the other a 3-bit-wide input.
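Splitting the table works because the partial sum is additive over the address bits: the 7-bit address can be cut into a 4-bit part and a 3-bit part, each looked up separately, and the two results added in the pre-adder. The Python sketch below checks this against a single 128-entry table. The coefficient values are placeholders rather than the lab's coefficients, and the assignment of the first four taps to the 4-input table is an assumption.

```python
def build_lut(coeffs):
    """Partial-sum table: bit n of the address selects coeffs[n]."""
    return [
        sum(c for n, c in enumerate(coeffs) if (addr >> n) & 1)
        for addr in range(1 << len(coeffs))
    ]


coeffs = [1, -2, 3, 4, -5, 6, 7]  # placeholder values for a 7-tap filter

lut_a = build_lut(coeffs[:4])  # 4-bit address, 16 entries
lut_b = build_lut(coeffs[4:])  # 3-bit address, 8 entries
full = build_lut(coeffs)       # single table, 128 entries


def split_lookup(address):
    """Look up both small tables and add the results (the pre-adder)."""
    low = address & 0xF          # bits 0..3 address LUT A
    high = (address >> 4) & 0x7  # bits 4..6 address LUT B
    return lut_a[low] + lut_b[high]


print(all(split_lookup(a) == full[a] for a in range(128)))
```

The two small tables hold 16 + 8 = 24 entries instead of 128, at the cost of one extra adder.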

The BitBasher block performs slicing, concatenation and augmentation of the inputs attached to the block. The block may have up to four output ports, and the number of outputs is equal to the number of expressions specified in the BitBasher Expression dialog. The advantage of this block over others is that it costs nothing to implement in hardware.


[Figure 4: DA 7-Tap FIR Filter. Inputs X0 through X6 feed two Partial Product ROMs, whose outputs are summed in a Pre-Adder and passed to a +/- accumulator with a Z^-1 register.]

Fig. 4 Block Diagram of a 7-Tap DA FIR filter

Filter specifications:

Low Pass Filter. Signed 8 bit input number. Represented in Fix 8_4.

Sampling frequency of 1kHz.

Coefficients quantized to 23 bits. Represented in Fix 23_21.

Full precision Adders and Mult. blocks.

Filter output of 30 bits. Represented in Fix 30_25.
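The formats above fit together as follows: Fix W_P denotes a W-bit signed word with P fractional bits, so multiplying a Fix 8_4 sample by a Fix 23_21 coefficient gives a result with 4 + 21 = 25 fractional bits, and the configuration file sets the output width to the 23-bit table width plus 7. The Python sketch below illustrates this arithmetic on raw integers. The helper names, the coefficient values and the sample values are hypothetical, and whether 30 bits suffice in general depends on the actual coefficients.

```python
def to_fix(value, n_bits, bin_pt):
    """Quantize a real value to a signed fixed-point integer.

    Rounds to the nearest step with Python's round(), then saturates.
    """
    raw = round(value * (1 << bin_pt))
    lo = -(1 << (n_bits - 1))
    hi = (1 << (n_bits - 1)) - 1
    return max(lo, min(hi, raw))


def from_fix(raw, bin_pt):
    """Convert a raw fixed-point integer back to a real value."""
    return raw / (1 << bin_pt)


coeffs = [0.02, 0.1, 0.23, 0.3, 0.23, 0.1, 0.02]     # hypothetical low-pass taps
samples = [1.5, -2.25, 0.75, 3.0, -4.0, 0.5, 7.9375]  # exactly representable in Fix 8_4

c_raw = [to_fix(c, 23, 21) for c in coeffs]
x_raw = [to_fix(x, 8, 4) for x in samples]

# Integer sum of products; the binary points add.
y_raw = sum(c * x for c, x in zip(c_raw, x_raw))
out_bin_pt = 21 + 4  # 25 fractional bits
out_bits = 23 + 7    # 30 bits, as in the configuration file

fits = -(1 << (out_bits - 1)) <= y_raw < (1 << (out_bits - 1))
y = from_fix(y_raw, out_bin_pt)
y_float = sum(c * x for c, x in zip(coeffs, samples))
print(fits, abs(y - y_float) < 1e-4)
```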

Filter implementation:

1. Set your working directory to C:\DSP_Spring07\Lab2\ and open a new model.

2. Add the Register block from DSP Spring 07 Library.

3. Add and connect two Sine Wave inputs, a Sum block and the Xilinx Gateway In block to the Register block. The input to the register is the sum of the two waves. Double-click on the Gateway In block and set the output type to Signed, Number of Bits to 8, Binary Point to 4, and Sample Period to 0.001 sec.


4. Set the frequency of one sine wave to 2*pi*5 rad/sec and the other to 2*pi*300 rad/sec.

5. From Simulink → Commonly Used Blocks add a Constant block. Set the constant value to 1. Add a Xilinx Gateway In and set the output type to Unsigned, Number of Bits to 1, Binary Point to 0, and Sample Period to 0.001. So far the model should look like figure 4.

6. From Xilinx Blockset → Basic Elements add the BitBasher block. Double-click on the BitBasher block and enter the following expressions:

outLSB1={d[0],c[0],b[0],a[0]};

outMSB1={g[0],f[0],e[0]};

outLSB2={d[1],c[1],b[1],a[1]};

outMSB2={g[1],f[1],e[1]};

This block slices and concatenates the first two LSB positions of the 8-bit register

outputs. Add three more BitBasher blocks to the design and configure them accordingly.

Hint: Copy and paste the previous block expression and change the bit positions

7. Connect the BitBasher block to the Register. Click on the Register, then hold down the Ctrl key while left-clicking on the BitBasher block. When connecting the remaining three blocks be sure to connect Reg1 to a, Reg2 to b, Reg3 to c, Reg4 to d, Reg5 to e, Reg6 to f, and Reg7 to g.

Fig.4 Input blocks to the Register block in the DA implementation.


8. Add the filter_lut_a and filter_lut_b blocks from the DSP Spring 07 Library and connect them to the BitBasher blocks. Also add the lut_adder block to the model and connect it to the outputs of the filter_lut_a and filter_lut_b blocks. Your design should look like figure 5.

9. When designs grow large, it is good practice to add registers in order to reduce the propagation delay between the components of the design. From Xilinx Blockset → Basic Elements add Delay blocks (registers) and connect them to the outputs of all the lut_adder blocks.

10. Add the Scaling Accumulator block from the DSP Spring 07 Library and connect it to the outputs of the delay blocks. Add a Gateway Out block and a Scope block. Set the Number of Axes to 2 in the Scope properties and connect one axis to the Scaling Accumulator output and the other to the output of the summation block (figure 6).

11. Add the System Generator block and set the Simulink system period to 0.001. Verify the following settings:

Compilation: HDL Netlist

Part: Virtex2p xc2vp30-7ff896

Target directory: ./netlist

Synthesis Tool: XST

12. Set the simulation time to 2 seconds. Double-click on each of the filter_lut_a, filter_lut_b, and lut_adder blocks, as well as on the Scaling_accumulator block, and verify that ISE Simulator is selected in the Simulation mode field.

13. Run the Simulation

14. Double Click on the System Generator Block and click on the Generate Box.
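The BitBasher expressions from step 6 can be mimicked in software to see which table addresses the design produces. The Python sketch below assumes, as in Verilog-style concatenation, that the leftmost element of {d[k],c[k],b[k],a[k]} is the most significant bit; the function name `bit_slice_addresses` and the register values are illustrative.

```python
def bit_of(value, k):
    """Return bit k of an integer."""
    return (value >> k) & 1


def bit_slice_addresses(regs, n_bits=8):
    """Model the slicing and concatenation done by the BitBasher blocks.

    regs holds the seven register outputs a..g (Reg1..Reg7). For every bit
    position k the result contains the pair (outLSB, outMSB), where
    outLSB = {d[k], c[k], b[k], a[k]} and outMSB = {g[k], f[k], e[k]}.
    """
    mask = (1 << n_bits) - 1
    a, b, c, d, e, f, g = [r & mask for r in regs]
    addresses = []
    for k in range(n_bits):
        lsb = (bit_of(d, k) << 3) | (bit_of(c, k) << 2) | (bit_of(b, k) << 1) | bit_of(a, k)
        msb = (bit_of(g, k) << 2) | (bit_of(f, k) << 1) | bit_of(e, k)
        addresses.append((lsb, msb))
    return addresses


# Hypothetical register contents (8-bit values).
regs = [0b00000001, 0b00000010, 0, 0b00000001, 0b00000011, 0, 0b00000010]
for k, (lsb, msb) in enumerate(bit_slice_addresses(regs)):
    print(k, format(lsb, "04b"), format(msb, "03b"))
```

Each pair corresponds to one address for the 4-input table and one for the 3-input table, produced for all eight bit positions at once by the four BitBasher blocks.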


Fig. 5 BitBasher, LUTs and LUTs adder in the DA FIR implementation

Fig. 6 Final connections for the DA Filter


Verifying through Hardware Co-Simulation

In this final and simple step we will create a hardware co-simulation block and perform both hardware and software HDL co-simulation.

1. Create a subsystem of the DA Filter. Select all components except the Gateway In/Out blocks, input sources, System Generator token and Scope, then right-click and select Create Subsystem. Rearrange the input and output connections so the design looks like figure 7.

Note: To edit the names of inputs and output of the subsystem double click over the block

and edit the names accordingly.

Fig. 7 Compact form of the DA filter

2. Save the model as da_fir_filter_hwcosim.mdl.

3. Double-click the System Generator block, click Compilation and select Hardware Co-Simulation → xup → xup_virtex_ii_pro.

4. Enter ./netlis_hw as the Target Directory and click Apply to accept the changes.

5. Click Generate and wait for the compilation process to finish.

6. Set the Number of Axes to 3 in the Scope properties. Copy the da_fir_filter_hwcosim block to the design and connect it to the input and output so it looks like figure 8.


Fig. 8 Hardware co-simulation

7. Right-click the DA Filter subsystem, select Block Properties and type 10 in the Priority

field.

8. Right-click the da_fir_filter_hwcosim block, select Block Properties and type 0 in the Priority field.

9. Save the model.

10. Connect the power cable and the USB cable of the hardware board. Turn on the power. Wait for Windows to finish the New Hardware installation.

11. Double-click on the hardware model and set the cable to Platform USB.

12. Click the Run button in the Simulink window to run the simulation.

IIR Implementation

Implementing IIR filters is not that different from implementing FIR filters. In fact, there are more structures dedicated to IIR than to FIR filters. Read about IIR structures (chapter 6 of Oppenheim and Schafer is a good place to start) and turn in a simulation of the implementation that you pick for the following system function:

\[ H(z) = \frac{0.0465829 + 0.1863316\,z^{-1} + 0.2794974\,z^{-2} + 0.1863316\,z^{-3} + 0.0465829\,z^{-4}}{1 - 0.782095\,z^{-1} + 0.6799785\,z^{-2} - 0.1826756\,z^{-3} + 0.0301188\,z^{-4}} \]

1. The inputs are two sine waves of frequencies 2*pi*5 rad/sec and 2*pi*450 rad/sec. Set both amplitudes to 1.

2. In the Gateway In block, set Number of Bits to 16 and Binary Point to 12.

3. Be sure to set the Simulink System Period to 0.001 sec in the System Generator token and the Sample Period of the Gateway In to 0.001 sec.


4. To implement the coefficients of the filter use the Constant block from Xilinx Blockset → Basic Elements and set the Number of Bits to 27 and Binary Point to 22.

5. Configure the Mult block to Number of Bits 27, Binary Point 22, Quantization to Round, Overflow to Saturate, and Latency to 0.
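Before building the fixed-point model, a floating-point reference can help judge whether the Simulink output is plausible. The Python sketch below uses a transposed direct form II structure, which is only one of the structures you could pick. It assumes the denominator signs alternate (-, +, -, +), which gives a DC gain of about 1, and a 1 kHz sample rate taken from the 0.001 sec period above. The names `B`, `A` and `iir_df2t` are illustrative.

```python
import math

# Numerator and denominator coefficients of H(z); the signs in A are an assumption.
B = [0.0465829, 0.1863316, 0.2794974, 0.1863316, 0.0465829]
A = [1.0, -0.782095, 0.6799785, -0.1826756, 0.0301188]


def iir_df2t(b, a, x):
    """Filter the sequence x with a transposed direct form II structure.

    Assumes a[0] == 1 and len(a) == len(b).
    """
    order = len(a) - 1
    state = [0.0] * order
    y = []
    for sample in x:
        out = b[0] * sample + state[0]
        # Update the delay line in ascending order so that state[i + 1]
        # still holds its value from the previous sample.
        for i in range(order - 1):
            state[i] = b[i + 1] * sample - a[i + 1] * out + state[i + 1]
        state[order - 1] = b[order] * sample - a[order] * out
        y.append(out)
    return y


fs = 1000.0  # sample rate in Hz (0.001 sec sample period)
t = [n / fs for n in range(2000)]
x = [math.sin(2 * math.pi * 5 * tt) + math.sin(2 * math.pi * 450 * tt) for tt in t]
y = iir_df2t(B, A, x)
print(max(abs(v) for v in y[1000:]))
```

With these coefficients the 450 Hz component should be strongly attenuated while the 5 Hz component passes almost unchanged, so the steady-state output should look like the 5 Hz sine alone.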

Your report must include: A brief description of your design. Explain why you chose it.

Printouts of your Simulink model and the filter output.

From the previous section: printouts of the outputs of the DA FIR implementation and the configuration file of the scaling_accumulator block.
