fir filter using distributed arithmetic v3

Upload: mohammed-moufti

Post on 06-Apr-2018


  • 8/3/2019 FIR Filter Using Distributed Arithmetic v3


    FIR Filter Using Distributed Arithmetic


    Introduction

    Distributed Arithmetic (DA) is a different approach to implementing digital filters. The basic idea is to replace all multiplications and additions with a table and a shifter-accumulator. DA relies on the fact that the filter coefficients are known, so multiplying c[n]x[n] becomes a multiplication by a constant. This is an important difference and a prerequisite for a DA design.

    System Generator (SysGen) has a built-in DA token, but we will not use it to implement our design, because the goal is to learn how to integrate VHDL code into System Generator by using black boxes and co-simulation tokens.

    Finally, we will download the DA design to the Virtex-II Pro board and use it to run hardware verification.

    Distributed Arithmetic

    Distributed Arithmetic (DA) can be used to compute sums of products. Many DSP algorithms, such as convolution and correlation, are formulated in a sum of products (SOP) fashion. Consider the following sum of products:

    $$y = \langle c, x \rangle = \sum_{n=0}^{N-1} c[n]\,x[n] = c[0]x[0] + c[1]x[1] + \cdots + c[N-1]x[N-1]$$

    Further assume that the coefficients c[n] are known values and that the variable x[n] can be represented by

    $$x[n] = \sum_{b=0}^{B-1} x_b[n]\,2^b \quad \text{with } x_b[n] \in \{0, 1\},$$

    where x_b[n] represents the b-th bit position of the number's binary representation. The SOP can then be written as:

    $$y = \langle c, x \rangle = \sum_{n=0}^{N-1} c[n] \sum_{b=0}^{B-1} x_b[n]\,2^b$$

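    Expanding the summations yields:

    $$\begin{aligned}
    y = \langle c, x \rangle ={} & c[0]\left(x_{B-1}[0]\,2^{B-1} + x_{B-2}[0]\,2^{B-2} + \cdots + x_0[0]\,2^0\right) \\
    & + c[1]\left(x_{B-1}[1]\,2^{B-1} + x_{B-2}[1]\,2^{B-2} + \cdots + x_0[1]\,2^0\right) \\
    & + \cdots \\
    & + c[N-1]\left(x_{B-1}[N-1]\,2^{B-1} + x_{B-2}[N-1]\,2^{B-2} + \cdots + x_0[N-1]\,2^0\right)
    \end{aligned}$$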


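    Redistributing the terms we have:

    $$\begin{aligned}
    y = \langle c, x \rangle ={} & \left(c[0]x_{B-1}[0] + c[1]x_{B-1}[1] + \cdots + c[N-1]x_{B-1}[N-1]\right) 2^{B-1} \\
    & + \left(c[0]x_{B-2}[0] + c[1]x_{B-2}[1] + \cdots + c[N-1]x_{B-2}[N-1]\right) 2^{B-2} \\
    & + \cdots \\
    & + \left(c[0]x_0[0] + c[1]x_0[1] + \cdots + c[N-1]x_0[N-1]\right) 2^0
    \end{aligned}$$

    In more compact form:

    $$y = \langle c, x \rangle = \sum_{b=0}^{B-1} 2^b \sum_{n=0}^{N-1} c[n]\,x_b[n]$$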

    The key is to realize that the second summation can be mapped to a Look Up Table (LUT). The coefficients c[n] are known and the x_b[n] values are either 1 or 0, so each SOP is just a combination of the c[n]'s for which a truth table can be constructed. Suppose we have:

    $$\left(c[0]x_{B-2}[0] + c[1]x_{B-2}[1] + \cdots + c[N-1]x_{B-2}[N-1]\right) 2^{B-2}$$

    Each x_{B-2} digit belongs to a different x[n] variable; nevertheless, we can form an N-bit word that can take 2^N values. For example, with N = 7 one of the possible outcomes is:

    $$\left(c[0]\cdot 0 + c[1]\cdot 1 + c[2]\cdot 1 + c[3]\cdot 0 + c[4]\cdot 1 + c[5]\cdot 0 + c[6]\cdot 0\right) 2^{B-2} = \left(c[1] + c[2] + c[4]\right) 2^{B-2}$$

    Multiplication by a power of 2 is no more than a bit shift, so what we need to do is slice and concatenate the bits of the different x[n] in order to address a table built from the c[n], which are all known.
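    The table lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integer coefficients and sample values are made up, the helper names build_lut and da_sop_unsigned are hypothetical, and the inputs are treated as unsigned B-bit numbers. The sketch checks that a 2^N-entry table combined with shifts and adds reproduces the direct sum of products.

```python
def build_lut(coeffs):
    """Precompute all 2**N partial sums; bit n of the address selects c[n]."""
    return [
        sum(c for n, c in enumerate(coeffs) if (addr >> n) & 1)
        for addr in range(1 << len(coeffs))
    ]


def da_sop_unsigned(coeffs, samples, n_bits):
    """Sum of products for unsigned samples using a table, shifts and adds."""
    lut = build_lut(coeffs)
    acc = 0
    for b in range(n_bits):
        # Slice bit b out of every sample and concatenate into an N-bit address.
        addr = 0
        for n, x in enumerate(samples):
            addr |= ((x >> b) & 1) << n
        # Scaling by 2**b is a left shift; no multiplier is involved.
        acc += lut[addr] << b
    return acc


# Hypothetical integer coefficients (N = 7) and unsigned 8-bit samples.
coeffs = [3, -1, 4, 1, -5, 9, 2]
samples = [10, 200, 37, 0, 255, 128, 1]

direct = sum(c * x for c, x in zip(coeffs, samples))
assert da_sop_unsigned(coeffs, samples, 8) == direct
```

    The table entry whose address has bits 1, 2 and 4 set holds c[1] + c[2] + c[4], so one lookup replaces seven multiplications for that bit position.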

    What is left is to show how to deal with signed implementations of DA. A minor modification needs to be introduced when working with signed two's complement numbers. In two's complement, the MSB determines the sign of the number. We therefore use the following B-bit representation:

    $$x[n] = -2^{B-1}\,x_{B-1}[n] + \sum_{b=0}^{B-2} x_b[n]\,2^b$$

    Then the output y is given by:

    $$y = -2^{B-1} \sum_{n=0}^{N-1} c[n]\,x_{B-1}[n] + \sum_{b=0}^{B-2} 2^b \sum_{n=0}^{N-1} c[n]\,x_b[n]$$
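    The signed case can be sketched the same way: the only change is that the table output for the MSB position is subtracted instead of added. The code below is a minimal illustration, not the lab's VHDL; the coefficients, the samples and the helper names (build_lut, bit_address, da_sop_signed) are assumptions made for the example.

```python
def build_lut(coeffs):
    """All 2**N partial sums; bit n of the address selects c[n]."""
    return [
        sum(c for n, c in enumerate(coeffs) if (addr >> n) & 1)
        for addr in range(1 << len(coeffs))
    ]


def bit_address(words, b):
    """Concatenate bit b of every word into one N-bit table address."""
    addr = 0
    for n, w in enumerate(words):
        addr |= ((w >> b) & 1) << n
    return addr


def da_sop_signed(coeffs, samples, n_bits):
    """Sum of products for two's complement samples of width n_bits."""
    lut = build_lut(coeffs)
    mask = (1 << n_bits) - 1
    words = [x & mask for x in samples]  # two's complement bit patterns
    acc = 0
    for b in range(n_bits - 1):
        acc += lut[bit_address(words, b)] << b
    # The MSB carries weight -2**(B-1), so its table output is subtracted.
    acc -= lut[bit_address(words, n_bits - 1)] << (n_bits - 1)
    return acc


# Hypothetical coefficients (N = 7) and signed 8-bit samples.
coeffs = [3, -1, 4, 1, -5, 9, 2]
samples = [-128, 127, -1, 0, 5, -77, 64]

direct = sum(c * x for c, x in zip(coeffs, samples))
assert da_sop_signed(coeffs, samples, 8) == direct
```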

    Finally, a block diagram for the DA implementation of a FIR filter is shown in figure 1.


    [Figure: bit shift register (bits X0[n] ... XB-1[n] for n = 0 ... N-1), arithmetic table (LUT), and scaling accumulator (+/- adder and register).]

    Fig.1 DA Block Diagram

    SysGen Implementation

    We will use VHDL to implement all major parts of the design. According to fig. 1, we

    need:

    Register

    SOP Table

    Pre-Adder (LUT Adder)

    Scaling Accumulator

    Download lab2.zip and uncompress it to C:\DSP_Spring07\Lab2. The following files should appear:

    dsp_fir.mdl, register_1to7.mdl, filter_lut_a.mdl, filter_lut_b.mdl, lut_adder.mdl

    regne_config.m, filter_lut_a_config.m, filter_lut_b_config.m, lut_adder_config.m

    regne.vhd, filter_lut_a.vhd, filter_lut_b.vhd, lut_adder.vhd, scaling_accumulator.vhd

    Black Boxes for HDL Co-Simulation

    System Generator libraries provide high- and low-level functions for building systems. However, there may be instances when you need to build blocks using HDL modules. These HDL modules need to be simulated along with the other SysGen blocks. The black box block provides an interface between the Simulink model and the HDL source code.

    An HDL component associated with a black box must adhere to the following System Generator requirements and conventions:


    The entity name must not collide with any other entity name in the design

    Bidirectional ports are not allowed on the top-level black box entity

    For Verilog black boxes, the module and port names must be lower case and must

    follow standard VHDL naming conventions

    Any port that is not a clock or clock enable must be of type std_logic_vector

    Any port that is a clock or clock enable must be of type std_logic

    Clocks and clock enables must appear as pairs

    Each clock name must contain the substring CLK, and each clock enable name must contain the substring CE

    A black box must describe its interface through a MATLAB M-function. The configuration M-function is generated automatically by System Generator, and some editing needs to be done in order to specify the characteristics of the black box entity.

    The M-function contains:

    The top-level entity name of the HDL component that should be associated with the black box

    The language, i.e. VHDL or Verilog

    Port descriptions, including type, direction, bit width, binary point position, name, and sample rate

    Any generics required by the black box HDL

    The black box HDL and other files that are associated with the block

    The clocks and clock enables for the block

    Whether the HDL has any combinational feed-through paths

    Let's proceed to create a black box for our Scaling Accumulator VHDL code.

    1. Set your working directory to C:\DSP_Spring07\Lab2, open Simulink, create a new model and name it scaling_accumulator.mdl.

    2. Add the ModelSim block from Xilinx Blockset -> Tools.

    3. Add the Black Box block from Xilinx Blockset -> Basic Elements.

    4. The Configuration Wizard detects HDL files and opens a new window. Select the scaling_accumulator.vhd file, which contains the entity description (figure 2).


    Fig.2 Select the VHDL file that contains the black box description

    5. Click the OK box of the Wizard Notice. The configuration M-file will open.

    6. Configure the input ports by editing commented parts in the configuration M-file.

    Replace the comments in:

    if(this_block.inputTypesKnown)
      % do input type checking, dynamic output type and generic setup in this code block.

      % (!) Port 'LUT0' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'ALUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'BLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'CLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'DLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'LUT5' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'LUT6' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'LUT7' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'Filter_out' appeared to have dynamic type in the HDL
      % --- you must add an appropriate type setting for this port
    end % if(inputTypesKnown)


    With the following piece of code:

    if(this_block.inputTypesKnown)
      % do input type checking, dynamic output type and generic setup in this code block.

      this_block.port('LUT0').useHDLVector(true);
      if(this_block.port('LUT0').width ~= 23);
        this_block.setError('Input data type for "LUT0" must have width of 23.');
      end

      this_block.port('ALUT').useHDLVector(true);
      if(this_block.port('ALUT').width ~= 23);
        this_block.setError('Input data type for "ALUT" must have width of 23.');
      end

      this_block.port('BLUT').useHDLVector(true);
      if(this_block.port('BLUT').width ~= 23);
        this_block.setError('Input data type for "BLUT" must have width of 23.');
      end

      this_block.port('CLUT').useHDLVector(true);
      if(this_block.port('CLUT').width ~= 23);
        this_block.setError('Input data type for "CLUT" must have width of 23.');
      end

      this_block.port('DLUT').useHDLVector(true);
      if(this_block.port('DLUT').width ~= 23);
        this_block.setError('Input data type for "DLUT" must have width of 23.');
      end

      this_block.port('LUT5').useHDLVector(true);
      if(this_block.port('LUT5').width ~= 23);
        this_block.setError('Input data type for "LUT5" must have width of 23.');
      end

      this_block.port('LUT6').useHDLVector(true);
      if(this_block.port('LUT6').width ~= 23);
        this_block.setError('Input data type for "LUT6" must have width of 23.');
      end

      this_block.port('LUT7').useHDLVector(true);
      if(this_block.port('LUT7').width ~= 23);
        this_block.setError('Input data type for "LUT7" must have width of 23.');
      end

    7. To configure the output port, we need to specify the output bit width, binary point

    position, signed or unsigned data, and generic values. Add the following code after the

    previous block:

    % (!) Port 'Filter_out' appeared to have dynamic type in the HDL
    Filter_out_port = this_block.port('Filter_out');
    input_bitwidth = this_block.port('LUT0').width;

    % Set up the fixed parameters of the filter
    % Calculate the width of the output based on worst case values for data
    % and coefficients
    output_bitwidth = input_bitwidth+7;

    % Set the output data type
    Filter_out_port.makeSigned;
    Filter_out_port.width = output_bitwidth;
    Filter_out_port.binpt = 25;

    % (!) Customize the following generic settings as appropriate. If any settings depend
    % on input types, make the settings in the "inputTypesKnown" code block.
    this_block.addGeneric('Nb_in', this_block.port('LUT0').width);
    this_block.addGeneric('Nb_out', this_block.port('Filter_out').width);

    % --- you must add an appropriate type setting for this port

    8. Finally, delete the following lines:

    % (!) Customize the following generic settings as appropriate. If any settings depend
    % on input types, make the settings in the "inputTypesKnown" code block.
    this_block.addGeneric('Nb_out','integer','30');
    this_block.addGeneric('Nb_in','integer','23');

    9. Save and close the configuration M-file. In the Simulink model (mdl file) click the name Black Box and change it to Scaling Accumulator.

    10. Open the dsp_fir.mdl file and paste a copy of the newly created Scaling Accumulator block into it. Open the register_1to7.mdl file and copy the Register block to dsp_fir.mdl as well.

    11. Save and close the dsp_fir.mdl, register_1to7.mdl, and scaling_accumulator.mdl files. Close and relaunch MATLAB and Simulink to verify that the Scaling Accumulator block is under the DSP Spring 07 Library.

    Fig.3.a

    12. On the MATLAB menu, click File -> Set Path and add the folder C:\DSP_Spring07\Lab2. Click Save and then Close.

    Note: You may receive a warning message if you do not have write permission to update the MATLAB installation directory. Click Yes to save the file in your working directory (in this case, lab2). If you close MATLAB you will need to set the path again.

    Now, when you load Simulink you should be able to see five new blocks under the DSP Spring 07 Library: Register, LUT A, LUT B, LUT Adder, and Scaling Accumulator.


    Fig.3.a Scaling Accumulator block and (3.b) dsp_fir.mdl Simulink model.

    DA FIR Design Implementation and Code Generation

    The FIR filter to be implemented in DA is the sixth-order FIR filter of the previous laboratory. Figure 4 presents a more detailed block diagram of the DA implementation. Notice:

    a) The input of each lookup table is a binary word formed from the b-th bit position of all N input numbers. Therefore, we need an entity that performs the slicing and concatenation of the bit positions to form this new input word.

    b) The reason for two LUTs in the design is that LUTs are 4-input blocks and our filter has 7 coefficients, so one LUT will have a 4-bit wide input and the other a 3-bit wide input.
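    The split into two tables plus a pre-adder can be checked with a short Python sketch. It assumes, for illustration only, that the 4-bit table covers taps 0 to 3 and the 3-bit table covers taps 4 to 6; the coefficient values and the names build_lut and split_lookup are hypothetical.

```python
def build_lut(coeffs):
    """All partial sums of the given coefficients; bit n selects coeffs[n]."""
    return [
        sum(c for n, c in enumerate(coeffs) if (addr >> n) & 1)
        for addr in range(1 << len(coeffs))
    ]


# Hypothetical 7-tap coefficient set (integers for exact comparison).
coeffs = [3, -1, 4, 1, -5, 9, 2]

lut_full = build_lut(coeffs)   # single table: 2**7 = 128 entries
lut_a = build_lut(coeffs[:4])  # 4-bit address: 16 entries
lut_b = build_lut(coeffs[4:])  # 3-bit address: 8 entries


def split_lookup(addr7):
    """Look up the low 4 bits and high 3 bits separately, then pre-add."""
    low = addr7 & 0xF
    high = (addr7 >> 4) & 0x7
    return lut_a[low] + lut_b[high]


# The two small tables plus one adder reproduce the full table exactly.
assert all(split_lookup(a) == lut_full[a] for a in range(128))
print(len(lut_full), len(lut_a) + len(lut_b))
```

    Splitting the address this way trades one extra adder for a table storage of 16 + 8 entries instead of 128.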

    The BitBasher block performs slicing, concatenation and augmentation of the inputs attached to the block. The block may have up to four output ports, and the number of outputs is equal to the number of expressions specified in the BitBasher Expression dialog. The advantage of this block over others is that it does not cost anything to implement in hardware.
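    As a rough software analogue of slicing and concatenation, the sketch below forms the two table addresses for one bit position from seven register words. It is not BitBasher itself: the function names are hypothetical, the sample words are made up, and placing tap 0 in the least significant address bit is an assumption of this example.

```python
def slice_bit(word, pos):
    """Extract bit `pos` of an integer word."""
    return (word >> pos) & 1


def concat(bits_msb_first):
    """Concatenate single bits into one integer, first element as MSB."""
    value = 0
    for bit in bits_msb_first:
        value = (value << 1) | bit
    return value


def da_addresses(regs, pos):
    """Return (4-bit address, 3-bit address) for bit position `pos`.

    regs holds the seven register outputs, tap 0 first.
    """
    low = concat([slice_bit(r, pos) for r in reversed(regs[:4])])
    high = concat([slice_bit(r, pos) for r in reversed(regs[4:])])
    return low, high


# Hypothetical 8-bit register contents.
regs = [0x01, 0x02, 0x03, 0x80, 0xFF, 0x00, 0x81]
for pos in range(8):
    print(pos, da_addresses(regs, pos))
```

    In hardware this is only wiring, which is why the operation consumes no logic resources.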


    [Figure: DA 7-Tap FIR Filter. Inputs X0 to X6, two Partial Product ROMs, a Pre-Adder (+), and a +/- accumulator with a Z^-1 register.]

    Fig.4 Block Diagram of a 7-Tap DA FIR filter

    Filter specifications:

    Low Pass Filter.

    Signed 8 bit input number. Represented in Fix 8_4.

    Sampling frequency of 1kHz.

    Coefficients quantized to 23 bits. Represented in Fix 23_21.

    Full precision Adders and Mult. blocks.

    Filter output of 30 bits. Represented in Fix 30_25.
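    The fixed-point formats in the specification can be explored with a small Python sketch. Fix W_P is read here as a signed number with W total bits and P fractional bits. The helper names are hypothetical, and the round-and-saturate behaviour of quantize is a choice made for this example, not a statement about the Gateway In defaults. The output width of 23 + 7 is copied from the configuration file rather than derived here.

```python
import math


def fix_range(width, binpt):
    """Return (min, max, step) of the signed format Fix width_binpt."""
    step = 2.0 ** -binpt
    lo = -(1 << (width - 1)) * step
    hi = ((1 << (width - 1)) - 1) * step
    return lo, hi, step


def quantize(value, width, binpt):
    """Round to the nearest step and saturate (assumed for this sketch)."""
    raw = math.floor(value * 2 ** binpt + 0.5)
    raw_min = -(1 << (width - 1))
    raw_max = (1 << (width - 1)) - 1
    return max(raw_min, min(raw_max, raw)) / 2 ** binpt


for name, width, binpt in [("input", 8, 4),
                           ("coefficients", 23, 21),
                           ("output", 30, 25)]:
    print(name, fix_range(width, binpt))

# A product of a Fix 8_4 sample and a Fix 23_21 coefficient has
# 4 + 21 fractional bits, which matches the binary point of Fix 30_25.
out_binpt = 4 + 21
# Width as set in the configuration M-file: input_bitwidth + 7.
out_width = 23 + 7
print(out_width, out_binpt)
```

    With Fix 8_4 the input covers -8 to 7.9375 in steps of 0.0625, which comfortably holds the sum of two unit-amplitude sine waves.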

    Filter implementation:

    1. Set your working directory to C:\DSP_Spring07\Lab2\ and open a new model.

    2. Add the Register block from DSP Spring 07 Library.

    3. Add two Sine Wave inputs, a Sum block and the Xilinx Gateway In block, and connect them to the Register block. The input to the register is the sum of the two waves. Double-click the Gateway In block and set the output type to Signed, Number of Bits to 8, Binary Point to 4, and Sample Period to 0.001 sec.


    4. Set the frequency of one of the sine waves to 2*pi*5 rad/sec and the other to 2*pi*300 rad/sec.

    5. From Simulink -> Commonly Used Blocks add a Constant block. Set the constant value to 1. Add a Xilinx Gateway In and set the output type to Unsigned, Number of Bits to 1, Binary Point to 0, and Sample Period to 0.001. So far the model should look like figure 4.

    6. From Xilinx Blockset -> Basic Elements add the BitBasher block. Double-click the BitBasher block and enter the following expressions:

    outLSB1={d[0],c[0],b[0],a[0]};

    outMSB1={g[0],f[0],e[0]};

    outLSB2={d[1],c[1],b[1],a[1]};

    outMSB2={g[1],f[1],e[1]};

    This block slices and concatenates the first two LSB positions of the 8-bit register

    outputs. Add three more BitBasher blocks to the design and configure them accordingly.

    Hint: Copy and paste the previous block expression and change the bit positions

    7. Connect the BitBasher block to the Register: click the Register, then hold down the Ctrl key while left-clicking the BitBasher block. When connecting the remaining three blocks be sure to connect Reg1 to a, Reg2 to b, Reg3 to c, Reg4 to d, Reg5 to e, Reg6 to f, and Reg7 to g.

    Fig.4 Input blocks to the Register block in the DA implementation.


    8. Add the filter_lut_a and filter_lut_b blocks from the DSP Spring 07 Library and connect them to the BitBasher blocks. Also add the lut_adder block to the model and connect it to the outputs of the filter_lut_a and filter_lut_b blocks. Your design should look like figure 5.

    9. When designs grow large it is good practice to add registers in order to reduce the propagation delay between the components of the design. From Xilinx Blockset -> Basic Elements add Delay blocks (registers) and connect them to the outputs of all the lut_adder blocks.

    10. Add the Scaling Accumulator block from the DSP Spring 07 Library and connect it to the outputs of the delay blocks. Add a Gateway Out block and a Scope block. Set the Number of Axes to 2 in the Scope properties and connect one axis to the Scaling Accumulator output and the other to the output of the summation block (figure 6).

    11. Add the System Generator block and set the Simulink system period to 0.001. Verify the following settings:

    Compilation: HDL Netlist

    Part: Virtex2p xc2vp30-7ff896

    Target directory: ./netlist

    Synthesis Tool: XST

    12. Set the simulation time to 2 seconds. Double-click each of the filter_lut_a, filter_lut_b, and lut_adder blocks as well as the Scaling Accumulator block and verify that ISE Simulator is selected in the Simulation mode field.

    13. Run the Simulation

    14. Double Click on the System Generator Block and click on the Generate Box.
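    For step 6, the expressions for the three additional BitBasher blocks follow a regular pattern, so they can be generated rather than retyped. The Python sketch below reproduces the four expressions given for bit positions 0 and 1 and prints the ones for positions 2 to 7. The function name is hypothetical, and reusing the same output names (outLSB1, outMSB1, outLSB2, outMSB2) in every block is an assumption based on the hint to copy the expressions and change only the bit positions.

```python
def bitbasher_expressions(first_bit):
    """Expressions for bit positions first_bit and first_bit + 1."""
    lines = []
    for k, bit in enumerate((first_bit, first_bit + 1), start=1):
        low = ",".join(f"{port}[{bit}]" for port in "dcba")
        high = ",".join(f"{port}[{bit}]" for port in "gfe")
        lines.append(f"outLSB{k}={{{low}}};")
        lines.append(f"outMSB{k}={{{high}}};")
    return lines


# Four blocks cover the 8 bit positions of the register outputs.
for block, first_bit in enumerate(range(0, 8, 2), start=1):
    print(f"BitBasher block {block}:")
    for line in bitbasher_expressions(first_bit):
        print("  " + line)
```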


    Fig. 5 BitBasher, LUTs and LUTs adder in the DA FIR implementation

    Fig. 6 Final connections for the DA Filter


    Verifying through Hardware Co-Simulation

    In this final and simple step we will create a hardware co-simulation block and perform both hardware and software HDL co-simulation.

    1. Create a subsystem of the DA Filter. Select all components except the Gateway In and Gateway Out blocks, the input sources, the System Generator token and the Scope, then right-click and select Create Subsystem. Rearrange the input and output connections so that the design looks like figure 7.

    Note: To edit the names of inputs and output of the subsystem double click over the block

    and edit the names accordingly.

    Fig. 7 Compact form of the DA filter

    2. Save the model as da_fir_filter_hwcosim.mdl.

    3. Double-click the System Generator block, click Compilation and select Hardware Co-Simulation -> xup -> xup_virtex_ii_pro.

    4. Enter ./netlis_hw as the Target Directory and click Apply to accept the changes.

    5. Click Generate and wait for the compilation process to finish.

    6. Set the Number of Axes to 3 in the Scope properties. Copy the da_fir_filter_hwcosim block to the design and connect it to the input and output so it looks like figure 8.


    Fig. 8 Hardware co-simulation

    7. Right-click the DA Filter subsystem, select Block Properties and type 10 in the Priority

    field.

    8. Right-click the da_fir_filter_hwcosim block, select Block Properties and type 0 in the Priority field.

    9. Save the model.

    10. Connect the power cable and the USB cable of the hardware board. Turn on the power. Wait for Windows to finish the New Hardware installation.

    11. Double-click the hardware model and set the cable to Platform USB.

    12. Click the Run button in the Simulink window to run the simulation.

    IIR Implementation

    Implementing IIR filters is not that different from implementing FIR filters. In fact, there are more structures dedicated to IIR filters than to FIR filters. Read about IIR structures (chapter 6 of Oppenheim and Schafer is a good place to start) and turn in a simulation of the implementation that you pick for the following system function:

    $$H(z) = \frac{0.0465829 + 0.1863316\,z^{-1} + 0.2794974\,z^{-2} + 0.1863316\,z^{-3} + 0.0465829\,z^{-4}}{1 - 0.782095\,z^{-1} + 0.6799785\,z^{-2} - 0.1826756\,z^{-3} + 0.0301188\,z^{-4}}$$

    1. The inputs are two sine waves of frequencies 2*pi*5 rad/sec and 2*pi*450 rad/sec. Set both amplitudes to 1.

    2. In the Gateway In block, set Number of bits to 16 and Binary Point to 12.

    3. Be sure to set the Simulink System Period to 0.001 sec in the System Generator token and the Sample Period of the Gateway In to 0.001 sec.


    4. To implement the coefficients of the filter use the Constant block from Xilinx Blockset -> Basic Elements and set the Number of bits to 27 and Binary Point to 22.

    5. Configure the Mult block to Number of bits 27, Binary Point 22, Quantization to Round, Overflow to Saturate, and Latency to 0.
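    Before building a fixed-point model, it can help to have a floating-point reference to compare against. The Python sketch below evaluates the difference equation of the system function in direct form I, which is only one of the possible structures and is not meant as the expected choice. It assumes that the denominator signs alternate (1, -, +, -, +), which is consistent with a DC gain of approximately one; the function name is hypothetical.

```python
import math

# Numerator and denominator coefficients of H(z); the alternating
# denominator signs are an assumption of this sketch.
B_COEFFS = [0.0465829, 0.1863316, 0.2794974, 0.1863316, 0.0465829]
A_COEFFS = [1.0, -0.782095, 0.6799785, -0.1826756, 0.0301188]


def iir_direct_form_1(b, a, x):
    """y[n] = (sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k]) / a[0]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y


# Two unit-amplitude sine waves at 5 Hz and 450 Hz, sampled at 1 kHz.
fs = 1000.0
t = [n / fs for n in range(2000)]
x = [math.sin(2 * math.pi * 5 * tn) + math.sin(2 * math.pi * 450 * tn)
     for tn in t]
y = iir_direct_form_1(B_COEFFS, A_COEFFS, x)

# After the transient, the output should be dominated by the 5 Hz tone.
print(max(abs(v) for v in y[1000:]))
```

    Comparing this reference against the Scope output of the fixed-point model gives a quick check on the chosen word lengths.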

    Your report must include:

    A short description of your design. Explain why you chose it.

    Printouts of your Simulink model and the filter output.

    From the previous section: printouts of the outputs of the DA FIR implementation and the configuration file of the scaling_accumulator block.