132876208-final-neo-16

Upload: mohammed-el-adawy

Post on 19-Oct-2015


DESCRIPTION

This is a powerful project for a MIPS processor.

TRANSCRIPT

  • Contents

    1. Introduction

    2. System Description

       1. Building a Datapath
          a) Major Components
          b) Components for Arithmetic and Logic Functions
          c) Load word (lw) and store word (sw) instructions
          d) Branch on equal instruction
          e) Jump instruction

       2. Simple Implementation Scheme
          a) Creating a Single Datapath
          b) ALU Control
          c) Main Control

       3. Multicycle Implementation
          a) Additions and Changes in the Scheme
          b) Execution of Instructions in Clock Cycles

       4. Module Specification (Functional Description)
          a) ALU
          b) Memory
          c) Control
          d) Datapath
             i) Instruction Fetch
             ii) Instruction Decode
             iii) Execution
             iv) Memory Writeback
          e) Processor and Memory

    3. Uniqueness

    4. Challenges Faced

    5. Conclusion

    6. VHDL Code (RTL Schematic included in folder)

    7. References

  • 1. Introduction

    Project Description

    Design a fully working reduced instruction set (RISC) for a processor, then implement and simulate in VHDL a processor that runs the developed instruction set, targeting a Field Programmable Gate Array (FPGA) if possible.

    Motivation

    As engineering students, this was our first academic project.

    The project inspired us to pursue more research at the undergraduate (UG) level.

    It was a practical implementation of theoretical concepts.

    It was an opportunity to bring a fun-to-do element into the course.

    Performing a real hardware project was of utmost interest to us.

    Our Approach

    First, we revised VHDL concepts.

    Then we reviewed the features of RISC processors and their advantages over CISC.

    We continued our study of RISC processors and implemented some of the components of the MIPS processor.

    The next task was to design an instruction set for our own processor.

    Innovative thinking by some of the group members paved the way for a 16-bit instruction set.

    Last but not least, the only job left, perhaps the most difficult one, was to implement the processor in VHDL, and we moved on to finish it.

  • 2. System Description

    1. Building a Datapath

    a) Major Components

    First we look at the elements required to execute the NEO instructions and how they are connected. The first element needed is a place to store the program instructions. This Instruction Memory holds instructions and supplies them given an address. The address is kept in the Program Counter (PC), and to increment the PC to the address of the next instruction we also need an Adder. All these elements are shown in the figure.

    After an instruction is fetched from the instruction memory, the program counter has to be incremented so that it points to the address of the next instruction, 2 bytes later. This is realised by the datapath shown in the figure.

  • b) Components for Arithmetic and Logic Functions

    The instructions we use all read two registers, perform an ALU operation and write back the result. These arithmetic-logical instructions are also called R-type instructions. This instruction class includes add, sub, slt, and, and or. The 8 registers of the processor are stored in a Register File. To read two data words, two inputs and two outputs are needed. The inputs are 3 bits wide and specify the numbers of the registers to be read; the outputs are 16 bits wide and carry the values of those registers.

    To write the result back, two inputs are needed: one to specify the register number and one to supply the data to be written. The Register File is shown in the figure.

    To process the data from the Register File, an ALU with two data inputs is used. The figure shows the combination of Register File and ALU to operate on R-type instructions.
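A minimal sketch of the register file described above, assuming a synchronous write and asynchronous reads, neither of which the text specifies. The entity name `regfile_sketch` is hypothetical; the port names follow the `regfile` component declared in the data decode listing of section 6, with the widths fixed here to 3-bit register numbers and 16-bit data.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

-- Hypothetical illustration, not the project's regfile implementation.
ENTITY regfile_sketch IS
  PORT (
    clk       : IN  STD_ULOGIC;
    wen       : IN  STD_ULOGIC;                      -- write enable
    adrport0  : IN  STD_ULOGIC_VECTOR(2 DOWNTO 0);   -- first register to read
    adrport1  : IN  STD_ULOGIC_VECTOR(2 DOWNTO 0);   -- second register to read
    adrwport  : IN  STD_ULOGIC_VECTOR(2 DOWNTO 0);   -- register to write
    writeport : IN  STD_ULOGIC_VECTOR(15 DOWNTO 0);  -- data to write
    readport0 : OUT STD_ULOGIC_VECTOR(15 DOWNTO 0);
    readport1 : OUT STD_ULOGIC_VECTOR(15 DOWNTO 0));
END regfile_sketch;

ARCHITECTURE behave OF regfile_sketch IS
  -- 8 registers of 16 bits each
  TYPE reg_array IS ARRAY (0 TO 7) OF STD_ULOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL regs : reg_array := (OTHERS => (OTHERS => '0'));
BEGIN
  -- assumption: writes happen on the rising clock edge when wen is high
  write_proc : PROCESS(clk)
  BEGIN
    IF RISING_EDGE(clk) THEN
      IF wen = '1' THEN
        regs(TO_INTEGER(UNSIGNED(adrwport))) <= writeport;
      END IF;
    END IF;
  END PROCESS;

  -- assumption: both read ports are combinational
  readport0 <= regs(TO_INTEGER(UNSIGNED(adrport0)));
  readport1 <= regs(TO_INTEGER(UNSIGNED(adrport1)));
END behave;
```

Two independent read addresses and one write address give exactly the "two inputs and two outputs" for reading plus the "two inputs" for writing that the text lists.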

    c) Load word (lw) and store word (sw) instructions

    Two more elements are needed to implement the sw and lw instructions: the Data Memory and the Sign Extension Unit.

    The sw and lw instructions compute a memory address by adding a register value to the 5-bit signed offset field contained in the instruction. Because the ALU operates on 16-bit values, the instruction offset field must be sign extended from 5 to 16 bits by replicating the sign bit 11 times in front of the original value. The instruction field for a lw or sw instruction is shown below:

    op      rs      rt      constant
    5 bit   3 bit   3 bit   5 bit
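The address computation for lw and sw can be sketched as below, assuming the 5-bit constant field of the instruction format shown above and a 16-bit datapath. The entity and signal names are hypothetical and chosen only for illustration.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

-- Hypothetical helper entity, not part of the project sources.
ENTITY mem_address_sketch IS
  PORT (
    base_reg : IN  STD_ULOGIC_VECTOR(15 DOWNTO 0);   -- register value
    offset   : IN  STD_ULOGIC_VECTOR(4 DOWNTO 0);    -- constant field of lw/sw
    address  : OUT STD_ULOGIC_VECTOR(15 DOWNTO 0));  -- effective memory address
END mem_address_sketch;

ARCHITECTURE behave OF mem_address_sketch IS
  SIGNAL offset_se : SIGNED(15 DOWNTO 0);
BEGIN
  -- RESIZE on a SIGNED value copies the sign bit into the new upper bits
  offset_se <= RESIZE(SIGNED(offset), 16);
  -- two's-complement addition; wrap-around on overflow is ignored
  address <= STD_ULOGIC_VECTOR(SIGNED(base_reg) + offset_se);
END behave;
```

For example, the offset "11110" (decimal -2) is extended to "1111111111111110", so a base register value of 16 yields the address 14.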

    d) Branch on equal instruction

    The beq instruction has three operands: two registers that are compared for equality, and an 11-bit offset used to compute the branch target address relative to the branch instruction address.

    The figure shows the datapath for a branch on equal instruction. The datapath must do two operations: compare the register contents and compute the branch target. Therefore two things must be done: the address field of the branch instruction must be sign extended from 8 bits to 16 bits and shifted left 2 bits so that it is a word offset, and the branch target address is computed by adding the address of the next instruction (PC + 2) to this offset.

    e) Jump Instruction

    The jump instruction is similar to the branch instruction, but it computes the target PC differently and is not conditional. The destination address for a jump is formed by concatenating the upper 3 bits of the current PC + 2 with the 11-bit address field of the jump instruction and appending 00 as the last two bits.
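The jump target formation just described (upper 3 bits of PC + 2, the 11-bit address field, then 00) can be sketched as a single concatenation. The entity and port names below are hypothetical; only the bit widths come from the text.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Hypothetical illustration of the jump address formation.
ENTITY jump_target_sketch IS
  PORT (
    pc_plus2  : IN  STD_ULOGIC_VECTOR(15 DOWNTO 0);   -- already incremented PC
    addr11    : IN  STD_ULOGIC_VECTOR(10 DOWNTO 0);   -- address field of the jump
    jump_addr : OUT STD_ULOGIC_VECTOR(15 DOWNTO 0));
END jump_target_sketch;

ARCHITECTURE behave OF jump_target_sketch IS
BEGIN
  -- 3 bits + 11 bits + 2 bits = 16 bits
  jump_addr <= pc_plus2(15 DOWNTO 13) & addr11 & "00";
END behave;
```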

    2. Simple Implementation Scheme

    The simplest possible implementation of the MIPS processor contains the datapath segments explained above, together with the required control lines.

    a) Creating a Single Datapath

    The simplest datapath might attempt to execute all instructions in one clock cycle. This means that any element can be used only once per instruction, so elements that are needed more than once have to be duplicated. Where possible, datapath elements can be shared by different instruction flows. This requires multiple connections to the input of an element, which is commonly done with a multiplexer.

    The figure shows the combined datapath, including one memory for instructions and one for data, the ALU, the PC unit and the multiplexers mentioned above.

    b) ALU Control

    The NEO field that contains the information about the instruction has the following structure:

    op      rs      rt      rd      shamt
    5 bit   3 bit   3 bit   3 bit   2 bit

    The meanings of the fields are:

    op: basic operation
    rs: first register source
    rt: second register source
    rd: register destination
    shamt: shift amount

    Opcodes for R-Type instructions:

    Mnemonic   Opcode   Description
    ADD        00000    RD = RS + RT
    MOVE       00001    RD = RS
    SUB        00010    RD = RS - RT
    SLL        00011    RD = RS << SHIFT
    AND        00101    RD = RS & RT
    OR         00110    RD = RS | RT
    NOT        00111    RD = ~RS
    XOR        01000    RD = RS XOR RT
    SLT        01001    RD = (RS < RT) ? 1 : 0
    JR         01010    PC = RS

    Opcodes for I-Type instructions:

    Mnemonic   Opcode   Description
    ADDI       01011    RD = RS + CONST
    SUBI       01100    RD = RS - CONST
    SLTI       01101    RD = (RS < CONST) ? 1 : 0
    LW         01110    RD = MEM(RS + OFF)
    SW         01111    MEM(RT + OFF) = RS

    Opcodes for J-Type instructions:

    Mnemonic   Opcode   Description
    BEQ        10000    IF RS = RT, PC += OFF
    BNE        10001    IF RS != RT, PC += OFF
    J          10010    Jump to address
    JAL        10011    Jump and link
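One way to keep these opcode values in a single place is a small VHDL package. The sketch below copies the values from the tables above; the package name, the subtype name and the `OP_` constant names are hypothetical and do not appear in the project sources.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Hypothetical package; the opcode values are taken from the tables above.
PACKAGE neo_opcodes_sketch IS
  SUBTYPE opcode_t IS STD_ULOGIC_VECTOR(4 DOWNTO 0);

  -- R-type instructions
  CONSTANT OP_ADD  : opcode_t := "00000";
  CONSTANT OP_MOVE : opcode_t := "00001";
  CONSTANT OP_SUB  : opcode_t := "00010";
  CONSTANT OP_SLL  : opcode_t := "00011";
  CONSTANT OP_AND  : opcode_t := "00101";
  CONSTANT OP_OR   : opcode_t := "00110";
  CONSTANT OP_NOT  : opcode_t := "00111";
  CONSTANT OP_XOR  : opcode_t := "01000";
  CONSTANT OP_SLT  : opcode_t := "01001";
  CONSTANT OP_JR   : opcode_t := "01010";

  -- I-type instructions
  CONSTANT OP_ADDI : opcode_t := "01011";
  CONSTANT OP_SUBI : opcode_t := "01100";
  CONSTANT OP_SLTI : opcode_t := "01101";
  CONSTANT OP_LW   : opcode_t := "01110";
  CONSTANT OP_SW   : opcode_t := "01111";

  -- J-type instructions
  CONSTANT OP_BEQ  : opcode_t := "10000";
  CONSTANT OP_BNE  : opcode_t := "10001";
  CONSTANT OP_J    : opcode_t := "10010";
  CONSTANT OP_JAL  : opcode_t := "10011";
END PACKAGE neo_opcodes_sketch;
```

A control unit could then compare the op field against these names instead of repeating bit strings. The control listing in section 6 defines its own constants with different values, so this sketch follows the tables only.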

    c) Main Control

    The main control unit generates the control bits for the multiplexers, the data memory and the ALU control unit. The input of the main control unit is the 5-bit op field of the NEO instruction.

  • 3. Multicycle Implementation

    To avoid the disadvantages of the single-cycle implementation described in the previous section, a multicycle implementation is used. This technique divides each instruction into steps, and each step is executed in one clock cycle. The multicycle implementation allows a functional unit to be used more than once in an instruction, so that the number of functional units can be reduced. The major advantage of a multicycle design is this ability to share functional units within the execution of a single instruction.

  • a) Additions and Changes in the Scheme

    Compared to the single-cycle datapath, the differences are that only one memory unit is used for instructions and data, there is only one ALU instead of an ALU and two adders, and several output registers are added to hold the output value of a unit until it is used in a later clock cycle. The instruction register (IR) and the memory data register (MDR) are added to save the output of the memory. The registers A and B hold the register operands read from the register file, and ALUOut holds the output of the ALU. With the exception of the IR, all these registers hold data only between a pair of adjacent clock cycles. Because the IR holds its value during the whole execution of an instruction, it requires a write control signal.

    The reduction from the former three ALUs to one also causes the following changes in the datapath:

    An additional multiplexer is added for the first ALU input to choose between the A register and the PC.

    The multiplexer at the second ALU input is changed from a two-way to a four-way multiplexer. The two new inputs are a constant 2 to increment the PC and the sign-extended and shifted offset field for the branch instruction.

    To handle branches and jumps, more additions to the datapath are required. The three cases of R-type instructions, branch instructions and jump instructions cause three different values to be written into the PC:

    The output of the ALU, which is PC + 2, is stored directly to the PC.

    The register ALUOut, after the branch target address has been computed.

    The lower 11 bits of the IR shifted left by two and concatenated with the upper 3 bits of the incremented PC, when the instruction is a jump.

    If the instruction is a branch, the write signal for the PC is conditional: only if the two compared registers are equal is the computed branch address written to the PC.

    Therefore the PC needs two write signals: PCWrite if the write is unconditional (the value is PC + 2 or the instruction is a jump) and PCWriteCond if the write is conditional. The write signal for the PC is formed from the ALU zero bit and the two write signals PCWrite and PCWriteCond by an AND gate and an OR gate.

    b) Execution of Instructions in Clock Cycles

    The execution of an instruction is broken into clock cycles, which means that each instruction is divided into a series of steps.

    The execution of an instruction is divided into at most five steps. Different elements of the datapath can work in parallel during one clock cycle, whereas others can only be used in series. So it must be ensured that after each step the computed values are stored either in the memory or in one of the registers. The operation steps are:

    1. Instruction fetch step

    Fetch the instruction from the memory and compute the address of the next sequential instruction:

    IR = Memory[PC]
    PC = PC + 2

    Control signal setting: MemRead = 1, IRWrite = 1, IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSource = 00, PCWrite = 1

    2. Instruction decode and register fetch step

    It is still unknown what the instruction is, so only actions that are applicable to all instructions or are not harmful can be performed. The registers indicated by the rs and rt fields of the instruction are read and stored into the A and B registers, and the potential branch target is computed and stored into the ALUOut register.

    A = Reg[IR[13-11]]
    B = Reg[IR[10-8]]
    ALUOut = PC + (sign-extend(IR[7-0]) << 2)

  • b. Arithmetic-logical instruction: ALUOut = A op B

    Control signal setting: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10

    c. Branch: if (A == B) PC = ALUOut

    Control signal setting: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond = 1, PCSource = 01

    d. Jump: PC = PC[15-13] & (IR[10-0] << 2)

  • Control signal setting: RegDst = 1, RegWrite = 1, MemtoReg = 0

    5. Memory read completion step

    The load instruction is completed by writing back the value from the memory:

    Reg[IR[9-7]] = MDR

    Control signal setting: MemtoReg = 1, RegWrite = 1, RegDst = 0
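The steps above set PCWrite and PCWriteCond, and section 3 a) states that the PC write signal is formed from these two signals and the ALU zero bit by an AND gate and an OR gate. A minimal sketch of that combination follows. The entity name is hypothetical, and the output name `PC_en` is borrowed from the port lists in section 6 on the assumption that it is this write signal.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Hypothetical illustration of the PC write-enable logic.
ENTITY pc_enable_sketch IS
  PORT (
    PCWrite     : IN  STD_ULOGIC;   -- unconditional write (PC + 2 or jump)
    PCWriteCond : IN  STD_ULOGIC;   -- conditional write (branch)
    zero        : IN  STD_ULOGIC;   -- ALU zero bit, high when A equals B
    PC_en       : OUT STD_ULOGIC);
END pc_enable_sketch;

ARCHITECTURE behave OF pc_enable_sketch IS
BEGIN
  -- write the PC always when PCWrite is set, or on a taken branch
  PC_en <= PCWrite OR (PCWriteCond AND zero);
END behave;
```

With this gating, a beq whose operands differ leaves `zero` low, so the branch target in ALUOut is not written to the PC.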

    4. Module Specification

    a) ALU Functional Description

    The arithmetic-logic unit (ALU) performs basic arithmetic and logic operations, which are selected by the opcode. The result of the operation is written to the output. An additional zero bit goes high if the result equals zero. At present, the basic arithmetic operations add and sub and the logic operations and, or and slt can be applied to the inputs. The inputs are 16 bits wide and of type unsigned. Detection of overflow or borrow is not supported at the moment.

    b) Memory Functional Description

    Data is synchronously written to or read from the memory with a data bus width of 16 bits. The memory consists of four RAM blocks with a data width of 8 bits each. A control signal enables the memory to be written; otherwise data is only read. To store data in the memory, the data word is subdivided into four bytes, which are written separately to the RAM blocks. Conversely, the single bytes are concatenated to get the data word back. At the moment, it is only possible to read and write whole data words; addressing half-words or single bytes is not allowed. To write or read data words, all RAM blocks have to be selected. Hence, the lowest two bits of the address are not examined by the chip-select logic.

    Data is addressed by the NEO processor with an address width of 16 bits, while the address width of each RAM block is 8 bits. All RAM blocks are connected to the same address. Since we do not use the full address width for addressing and chip selects, each data word can be reached through multiple addresses.

    c) Control Functional Description

    The input to the state machine is the upper 5 bits of the instruction, which contain the opcode. The outputs of the state machine are the control signals for the individual functional units of the processor, especially the multiplexers of the datapath. The operation code of the ALU is stored in a truth table, and the corresponding opcode is produced depending on the ALUOp signal of the state machine.

    d) Data Path Functional Description

    The datapath is divided into four sections with respect to the pipelining structure of a processor. The four parts are Instruction Fetch, Instruction Decode, Execution and Memory Writeback. These sections are synthesized on their own and then combined into the Data Block.

    i) Instruction Fetch Functional Description

    The Instruction Fetch Block contains the PC, the Instruction Register and the Memory Data Register. This part provides the data and instructions from the memory.

    ii) Instruction Decode Functional Description

    The Instruction Decode Block applies the register fields of the instruction held in the Instruction Register to the Register File and computes the second operand for a branch instruction or a sw or lw instruction.

    iii) Execution Functional Description

    The Execution Block contains the ALU as its main element and computes the desired result of the instruction. It also computes the jump target address and provides it to the Memory Writeback Block. The operands loaded into the ALU are chosen by two multiplexers, which are controlled by the signals ALUSrcA and ALUSrcB.

    iv) Memory Writeback Functional Description

    The Memory Writeback Block consists of the ALUOut register and a multiplexer with the select signal PCSource. This block routes the result of the computation either back to memory or to the register file. The multiplexer feeds back the next PC value depending on the PCSource signal.

    e) Processor and Memory Functional Description

    The two parts, Datapath and Controlpath, are combined into the processing unit. Together with the Memory, they complete the whole processor.

  • 3. Uniqueness

    Our aim in this project was to learn and experiment with how a processor works and how we could modify its specifications for our purpose. We did not need many instructions, so we decided to build our own ISA for our processor, NEO.

    We designed our processor for a small number of instructions, which meant that we needed only a few bits. So we devised a compact 16-bit ISA.

    This reduced the burden of unused bits.

    The ISA has no function field; instead we integrated it into the opcode field.

    Our ISA is as follows:

    R-Type Format

    op      rs      rt      rd      shamt
    5 bit   3 bit   3 bit   3 bit   2 bit

    I-Type Format

    op      rs      rt      constant
    5 bit   3 bit   3 bit   5 bit

    J-Type Format

    op      offset
    5 bit   11 bit
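The three formats can be read as different slicings of the same 16-bit word. The sketch below assumes the fields are packed from the most significant bit downwards in the order listed, which matches the signal names `instr_15_11`, `instr_10_8`, `instr_7_5` and `instr_4_0` used in the listings of section 6. The entity and output names are hypothetical.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- Hypothetical entity for illustration; not part of the project sources.
ENTITY field_split_sketch IS
  PORT (
    instr    : IN  STD_ULOGIC_VECTOR(15 DOWNTO 0);
    op       : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);    -- all formats
    rs       : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);    -- R-type and I-type
    rt       : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);    -- R-type and I-type
    rd       : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);    -- R-type only
    shamt    : OUT STD_ULOGIC_VECTOR(1 DOWNTO 0);    -- R-type only
    const5   : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);    -- I-type only
    offset11 : OUT STD_ULOGIC_VECTOR(10 DOWNTO 0));  -- J-type only
END field_split_sketch;

ARCHITECTURE behave OF field_split_sketch IS
BEGIN
  op       <= instr(15 DOWNTO 11);
  rs       <= instr(10 DOWNTO 8);
  rt       <= instr(7 DOWNTO 5);
  rd       <= instr(4 DOWNTO 2);
  shamt    <= instr(1 DOWNTO 0);
  const5   <= instr(4 DOWNTO 0);   -- overlaps rd and shamt
  offset11 <= instr(10 DOWNTO 0);  -- overlaps rs, rt and the low 5 bits
END behave;
```

All slices are produced in parallel; the opcode decides afterwards which of them are meaningful for the current instruction.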

  • 4. Challenges Faced

    First of all, working on this project was great fun, and we really enjoyed working on it as a team. We also realized how a task that looks impossible to an individual becomes much easier when people with different strengths unite and work on it together.

    But we also faced many problems during this term.

    First, there was a time limitation: the burden of a few less relevant courses left less time for more relevant and useful subjects.

    Then, while working on the project, programming the individual components was not that difficult, but implementing the synchronous operation of control and datapath was quite complex for us. Also, integrating the VHDL code of individual components, written by different team members with different naming conventions and variables, led to many errors and much confusion.

    A few technical errors also discouraged us.

    At first we used ModelSim, which was the default simulator, but the code segments did not simulate, which led us to think that there were errors in our code. Later we realized that the problem was not the code but the use of ModelSim. So we switched to ISim for simulation, which removed many warnings and simulation errors.

    We would also recommend using the already available libraries such as numeric_std for arithmetic operations on signed integers, since this makes the work much easier.
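As an illustration of this recommendation, the sketch below uses the `SIGNED` type and its comparison operator from `numeric_std` to implement a set-on-less-than without any hand-written sign handling. The entity name and the fixed 16-bit width are assumptions made for the example; this is not the slt code of the project's ALU.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

-- Hypothetical example of signed arithmetic with numeric_std.
ENTITY slt_sketch IS
  PORT (
    a, b   : IN  STD_ULOGIC_VECTOR(15 DOWNTO 0);
    result : OUT STD_ULOGIC_VECTOR(15 DOWNTO 0));
END slt_sketch;

ARCHITECTURE behave OF slt_sketch IS
BEGIN
  -- interpret both operands as two's-complement numbers and compare them
  result <= STD_ULOGIC_VECTOR(TO_UNSIGNED(1, 16)) WHEN SIGNED(a) < SIGNED(b)
            ELSE (OTHERS => '0');
END behave;
```

Because the comparison is done on `SIGNED` values, an operand such as "1111111111111111" is treated as -1 and is therefore less than 0.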

    There is also a strange fact about the RTL Schematic generated by Xilinx. When we generated our RTL Schematic, we knew that we were not using most of the bits from various components, and Xilinx removed all those components from the RTL Schematic.

    Later, on searching online, we found that this happens because no output from those blocks is defined in the main entity: if a block does not drive any output, Xilinx simply removes it from the RTL Schematic. So whenever you see such behaviour, make sure that every block contributes to an output and that no block is left unused; otherwise it will not be included in the RTL Schematic.

  • 5. Conclusion

    o Our own experiences

    Application of theoretical knowledge learned in VHDL and Computer Architecture

    We learned VHDL coding in the course DIGITAL ELECTRONICS, but we never thought that we would have such a great opportunity to use that knowledge to develop the code for our own processor during our UG studies. When we started, we were worried about how we would be able to read so many pages of books to understand this subject in depth, but we have now all learned how to pace our reading and extract only the material that is important.

    Realization that working as an individual and working as a TEAM are totally different

    Since entering this institute, we had all been working individually. The project under your guidance changed our way of working. We all learned how to work in a group effectively and efficiently, in a world where people have different traits and natures. Working on this project taught us how to achieve effectiveness and results as a combined team effort and not purely on an individual level. So we thank you, sir, for giving us a platform that brought out the best in us.

    Acquaintance with existing processors and their variety for different purposes

    We now know the different technologies used in different processors and can differentiate between their characteristics. We also learned how we could make our NEO more useful and better suited to present-day demands.

    o Scope of Improvement

    Including more operations

    Our main intention in this project was to learn how a processor really works, so we did not focus on including a large number of instructions. Given more time, we would like to include more operations.

    Including pipelining to improve the processor's performance

    Pipelining enhances the performance of a processor, but due to the time limitation we could not add pipelining to our processor.

    Realizing the hardware implementation of the processor on an FPGA Spartan 3 kit

    We were close to realizing the hardware implementation of the processor on an FPGA Spartan 3 kit, but we could not arrange the kit. We even studied the user constraint file required for hardware implementation on the FPGA. But because the kit was unavailable, we could not achieve what we intended.

  • 6. VHDL Code ( RTL Schematic included in folder )

    1. VHDL code for ALU

    LIBRARY IEEE;
    USE IEEE.std_logic_1164.ALL;
    USE IEEE.numeric_std.ALL;
    -- use package
    USE work.procmem_definitions.ALL;

    ENTITY alu IS
      PORT (
        a, b : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
        opcode : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);
        result : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
        zero : OUT STD_ULOGIC);
    END alu;

    ARCHITECTURE behave OF alu IS
    BEGIN
      PROCESS(a, b, opcode)
        -- declaration of variables
        VARIABLE a_uns : UNSIGNED(width-1 DOWNTO 0);
        VARIABLE b_uns : UNSIGNED(width-1 DOWNTO 0);
        VARIABLE r_uns : UNSIGNED(width-1 DOWNTO 0);
        VARIABLE z_uns : UNSIGNED(0 DOWNTO 0);
      BEGIN
        -- initialize values
        a_uns := UNSIGNED(a);
        b_uns := UNSIGNED(b);
        r_uns := (OTHERS => '0');
        z_uns(0) := '0';
        CASE opcode IS
          -- add
          WHEN "010" =>
            r_uns := a_uns + b_uns;
          -- sub
          WHEN "110" =>
            r_uns := a_uns - b_uns;
          -- and
          WHEN "000" =>
            r_uns := a_uns AND b_uns;
          -- or
          WHEN "001" =>
            r_uns := a_uns OR b_uns;
          -- slt
          WHEN "111" =>
            r_uns := a_uns - b_uns;
            IF SIGNED(r_uns) < 0 THEN
              r_uns := TO_UNSIGNED(1, r_uns'LENGTH);
            ELSE
              r_uns := (OTHERS => '0');
            END IF;
          -- others
          WHEN OTHERS => r_uns := (OTHERS => 'X');
        END CASE;
        -- set zero bit if result equals zero
        IF TO_INTEGER(r_uns) = 0 THEN
          z_uns(0) := '1';
        ELSE
          z_uns(0) := '0';
        END IF;
        -- assign variables to output signals

        result <= STD_ULOGIC_VECTOR(r_uns);
        zero <= z_uns(0);
      END PROCESS;
    END behave;

    ALUopcode ALUopcode ALUopcode ALUopcode ALUopcode ALUopcode

  • when others => ALUopcode clk,

        rst_n => rst_n,
        instr_15_11 => instr_15_11,
        RegDst => RegDst,
        RegWrite => RegWrite,
        ALUSrcA => ALUSrcA,
        MemRead => MemRead,
        MemWrite => MemWrite,
        MemtoReg => MemtoReg,
        IorD => IorD,
        IRWrite => IRWrite,
        PCWrite => PCWrite_intern,
        PCWriteCond => PCWriteCond_intern,
        ALUOp => ALUOp_intern,
        ALUSrcB => ALUSrcB,
        PCSource => PCSource
      );

      inst_ALUControl : ALUControl
      PORT MAP (
        instr_4_0 => instr_4_0,
        ALUOp => ALUOp_intern,
        ALUopcode => ALUopcode
      );

    PC_en

        END IF;
      END PROCESS;

      -------------------------------------------------------------------------------
      -- Logic Process
      logic_process : PROCESS(state, instr_15_11)
        -- RegDst RegWrite ALUSrcA MemRead MemWrite MemtoReg IorD IRWrite PCWrite PCWriteCond: 10 x 1 bit
        -- ALUOp ALUSrcB PCSource: 3 x 2 bit
        VARIABLE control_signals : std_ulogic_vector(15 downto 0);
        -- Definition of constants for the value of the Inst_Funct_Field
        Constant LOADWORD : std_ulogic_vector(4 downto 0) := "00011";
        Constant STOREWORD : std_ulogic_vector(4 downto 0) := "01011";
        Constant RTYPE : std_ulogic_vector(4 downto 0) := "00000";
        Constant BEQ : std_ulogic_vector(4 downto 0) := "00100";
        Constant JMP : std_ulogic_vector(4 downto 0) := "00010";
      BEGIN
        CASE state IS
          -- Instruction Fetch
          WHEN InstFetch =>
            control_signals := "0001000110000100";

    next_state

    control_signals := "0000000000001100";

    IF instr_15_11 = LOADWORD OR instr_15_11 = STOREWORD THEN

    next_state

    WHEN BranchCompl =>

    control_signals := "0010000001010001";

    next_state

    control_signals := "0000000010001110";

    next_state

    control_signals := (others => 'X');

    next_state

    ARCHITECTURE behave OF data IS

      COMPONENT data_fetch
        PORT (
          clk : IN STD_ULOGIC;
          rst_n : IN STD_ULOGIC;
          pc_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          alu_out : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          mem_data : IN std_ulogic_vector(width-1 DOWNTO 0);
          PC_en : IN STD_ULOGIC;
          IorD : IN STD_ULOGIC;
          IRWrite : IN STD_ULOGIC;
          reg_memdata : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          instr_15_11 : OUT STD_ULOGIC_VECTOR(4 downto 0);
          instr_10_8 : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
          instr_7_5 : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
          instr_4_0 : OUT STD_ULOGIC_VECTOR(4 downto 0);
          mem_address : OUT std_ulogic_vector(width-1 DOWNTO 0);
          pc_out : OUT std_ulogic_vector(width-1 DOWNTO 0));
      END COMPONENT;

      COMPONENT data_decode
        PORT (
          clk : IN STD_ULOGIC;
          rst_n : IN STD_ULOGIC;
          instr_10_8 : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);
          instr_7_5 : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);
          instr_4_0 : IN STD_ULOGIC_VECTOR(4 downto 0);
          reg_memdata : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          alu_out : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          RegDst : IN STD_ULOGIC;
          RegWrite : IN STD_ULOGIC;
          MemtoReg : IN STD_ULOGIC;
          reg_A : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          reg_B : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          instr_4_0_se : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          instr_4_0_se_sl : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0));
      END COMPONENT;

      COMPONENT data_execution
        PORT (
          instr_10_8 : IN std_ulogic_vector(2 downto 0);
          instr_7_5 : IN std_ulogic_vector(2 downto 0);
          instr_4_0 : IN std_ulogic_vector(4 downto 0);
          ALUSrcA : IN std_ulogic;
          ALUSrcB : IN std_ulogic_vector(1 downto 0);
          ALUopcode : IN std_ulogic_vector(2 downto 0);
          reg_A, reg_B : IN std_ulogic_vector(width-1 downto 0);
          pc_out : IN std_ulogic_vector(width-1 downto 0);
          instr_4_0_se : IN std_ulogic_vector(width-1 downto 0);
          instr_4_0_se_sl : IN std_ulogic_vector(width-1 downto 0);
          jump_addr : OUT std_ulogic_vector(width-1 downto 0);
          alu_result : OUT std_ulogic_vector(width-1 downto 0);
          zero : OUT std_ulogic);
      END COMPONENT;

      COMPONENT data_memwriteback
        PORT (
          clk, rst_n : IN std_ulogic;
          jump_addr : IN std_ulogic_vector(width-1 downto 0);
          alu_result : IN std_ulogic_vector(width-1 downto 0);
          PCSource : IN std_ulogic_vector(1 downto 0);
          pc_in : OUT std_ulogic_vector(width-1 downto 0);
          alu_out : OUT std_ulogic_vector(width-1 downto 0));
      END COMPONENT;

      SIGNAL pc_in_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL alu_out_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL reg_memdata_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL instr_10_8_intern : std_ulogic_vector(2 downto 0);
      SIGNAL instr_7_5_intern : std_ulogic_vector(2 downto 0);
      SIGNAL instr_4_0_intern : std_ulogic_vector(4 downto 0);
      SIGNAL pc_out_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL reg_A_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL reg_B_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL instr_4_0_se_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL instr_4_0_se_sl_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL jump_addr_intern : std_ulogic_vector(width-1 downto 0);
      SIGNAL alu_result_intern : std_ulogic_vector(width-1 downto 0);

    BEGIN

      inst_data_fetch: data_fetch
      PORT MAP (
        clk => clk,
        rst_n => rst_n,
        pc_in => pc_in_intern,
        alu_out => alu_out_intern,
        mem_data => mem_data,
        PC_en => PC_en,
        IorD => IorD,
        IRWrite => IRWrite,
        reg_memdata => reg_memdata_intern,
        instr_15_11 => instr_15_11,
        instr_10_8 => instr_10_8_intern,
        instr_7_5 => instr_7_5_intern,
        instr_4_0 => instr_4_0_intern,
        mem_address => mem_address,
        pc_out => pc_out_intern);

      inst_data_decode : data_decode
      PORT MAP (
        clk => clk,
        rst_n => rst_n,
        instr_10_8 => instr_10_8_intern,
        instr_7_5 => instr_7_5_intern,
        instr_4_0 => instr_4_0_intern,
        reg_memdata => reg_memdata_intern,
        alu_out => alu_out_intern,
        RegDst => RegDst,
        RegWrite => RegWrite,
        MemtoReg => MemtoReg,
        reg_A => reg_A_intern,
        reg_B => reg_B_intern,
        instr_4_0_se => instr_4_0_se_intern,
        instr_4_0_se_sl => instr_4_0_se_sl_intern
      );

      inst_data_execution: data_execution
      PORT MAP (
        instr_10_8 => instr_10_8_intern,
        instr_7_5 => instr_7_5_intern,
        instr_4_0 => instr_4_0_intern,
        ALUSrcA => ALUSrcA,
        ALUSrcB => ALUSrcB,
        ALUopcode => ALUopcode,
        reg_A => reg_A_intern,
        reg_B => reg_B_intern,
        pc_out => pc_out_intern,
        instr_4_0_se => instr_4_0_se_intern,
        instr_4_0_se_sl => instr_4_0_se_sl_intern,
        jump_addr => jump_addr_intern,
        alu_result => alu_result_intern,
        zero => zero
      );

      inst_data_memwriteback : data_memwriteback
      PORT MAP (
        clk => clk,
        rst_n => rst_n,
        jump_addr => jump_addr_intern,
        alu_result => alu_result_intern,
        PCSource => PCSource,
        pc_in => pc_in_intern,
        alu_out => alu_out_intern
      );

    reg_B

    6. VHDL code for data decode

    LIBRARY IEEE;
    USE IEEE.std_logic_1164.ALL;
    USE IEEE.numeric_std.ALL;
    -- use package
    USE work.procmem_definitions.ALL;

    ENTITY data_decode IS
      PORT (
        -- inputs
        clk : IN STD_ULOGIC;
        rst_n : IN STD_ULOGIC;
        instr_10_8 : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);
        instr_7_5 : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);
        instr_4_0 : IN STD_ULOGIC_VECTOR(4 DOWNTO 0);
        reg_memdata : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
        alu_out : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
        -- control signals
        RegDst : IN STD_ULOGIC;
        RegWrite : IN STD_ULOGIC;
        MemtoReg : IN STD_ULOGIC;
        -- outputs
        reg_A : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
        reg_B : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
        instr_4_0_se : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
        instr_4_0_se_sl : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
      );
    END data_decode;

    ARCHITECTURE behave OF data_decode IS

      COMPONENT regfile IS
        PORT (clk,rst_n : IN std_ulogic;
          wen : IN std_ulogic; -- write control
          writeport : IN std_ulogic_vector(width-1 DOWNTO 0); -- register input
          adrwport : IN std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address write
          adrport0 : IN std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address port 0
          adrport1 : IN std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address port 1
          readport0 : OUT std_ulogic_vector(width-1 DOWNTO 0); -- output port 0
          readport1 : OUT std_ulogic_vector(width-1 DOWNTO 0) -- output port 1
        );
      END COMPONENT;

      COMPONENT tempreg IS
        PORT (
          clk : IN STD_ULOGIC;
          rst_n : IN STD_ULOGIC;
          reg_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
          reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0) );
      END COMPONENT;

      -- internal signals
      SIGNAL write_reg : STD_ULOGIC_VECTOR(regfile_adrsize-1 DOWNTO 0);
      SIGNAL write_data : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      SIGNAL data_1 : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      SIGNAL data_2 : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    BEGIN

      A : tempreg
      PORT MAP (
        clk => clk,
        rst_n => rst_n,
        reg_in => data_1,
        reg_out => reg_A );

      B : tempreg
      PORT MAP (
        clk => clk,
        rst_n => rst_n,
        reg_in => data_2,
        reg_out => reg_B );

      inst_regfile : regfile
      PORT MAP (
        clk => clk,
        rst_n => rst_n,
        wen => RegWrite,
        writeport => write_data,
        adrwport => write_reg,
        adrport0 => instr_10_8,
        adrport1 => instr_7_5,
        readport0 => data_1,
        readport1 => data_2 );

    -- multiplexer for write register

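    -- NOTE: the select expression was lost in extraction; taking the rd field

    -- from instr_4_0(4 DOWNTO 2) is an assumption, not confirmed by the source

    write_reg <= instr_7_5 WHEN RegDst = '0' ELSE

    instr_4_0(4 DOWNTO 2) WHEN RegDst = '1' ELSE

    (OTHERS => 'X');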

    -- multiplexer for write data

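    -- MemtoReg = '0' selects the ALU output, '1' the memory data register

    -- (usual multicycle MIPS convention, assumed here)

    write_data <= alu_out WHEN MemtoReg = '0' ELSE

    reg_memdata WHEN MemtoReg = '1' ELSE

    (OTHERS => 'X');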

    -- sign extension and shift

    proc_sign_ext : PROCESS(instr_4_0)

    -- variables needed for reading result of sign extension

    VARIABLE temp_instr_4_0_se : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    VARIABLE temp_instr_4_0_se_sl : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    BEGIN

    -- sign extend instr_4_0 to 32 bits

    temp_instr_4_0_se := STD_ULOGIC_VECTOR(RESIZE(SIGNED(instr_4_0),

    instr_4_0_se'LENGTH));

    -- shift left 2

    temp_instr_4_0_se_sl := temp_instr_4_0_se(width-3 DOWNTO 0) & "00";

    instr_4_0_se <= temp_instr_4_0_se;

    instr_4_0_se_sl <= temp_instr_4_0_se_sl;

  • END PROCESS;

    END behave;

    7. VHDL code for data execution

    LIBRARY IEEE;

    USE IEEE.std_logic_1164.ALL;

    USE IEEE.numeric_std.ALL;

    -- use package

    USE work.procmem_definitions.ALL;

    ENTITY data_execution IS

    PORT (instr_10_8 : IN std_ulogic_vector(2 downto 0);

    instr_7_5 : IN std_ulogic_vector(2 downto 0);

    instr_4_0 : IN std_ulogic_vector(4 downto 0);

    ALUSrcA : IN std_ulogic;

    ALUSrcB : IN std_ulogic_vector(1 downto 0);

    ALUopcode : IN std_ulogic_vector(2 downto 0);

    reg_A, reg_B : IN std_ulogic_vector(width-1 downto 0);

    pc_out : IN std_ulogic_vector(width-1 downto 0);

    instr_4_0_se : IN std_ulogic_vector(width-1 downto 0);

    instr_4_0_se_sl : IN std_ulogic_vector(width-1 downto 0);

    jump_addr : OUT std_ulogic_vector(width-1 downto 0);

    alu_result : OUT std_ulogic_vector(width-1 downto 0);

    zero : OUT std_ulogic

    );

    END data_execution;

    ARCHITECTURE behave OF data_execution IS

    COMPONENT alu

    PORT (

    a, b : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    opcode : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);

    result : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    zero : OUT STD_ULOGIC

    );

    END COMPONENT;

    SIGNAL mux_A_out : std_ulogic_vector(width-1 downto 0);

    SIGNAL mux_B_out : std_ulogic_vector(width-1 downto 0);

    BEGIN

    alu_inst: alu

    PORT MAP (

    a => mux_A_out,

    b => mux_B_out,

    opcode => ALUopcode,

    result => alu_result,

    zero => zero

    );

  • -- Multiplexor for ALU input A:

    mux_A : PROCESS (ALUSrcA, PC_out, reg_A)

    BEGIN

    CASE ALUSrcA IS

    WHEN '0' => mux_A_out <= PC_out;

    WHEN '1' => mux_A_out <= reg_A;

    WHEN OTHERS => mux_A_out <= (OTHERS => 'X');

    END CASE;

    END PROCESS;

    -- Multiplexor for ALU input B:

    mux_B : PROCESS (ALUSrcB, reg_B, instr_4_0_se, instr_4_0_se_sl)

    BEGIN

    CASE ALUSrcB IS

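    WHEN "00" => mux_B_out <= reg_B;

    -- constant PC increment; the value 4 is an assumption (lost in extraction)

    WHEN "01" => mux_B_out <= STD_ULOGIC_VECTOR(TO_UNSIGNED(4, width));

    WHEN "10" => mux_B_out <= instr_4_0_se;

    WHEN "11" => mux_B_out <= instr_4_0_se_sl;

    WHEN OTHERS => mux_B_out <= (OTHERS => 'X');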

    END CASE;

    END PROCESS;

    -- Computation of Jump Address:

    jump_addr

  • instr_4_0 : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);

    mem_address : OUT std_ulogic_vector(width-1 DOWNTO 0);

    pc_out : OUT std_ulogic_vector(width-1 DOWNTO 0)

    );

    END data_fetch;

    ARCHITECTURE behave OF data_fetch IS

    COMPONENT instreg IS

    PORT (

    clk : IN STD_ULOGIC;

    rst_n : IN STD_ULOGIC;

    memdata : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    IRWrite : IN STD_ULOGIC;

    instr_15_11 : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);

    instr_10_8 : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);

    instr_7_5 : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);

    instr_4_0 : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0) );

    END COMPONENT;

    COMPONENT tempreg IS

    PORT (

    clk : IN STD_ULOGIC;

    rst_n : IN STD_ULOGIC;

    reg_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0) );

    END COMPONENT;

    COMPONENT pc IS

    PORT (

    clk : IN STD_ULOGIC;

    rst_n : IN STD_ULOGIC;

    pc_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    PC_en : IN STD_ULOGIC;

    pc_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0) );

    END COMPONENT;

    -- signals for components

    SIGNAL pc_out_intern : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    BEGIN

    -- instances of components

    proc_cnt: pc

    PORT MAP (

    clk => clk,

    rst_n => rst_n,

    pc_in => pc_in,

    PC_en => PC_en,

    pc_out => pc_out_intern);

    instr_reg : instreg

    PORT MAP (

    clk => clk,

    rst_n => rst_n,

    memdata => mem_data,

    IRWrite => IRWrite,

  • instr_15_11 => instr_15_11,

    instr_10_8 => instr_10_8,

    instr_7_5 => instr_7_5,

    instr_4_0 => instr_4_0 );

    mem_data_reg : tempreg

    PORT MAP (

    clk => clk,

    rst_n => rst_n,

    reg_in => mem_data,

    reg_out => reg_memdata );

    -- multiplexer

    addr_mux : PROCESS(IorD, pc_out_intern, alu_out)

    VARIABLE mem_address_temp : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    BEGIN

    IF IorD = '0' THEN

    mem_address_temp := pc_out_intern;

    ELSIF IorD = '1' THEN

    mem_address_temp := alu_out;

    ELSE

    mem_address_temp := (OTHERS => 'X');

    END IF;

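    mem_address <= mem_address_temp;

    END PROCESS;

    -- drive the output port from the internal PC signal

    pc_out <= pc_out_intern;

    END behave;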

  • reg_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)

    );

    END COMPONENT;

    SIGNAL alu_out_internal : std_ulogic_vector(width-1 downto 0);

    BEGIN

    tempreg_inst: tempreg

    PORT MAP (

    clk => clk,

    rst_n => rst_n,

    reg_in => alu_result,

    reg_out => alu_out_internal

    );

    -- Multiplexor for PC source:

    mux : PROCESS (PCSource, ALU_result, ALU_out_internal, jump_addr)

    BEGIN

    CASE PCSource IS

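    WHEN "00" => pc_in <= alu_result;

    WHEN "01" => pc_in <= alu_out_internal;

    WHEN "10" => pc_in <= jump_addr;

    WHEN OTHERS => pc_in <= (OTHERS => 'X');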

    END CASE;

    END PROCESS;

    alu_out <= alu_out_internal;

  • IF rst_n = '0' THEN

    instr_15_11 <= (OTHERS => '0');

    instr_10_8 <= (OTHERS => '0');

    instr_7_5 <= (OTHERS => '0');

    instr_4_0 <= (OTHERS => '0');

    ELSIF RISING_EDGE(clk) THEN

    -- write the output of the memory into the instruction register

    IF(IRWrite = '1') THEN

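    -- bit slices follow the port names (assumed; the original lines were lost)

    instr_15_11 <= memdata(15 DOWNTO 11);

    instr_10_8 <= memdata(10 DOWNTO 8);

    instr_7_5 <= memdata(7 DOWNTO 5);

    instr_4_0 <= memdata(4 DOWNTO 0);

    END IF;

    END IF;

    END PROCESS;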

  • SIGNAL data_out_1 : STD_LOGIC_VECTOR(7 DOWNTO 0);

    SIGNAL data_out_2 : STD_LOGIC_VECTOR(7 DOWNTO 0);

    SIGNAL data_out_3 : STD_LOGIC_VECTOR(7 DOWNTO 0);

    SIGNAL address_0 : STD_LOGIC_VECTOR(7 DOWNTO 0);

    SIGNAL address_1 : STD_LOGIC_VECTOR(7 DOWNTO 0);

    SIGNAL address_2 : STD_LOGIC_VECTOR(7 DOWNTO 0);

    SIGNAL address_3 : STD_LOGIC_VECTOR(7 DOWNTO 0);

    BEGIN

    -- instances of 4 ram blocks

    mem_block0 : ram

    PORT MAP (

    address => address_0,

    data => data_in_0,

    inclock => clk,

    wren_p => wren_p,

    q => data_out_0 );

    mem_block1 : ram

    PORT MAP (

    address => address_1,

    data => data_in_1,

    inclock => clk,

    wren_p => wren_p,

    q => data_out_1 );

    mem_block2 : ram

    PORT MAP (

    address => address_2,

    data => data_in_2,

    inclock => clk,

    wren_p => wren_p,

    q => data_out_2 );

    mem_block3 : ram

    PORT MAP (

    address => address_3,

    data => data_in_3,

    inclock => clk,

    wren_p => wren_p,

    q => data_out_3 );

    -- create a write_enable for instances

    wren_p not used for address

    -- note: ram blocks can be addressed with multiple addresses

    temp_ram_address := mem_address(ram_adrwidth-1+2 DOWNTO 2);

    address_0

  • address_2

  • instr_4_0 : OUT std_ulogic_vector(4 downto 0);

    zero : OUT std_ulogic);

    END COMPONENT;

    -- internal signals for connection of components

    SIGNAL instr_15_11_intern : std_ulogic_vector(4 downto 0);

    SIGNAL instr_4_0_intern : std_ulogic_vector(4 downto 0);

    SIGNAL zero_intern : std_ulogic;

    SIGNAL ALUopcode_intern : std_ulogic_vector(2 downto 0);

    SIGNAL RegDst_intern : std_ulogic;

    SIGNAL RegWrite_intern : std_ulogic;

    SIGNAL ALUSrcA_intern : std_ulogic;

    SIGNAL MemtoReg_intern : std_ulogic;

    SIGNAL IorD_intern : std_ulogic;

    SIGNAL IRWrite_intern : std_ulogic;

    SIGNAL ALUSrcB_intern : std_ulogic_vector(1 downto 0);

    SIGNAL PCSource_intern : std_ulogic_vector(1 downto 0);

    SIGNAL PC_en_intern : std_ulogic;

    BEGIN

    inst_control : control

    PORT MAP (

    clk => clk,

    rst_n => rst_n,

    instr_15_11 => instr_15_11_intern,

    instr_4_0 => instr_4_0_intern,

    zero => zero_intern,

    ALUopcode => ALUopcode_intern,

    RegDst => RegDst_intern,

    RegWrite => RegWrite_intern,

    ALUSrcA => ALUSrcA_intern,

    MemRead => MemRead,

    MemWrite => MemWrite,

    MemtoReg => MemtoReg_intern,

    IorD => IorD_intern,

    IRWrite => IRWrite_intern,

    ALUSrcB => ALUSrcB_intern,

    PCSource => PCSource_intern,

    PC_en => PC_en_intern

    );

    inst_data: data

    PORT MAP (

    clk => clk,

    rst_n => rst_n,

    PC_en => PC_en_intern,

    IorD => IorD_intern,

    MemtoReg => MemtoReg_intern,

    IRWrite => IRWrite_intern,

    ALUSrcA => ALUSrcA_intern,

    RegWrite => RegWrite_intern,

    RegDst => RegDst_intern,

    PCSource => PCSource_intern,

    ALUSrcB => ALUSrcB_intern,

  • ALUopcode => ALUopcode_intern,

    mem_data => mem_data,

    reg_B => reg_B,

    mem_address => mem_address,

    instr_15_11 => instr_15_11_intern,

    instr_4_0 => instr_4_0_intern,

    zero => zero_intern

    );

    END behave;

    13. VHDL code for pc

    LIBRARY IEEE;

    USE IEEE.std_logic_1164.ALL;

    USE IEEE.numeric_std.ALL;

    -- use package

    USE work.procmem_definitions.ALL;

    ENTITY pc IS

    PORT (

    clk : IN STD_ULOGIC;

    rst_n : IN STD_ULOGIC;

    pc_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    PC_en : IN STD_ULOGIC;

    pc_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0) );

    END pc;

    ARCHITECTURE behave OF pc IS

    BEGIN

    proc_pc : PROCESS(clk, rst_n)

    VARIABLE pc_temp : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    BEGIN

    IF rst_n = '0' THEN

    pc_temp := (OTHERS => '0');

    ELSIF RISING_EDGE(clk) THEN

    IF PC_en = '1' THEN

    pc_temp := pc_in;

    END IF;

    END IF;

    pc_out <= pc_temp;

    END PROCESS;

    END behave;

  • ENTITY processor IS

    PORT (clk, rst_n : IN std_ulogic;

    run:in std_ulogic;

    --mem_data : IN std_ulogic_vector;

    data_in1 , mem_address1 : in std_ulogic_vector(width-1 DOWNTO 0);

    MemRead1, MemWrite1 : in std_ulogic;

    --data_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    data_out1 : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)

    );

    END processor;

    ARCHITECTURE behave OF processor IS

    COMPONENT NEO

    PORT (

    clk, rst_n : IN std_ulogic;

    mem_data : IN std_ulogic_vector(width-1 downto 0);

    reg_B, mem_address : OUT std_ulogic_vector(width-1 downto 0);

    MemRead, MemWrite : out std_ulogic

    );

    END COMPONENT;

    COMPONENT memory

    PORT (

    clk : IN STD_ULOGIC;

    rst_n : IN STD_ULOGIC;

    MemRead : IN STD_ULOGIC;

    MemWrite : IN STD_ULOGIC;

    mem_address : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    data_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    data_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0));

    END COMPONENT;

    SIGNAL mem_data : std_ulogic_vector(width-1 downto 0);

    signal reg_B : std_ulogic_vector(width-1 downto 0);

    signal mem_address : std_ulogic_vector(width-1 downto 0);

    signal MemRead : std_ulogic;

    signal MemWrite : std_ulogic;

    signal WrEnable: std_ulogic;

    signal RdEnable: std_ulogic;

    signal addr: STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    signal data_write: STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    signal data_read : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    BEGIN

    process( run, rst_n,MemRead,MemWrite,mem_address,reg_B, mem_data,MemRead1,MemWrite1,mem_address1,data_in1)

    begin

    if (run='1') then

    WrEnable

  • elsif (run='0') then

    WrEnable mem_address,

    MemRead => MemRead,

    MemWrite => MemWrite

    );

    inst_memory : memory

    PORT MAP (

    clk => clk,

    rst_n => rst_n,

    MemRead => RdEnable,

    MemWrite => WrEnable,

    mem_address => addr,

    data_in => data_write,

    data_out => data_read

    );

    data_out1

  • 16. VHDL code for ram

    LIBRARY IEEE;

    USE IEEE.std_logic_1164.ALL;

    USE IEEE.numeric_std.ALL;

    -- use altera_mf library for RAM block

    --LIBRARY altera_mf;

    --USE altera_mf.ALL;

    -- use package

    USE work.procmem_definitions.ALL;

    ENTITY ram IS

    PORT (address : IN std_logic_vector(7 DOWNTO 0);

    data : IN std_logic_vector(7 DOWNTO 0);

    inclock : IN std_logic; -- used to write data in RAM cells

    wren_p : IN std_logic;

    q : OUT std_logic_vector(7 DOWNTO 0));

    END ram;

    ARCHITECTURE rtl OF ram IS

    TYPE MEM IS ARRAY(0 TO 255) OF std_logic_vector(7 DOWNTO 0);

    SIGNAL ram_block : MEM;

    SIGNAL read_address_reg : std_logic_vector(7 DOWNTO 0);

    BEGIN

    PROCESS (inclock)

    BEGIN

    IF rising_edge(inclock) THEN

    IF (wren_p = '1') THEN

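    ram_block(to_integer(unsigned(address))) <= data;

    END IF;

    -- register the read address (standard synchronous RAM template;

    -- this read path is reconstructed and is an assumption)

    read_address_reg <= address;

    END IF;

    END PROCESS;

    q <= ram_block(to_integer(unsigned(read_address_reg)));

    END rtl;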

  • 17. VHDL code for regfile

    LIBRARY IEEE;

    USE IEEE.std_logic_1164.ALL;

    USE IEEE.numeric_std.ALL;

    -- use package

    USE work.procmem_definitions.ALL;

    ENTITY regfile IS

    PORT (clk,rst_n : IN std_ulogic;

    wen : IN std_ulogic; -- write control

    writeport : IN std_ulogic_vector(width-1 DOWNTO 0); -- register input

    adrwport : IN std_ulogic_vector(regfile_adrsize-1 DOWNTO 0);-- address write

    adrport0 : IN std_ulogic_vector(regfile_adrsize-1 DOWNTO 0);-- address port 0

    adrport1 : IN std_ulogic_vector(regfile_adrsize-1 DOWNTO 0);-- address port 1

    readport0 : OUT std_ulogic_vector(width-1 DOWNTO 0); -- output port 0

    readport1 : OUT std_ulogic_vector(width-1 DOWNTO 0) -- output port 1

    );

    END regfile;

    ARCHITECTURE behave OF regfile IS

    SUBTYPE WordT IS std_ulogic_vector(width-1 DOWNTO 0); -- reg word TYPE

    TYPE StorageT IS ARRAY(0 TO regfile_depth-1) OF WordT; -- reg array TYPE

    SIGNAL registerfile : StorageT; -- reg file contents

    BEGIN

    -- perform write operation

    PROCESS(rst_n, clk)

    BEGIN

    IF rst_n = '0' THEN

    FOR i IN 0 TO regfile_depth-1 LOOP

    registerfile(i) <= (OTHERS => '0');

    END LOOP;

    ELSIF rising_edge(clk) THEN

    IF wen = '1' THEN

    registerfile(to_integer(unsigned(adrwport))) <= writeport;

    END IF;

    END IF;

    END PROCESS;

  • ENTITY tempreg IS

    PORT (

    clk : IN STD_ULOGIC;

    rst_n : IN STD_ULOGIC;

    reg_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

    reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)

    );

    END tempreg;

    ARCHITECTURE behave OF tempreg IS

    BEGIN

    temp_reg: PROCESS(clk, rst_n)

    BEGIN

    IF rst_n = '0' THEN

    reg_out <= (OTHERS => '0');

    ELSIF RISING_EDGE(clk) THEN

    -- write register input to output at rising edge

    reg_out <= reg_in;

    END IF;

    END PROCESS;

    END behave;

  • 7. References

    David A. Patterson, John L. Hennessy: Computer Organization and Design - The Hardware/Software Interface, Third Edition

    William Stallings: Computer Organization and Architecture - Designing for Performance, 8th Edition

    M. Morris Mano: Computer System Architecture, 3rd Edition

    Dr. Sanjeev Manhas: VHDL Slides

    http://vhdlguru.blogspot.in/