S3 - Processor hardware implementation (control & datapath)
Hardware Implementation Datapath & Control
(Processor)
Mohammed Ali Abbaker
Yossof Ali Abd-Elgadir
Mohammed AbdAlhakam
December 2010
2
Computer Organization
3
Introduction
• Before designing a machine we must discuss:
– How the logical implementation of the machine will operate ('Datapath').
– How the machine is clocked ('Control').
4
Logical elements 'Functional Units'
• Combinational elements:
- The output depends only on the current inputs.
- The same input always produces the same output.
- Ex. ALU
• Sequential (state) elements:
- The output depends on earlier states.
- They contain memory that saves the state.
- Ex. Register
- A state element can be read at any time.
- The clock determines when a state element is written.
5
Clocking methodology:
- Defines when signals can be read and when they can be written.
- If a signal is written at the same time it is read, the value read could be:
1. The old value.
2. The newly written value.
3. Some mix of the two.
- The clocking methodology is designed to avoid this circumstance.
6
Edge-triggered clocking
• For simplicity, an edge-triggered clocking methodology is assumed, which means that any values stored in the machine are updated only on a clock edge. The state elements all update their internal storage on the clock edge.
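The edge-triggered behaviour can be sketched in a few lines of Python. This is only an illustrative model, not part of the original slides; the class name `Register` and its methods are hypothetical. It shows that a value written during a cycle becomes visible only after the clock edge, so a read in the same cycle still sees the old value.

```python
class Register:
    """Hypothetical model of an edge-triggered state element."""

    def __init__(self, value=0):
        self.value = value   # stored value, readable at any time
        self._next = None    # value staged for the next clock edge

    def read(self):
        return self.value

    def write(self, new_value):
        # The write is only staged; the stored value is unchanged until the edge.
        self._next = new_value

    def clock_edge(self):
        # On the clock edge the staged value replaces the stored value.
        if self._next is not None:
            self.value = self._next
            self._next = None


pc = Register(0)
pc.write(pc.read() + 4)  # read and write within the same cycle
before = pc.read()       # still the old value
pc.clock_edge()
after = pc.read()        # the new value is visible after the edge
```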
7
Basic MIPS Implementation
• This implementation includes a subset of the core MIPS instruction set, drawn from three basic classes:
- The memory-reference instructions: load word (lw) and store word (sw).
- The arithmetic-logical instructions: add, sub, and, or, and slt.
- The control-flow instructions: the conditional branch equal (beq) and the unconditional jump (j).
• It does not include all integer instructions (for example, shift, multiply, and divide are missing), nor any floating-point instructions.
8
Basic MIPS Implementation cont.
• But the key principles used to create a datapath and design the control will be clear.
• Most concepts used to implement the MIPS subset in this seminar and in the next seminar, 'Pipeline', are the same basic ideas used to construct a broad spectrum of computers, from high-performance servers to general-purpose microprocessors to embedded processors.
9
Overview of the implementation:
• For the three classes of MIPS instructions, much of what needs to be done to implement them is the same, independent of the class of the instruction.
• For every instruction, the first two steps are identical:
1. Send the program counter (PC) to the instruction memory and fetch the instruction.
2. Read 1 or 2 registers.
• After these two steps, the action depends on the instruction class. Fortunately, the actions for the three instruction classes are largely the same.
10
Overview of the implementation cont.
3. All the instructions use the ALU except jump (j):
- the memory-reference instructions use it for address calculation.
- the arithmetic-logical instructions use it to execute the operation.
- the branch instruction uses it for comparison.
• The simplicity and regularity of the instruction set simplify the implementation by making the execution of different instruction classes similar.
11
Overview of the implementation cont.
4. The action now depends on the class:
- a memory-reference instruction needs to access the memory to write data (store) or read data (load).
- an arithmetic-logical instruction must write the ALU result back into a register.
- a branch instruction may need to change the next instruction address based on the comparison; otherwise the PC is incremented by 4 to get the next instruction address.
12
High level view of MIPS implementation:
Figure 1:abstract view of MIPS subset implementation with major functional units & connection
13
Figure 1
- The input of a particular unit can come from two sources.
- The sources can't simply be wired together.
- They are connected through an element called a data selector, which chooses one of the multiple sources.
- The data selector is in fact a multiplexer (mux).
- The mux selects among the sources depending on the setting of its control line.
- The control line setting is based on information from the instruction being executed.
14
High level view of MIPS implementation cont.
• Many other units are controlled depending on the type of the instruction:
- The data memory must read on a load and write on a store.
- The register file must be written on a load and on an arithmetic-logical instruction.
- The ALU must perform one of several operations.
15
Figure 2: Basic implementation of MIPS subset with necessary multiplexers & control lines
16
Figure 2
- The top multiplexer controls which value replaces the PC.
- The middle multiplexer steers either the ALU output (arithmetic instructions) or the data memory output (load instructions) to the register file.
- The last mux selects the second ALU input, which comes either from a register (non-immediate arithmetic-logical instructions) or from the offset field of the instruction (immediate operations, load or store, or branch).
• In the following slides more functional units will appear, the number of connections between them will increase, and more control will be needed.
17
Datapath
• The collection of state elements, computation elements, and interconnections that together provide a conduit for the flow and transformation of data in the processor during execution.
18
Control
• The component of the processor that commands the datapath, memory, and I/O devices according to the instructions of the program.
20
Abstract View of the DataPath
• An abstract view of the datapath is shown; let's first look at each of its elements:
[Figure: abstract datapath with the PC, instruction memory, register file, ALU, and data memory]
21
Abstract View of the DataPath
• Arithmetic Logic Unit (ALU): is a digital circuit that performs arithmetic and logical operations.
22
ALU design
• Shown is a single ALU cell, which performs an operation according to its control inputs.
• 32 such cells form the ALU; special attention is given to the MSB cell, from which we can detect overflow and perform the comparison.
23
ALU design
• The four control lines specify the operation (2 LSB bits).
ALU control lines Function
0000 AND
0001 OR
0010 add
0110 subtract
0111 set on less than
1100 NOR
[Figure: ALU symbol with 32-bit inputs A and B, 4-bit ALU operation control, CarryIn, and outputs Result (32-bit), Zero, Overflow, and CarryOut]
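The control-line table above can be read as a small function. The following is a minimal behavioural sketch in Python, not a gate-level model of the 32 cells; the names `alu`, `to_signed`, and `MASK` are illustrative assumptions, and only the Result and Zero outputs are modelled (Overflow and CarryOut are omitted).

```python
MASK = 0xFFFFFFFF  # keep values to 32 bits


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for the 4-bit ALU control codes in the table."""
    a &= MASK
    b &= MASK
    if control == 0b0000:      # AND
        result = a & b
    elif control == 0b0001:    # OR
        result = a | b
    elif control == 0b0010:    # add
        result = (a + b) & MASK
    elif control == 0b0110:    # subtract
        result = (a - b) & MASK
    elif control == 0b0111:    # set on less than (signed compare)
        result = 1 if to_signed(a) < to_signed(b) else 0
    elif control == 0b1100:    # NOR
        result = ~(a | b) & MASK
    else:
        raise ValueError("unsupported ALU control code")
    return result, result == 0
```

For example, `alu(0b0110, 7, 7)` gives a zero result with the Zero flag set, which is how beq uses the ALU for comparison.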
24
Abstract View of the DataPath
• Register file: a state element that consists of a set of registers that can be read and written by supplying the number of the register to be accessed.
25
Register file (reading):
• To read the two source registers out of the 32 registers, each 5-bit register number (Read register 1, Read register 2) is applied to a multiplexer that selects the 32-bit contents of that register (Read data 1, Read data 2).
[Figure: Reg 0 to Reg 31 feeding two multiplexers, one controlled by Read register 1 and one by Read register 2]
26
Register file (writing):
• To select the destination register for the write data out of the 32 registers, its 5-bit register number is decoded; the selected register is written only when 'Write' is high.
[Figure: a 5-to-32 decoder driven by the write register number and gated by Write enables one of Reg 0 to Reg 31; the 32-bit register data feeds the D input of every register]
27
Register file:
• Now we can combine the read and write structures into our complete register file.
[Figure: register file with 5-bit inputs Read register #1, Read register #2, and Write register, 32-bit input Write data, control input Write, and 32-bit outputs Read data 1 and Read data 2]
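The combined read/write behaviour can be sketched as follows. This is an illustrative Python model under stated assumptions: the class name `RegisterFile` and its method names are hypothetical, and clocking is ignored (a write takes effect immediately when the write signal is set).

```python
class RegisterFile:
    """Hypothetical model of a 32 x 32-bit register file with two read ports."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # The two 5-bit register numbers act as the mux selects.
        return self.regs[read_reg1 & 0x1F], self.regs[read_reg2 & 0x1F]

    def write(self, write_reg, write_data, write):
        # The decoder enables exactly one register, and only when Write is high.
        if write:
            self.regs[write_reg & 0x1F] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(9, 42, True)    # Write is high: register 9 is updated
rf.write(10, 7, False)   # Write is low: register 10 keeps its old value
data1, data2 = rf.read(9, 10)
```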
28
Abstract View of the DataPath
• Memory system:
– We can use either a unified memory or two memories. In our case we use two: one that is only read (instructions), and the other for read/write (data).
29
Memory system
• Remember that each of the two memories can hold 2^32 bytes = 4 GB, or 1 G words (1 word = 4 bytes).
• To access memory for data or an instruction, we must supply the address lines with a value that is a multiple of 4; four bytes are then accessed for reading or writing, in a fashion similar to the register file.
[Figure: byte-addressed memory with addresses 0, 1, 2, 3, 4, 5, 6, 7, ..., 2^32 - 1]
30
Abstract View of the DataPath
• Detailed datapath: now we can assemble these elements to execute the program.
[Figure: abstract datapath with the PC, instruction memory, register file, ALU, and data memory]
31
Fetching an Instruction
• Program counter (PC): the register containing the address of the instruction in the program being executed.
• An instruction memory unit holds the instructions that are to be executed.
• We need an adder (an ALU that performs only addition) to calculate the address of the next instruction to fetch.
[Figure: the elements needed for fetch: a. instruction memory (instruction address in, instruction out), b. program counter, c. adder; combined so that the PC drives the read address of the instruction memory while the adder adds 4 to the PC]
32
The Register File access
• For the R-type instructions, read the contents of 2 registers, perform an ALU operation, and write the result back into a third register. The write requires the value to be written, the register number, and the write control signal.
[Figure: instruction bits [25-11] supply the register numbers to the register file (Read register 1, Read register 2, Write register); Read data 1 and Read data 2 feed the ALU (3-bit ALU op; outputs ALU result and Zero), and the ALU result returns as Write data under the Write control]
33
Data Memory
• For the load and store instructions, we need the data memory and a unit that sign-extends the 16-bit constant in an I-type instruction (immediate). In addition, we use the existing ALU to compute the address to access.
– In store:
• Register 1 is used for the address calculation, register 2 holds the data to be written, and MemWrite is set.
– In load:
• Register 1 is used for the address calculation, the write register is the destination for the data from memory, and both MemRead and RegWrite are set.
[Figure: load/store datapath: Instruction [25-21] and [20-16] select the registers, Instruction [15-0] is sign-extended from 16 to 32 bits, the ALU (3-bit ALU operation) computes the data memory address, and the control signals are RegWrite, MemRead, and MemWrite]
34
Branch Equal
• The beq instruction has two registers to compare and a 16-bit offset used to compute the branch address relative to the PC. To implement this instruction we must sign-extend the offset, shift it left by 2, and add it to the incremented PC (PC + 4).
• Sign-extend: to increase the size of a data item by replicating the high-order sign bit of the original data item in the high-order bits of the larger, destination data item.
• The Shift left 2 unit is simply a routing of the signals between input and output that appends 00 to the low-order end of the sign-extended offset field; no actual shift hardware is needed, since the shift amount is constant.
• Control logic decides whether the incremented PC or the branch target replaces the PC, based on the Zero output of the ALU.
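The branch-target arithmetic described above can be checked with a short Python sketch. The function names `sign_extend16`, `branch_target`, and `next_pc` are illustrative assumptions, not names from the slides.

```python
def sign_extend16(x):
    """Sign-extend a 16-bit field to a Python integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def branch_target(pc, offset16):
    """Incremented PC plus the sign-extended offset shifted left by 2."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def next_pc(pc, offset16, branch, zero):
    """Select the branch target only when this is a beq and the ALU reports Zero."""
    if branch and zero:
        return branch_target(pc, offset16)
    return (pc + 4) & 0xFFFFFFFF
```

With `pc = 0x1000` and an offset of 1, the target is `0x1008`; an offset of `0xFFFF` (that is, -1) branches back to `0x1000`.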
35
Branch Equal Diagram
36
Designing complete datapath
• Comparing the three previous slides to build the common datapath, we note:
– For the register file inputs, 'read register 1' is always [25-21] and 'read register 2' is [20-16] (R-type and store), but 'write register' may be [15-11] (R-type) or [20-16] (load), so a mux is needed.
– For the ALU inputs, the first input comes from 'read data 1', while the second may be 'read data 2' (R-type and beq) or the output of the sign-extend unit (sw and lw).
– The data written to the register file may come from the ALU output (R-type) or from memory (load).
– The next PC contents may be the ordinary incremented value or the result of adding the beq offset.
37
Datapath Diagram
[Figure: combined single-cycle datapath with multiplexers and the control lines RegDst, RegWrite, ALUSrc, ALUOp, MemRead, MemWrite, MemtoReg, and PCSrc]
38
Datapath and control
• From the diagram, bearing in mind that the ALU takes 3 control inputs from the ALU control:
000 AND
001 OR
010 add (add, lw, sw)
110 subtract (sub, beq)
111 slt
• So, combining the eight control signals, we get:
39
Datapath & control
[Figure: single-cycle datapath with the main control unit, which decodes Instruction [31-26] into RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite]
40
R-type
[Figure: single-cycle datapath and control in operation for an R-type instruction]
E.g. instruction fields (op, rs, rt, rd, shamt, funct): 0 9 22 9 0 32
41
Load
[Figure: single-cycle datapath and control in operation for a load instruction]
E.g. instruction fields (op, rs, rt, offset): 35 9 8 4
42
Branch equal
[Figure: single-cycle datapath and control in operation for a branch equal instruction]
E.g. instruction fields (op, rs, rt, offset): 4 8 21 1
43
ALU control design
[Figure: single-cycle datapath and control diagram, repeated]
44
ALU control design:
• For the ALU control, the signal coming from the main control should tell it to do one of three things:
– Add (lw or sw)
– Subtract (beq)
– Perform the arithmetic/logic operation selected by the function bits (instruction[5-0])
• So the ALU control has 8 input bits (2 ALUOp bits + 6 function bits) and 3 output bits.
Instruction  ALUOp1 ALUOp0  F5 F4 F3 F2 F1 F0  Operation
lw, sw       0      0       X  X  X  X  X  X   010
beq          X      1       X  X  X  X  X  X   110
add          1      X       X  X  0  0  0  0   010
sub          1      X       X  X  0  0  1  0   110
and          1      X       X  X  0  1  0  0   000
or           1      X       X  X  0  1  0  1   001
slt          1      X       X  X  1  0  1  0   111
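The truth table above, including its don't-care entries, can be written as a small decoding function. This is a sketch in Python; the function name `alu_control` and the table name `FUNCT_TO_OP` are illustrative assumptions.

```python
# Low four function bits (F3..F0) mapped to the 3-bit ALU operation, per the table.
FUNCT_TO_OP = {
    0b0000: 0b010,  # add
    0b0010: 0b110,  # sub
    0b0100: 0b000,  # and
    0b0101: 0b001,  # or
    0b1010: 0b111,  # slt
}


def alu_control(alu_op, funct):
    """Return the 3-bit ALU operation for a 2-bit ALUOp and 6-bit funct field."""
    if alu_op == 0b00:           # lw, sw: add, funct is a don't-care
        return 0b010
    if alu_op & 0b01:            # beq (ALUOp0 = 1): subtract, funct is a don't-care
        return 0b110
    return FUNCT_TO_OP[funct & 0b1111]  # R-type: F5 and F4 are don't-cares
```

For the R-type example with funct = 32 (binary 100000), `alu_control(0b10, 32)` returns 010, i.e. add.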
45
Main control unit design
[Figure: single-cycle datapath and control diagram, repeated]
46
Main Control Unit design
• In this circuit, the control is completely combinational, so we can design it using gates or a PLA.
• From the diagram, we have 6 inputs and 9 outputs, as shown in the truth table:
Control Signal name R-format lw sw beq
input
Op5 0 1 1 0
Op4 0 0 0 0
Op3 0 0 1 0
Op2 0 0 0 1
Op1 0 1 1 0
Op0 0 1 1 0
Outputs
RegDst 1 0 X X
ALUSrc 0 1 1 0
MemtoReg 0 1 X X
RegWrite 1 1 0 0
MemRead 0 1 0 0
MemWrite 0 0 1 0
Branch 0 0 0 1
ALUOp1 1 0 0 0
ALUOp0 0 0 0 1
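Because the main control is purely combinational, the truth table above can be modelled as a lookup keyed by the 6-bit opcode. The sketch below is illustrative Python; the names `CONTROL` and `main_control` are assumptions, and don't-care outputs (X) are represented as `None`.

```python
# One row per opcode (Op5..Op0), with the nine outputs from the truth table.
# ALUOp holds the two bits ALUOp1 ALUOp0; None marks a don't-care (X).
CONTROL = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),      # R-format
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),      # lw
    0b101011: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),      # sw
    0b000100: dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),      # beq
}


def main_control(instruction):
    """Decode Instruction[31-26] and return the control signals for it."""
    opcode = (instruction >> 26) & 0x3F
    return CONTROL[opcode]


# The load example from the earlier slide: fields 35, 9, 8, 4.
lw_word = (35 << 26) | (9 << 21) | (8 << 16) | 4
lw_signals = main_control(lw_word)
```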
47
Adding the Jump Instruction
• For the j instruction, the jump address is formed by concatenating the upper 4 bits of PC+4 with the 26-bit address field of the J-type instruction shifted left by 2 (giving 28 bits).
[Figure: single-cycle datapath extended for jump: Instruction [25-0] is shifted left 2 (26 to 28 bits) and joined with PC+4 [31-28] to form Jump address [31-0]; an additional multiplexer controlled by the new Jump signal selects it as the next PC]
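The jump-address concatenation can be expressed with masks and shifts. This is a minimal Python sketch; the function name `jump_address` is an illustrative assumption.

```python
def jump_address(pc, instruction):
    """Upper 4 bits of PC+4 joined with the 26-bit address field shifted left by 2."""
    address26 = instruction & 0x03FFFFFF   # Instruction[25-0]
    upper4 = (pc + 4) & 0xF0000000         # PC+4 [31-28]
    return upper4 | (address26 << 2)       # the low two bits are always 00
```

Note that the upper bits come from PC+4 rather than PC: a jump at address `0x0FFFFFFC` with address field 1 lands at `0x10000004`.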
48
Performance of Single-Cycle Machines
• The processor we have designed is single-cycle: each instruction takes one clock cycle, since all control signals are applied simultaneously.
• Let's assume that the operation times for the following units are:
– Memory: 2 ns
– ALU and adders: 2 ns
– Register file: 1 ns
• We will assume that MUXs, control, sign-extension, PC accesses, and wires have no delay.
• Which implementation is faster?
1. Every instruction operates in 1 clock cycle of fixed length.
2. Every instruction operates in a clock cycle of varying length.
• Let's look at the time needed by each instruction class (in ns):
Inst.    Fetch  Reg. Rd  ALU op  Memory  Reg. Wr  Total
R-type   2      1        2       0       1        6
Load     2      1        2       2       1        8
Store    2      1        2       2       0        7
Branch   2      1        2       0       0        5
Jump     2      0        0       0       0        2
49
Fixed vs. Variable Cycle Length
• Let's assume a program has the following instruction mix: 24% loads, 12% stores, 44% R-type, 18% branches, 2% jumps.
• CPU execution time = Instruction count * Cycle time
• For the fixed cycle length, the cycle time is 8 ns, long enough for the longest instruction (load). Thus each instruction takes 8 ns to execute.
• For the variable cycle time, the average CPU clock cycle is:
8*24% + 7*12% + 6*44% + 5*18% + 2*2% = 6.34 ns ≈ 6.3 ns
• The variable-clock implementation is clearly faster, but it is extremely hard to implement.
• So why not use the single-cycle implementation, which is only about 8/6.3 ≈ 1.27 times slower?
• When instructions such as multiply and divide are added, which can take the equivalent of tens of cycles, the fixed cycle has to be stretched for every instruction and this scheme becomes too slow (so single-cycle is not used).
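The comparison above is simple arithmetic and can be reproduced directly. The Python sketch below uses the per-class times and the instruction mix from these two slides; the variable names are illustrative assumptions.

```python
# Time per instruction class in ns, and the assumed instruction mix.
latency_ns = {"load": 8, "store": 7, "r_type": 6, "branch": 5, "jump": 2}
mix = {"load": 0.24, "store": 0.12, "r_type": 0.44, "branch": 0.18, "jump": 0.02}

# Fixed cycle: every instruction pays for the slowest one.
fixed_cycle_ns = max(latency_ns.values())

# Variable cycle: weighted average of the per-class times.
variable_avg_ns = sum(latency_ns[k] * mix[k] for k in mix)

# How many times slower the fixed-cycle design is.
slowdown = fixed_cycle_ns / variable_avg_ns
```

With the unrounded average of 6.34 ns the ratio is about 1.26; using the rounded 6.3 ns gives about 1.27.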
50
A Multicycle Implementation
• To shorten the clock cycle, instructions can be executed over several cycles by breaking each instruction into steps (one cycle per step). Note that the cycle period is fixed, but the CPI is not.
• The multicycle implementation allows a functional unit to be used more than once per instruction, as long as it is used on different clock cycles.
[Figure: high-level multicycle datapath with the PC, a single memory for instructions and data, the instruction register, the memory data register, the register file, registers A and B, the ALU, and ALUOut]
We now have only a single memory unit and a single ALU. In addition, we need registers to hold the output of each stage.
51
New Registers and MUXs
• We have now added several new registers (which are transparent to the programmer) and some new MUXs:
– Instruction Register (IR) - the instruction fetched
– Memory Data Register (MDR) - data read from memory
– A, B - values read from the register file
– ALUOut - result of the ALU operation
• The new MUXs added are:
– An additional MUX on the 1st ALU input, which chooses between the A register and the PC.
– The MUX on the 2nd ALU input is changed from a 2-way to a 4-way MUX. The additional inputs are the constant 4 (used to increment the PC) and the sign-extended and shifted offset field (used in beq).
52
Multicycle Diagram
• IR needs a write control signal, but the other new registers don't.
• A MUX selects between 2 sources for the memory address; memory needs a read signal.
• PC and A feed one ALU input; four sources feed the other input.
[Figure: multicycle datapath with the control lines IorD, MemRead, MemWrite, IRWrite, RegDst, RegWrite, MemtoReg, ALUSrcA, ALUSrcB, and ALUOp]
53
Multicycle Datapath & control
• There are 3 possible sources for the PC value:
– The output of the ALU, which is PC+4;
– The register ALUOut, which holds the computed branch target address;
– The lower 26 bits of the IR shifted left by 2, concatenated with the 4 upper bits of the PC.
• Two write control lines are needed for the PC: PCWrite and PCWriteCond.
[Figure: complete multicycle datapath and control unit; control outputs: PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst; control input: Op [5-0] from Instruction [31-26]]
54
1) Instruction Fetch
Fetch the instruction from memory and compute the address of the next sequential instruction:
IR = Memory[PC];
PC = PC + 4;
[Figure: multicycle datapath and control diagram, repeated]
55
2) Instruction Decode (ID) and register fetch
Get the registers from the register file and compute the potential branch address (even if it turns out not to be needed):
A = Reg[IR[25-21]];
B = Reg[IR[20-16]];
ALUOut = PC + (sign-extend(IR[15-0]) << 2);
[Figure: multicycle datapath and control diagram, repeated]
56
3) Execution (EX), Memory address computation or branch completion
In this stage the operation is determined by the instruction class:
A. Memory reference: ALUOut = A + sign-extend(IR[15-0]);
B. R-type: ALUOut = A op B;
C. Branch: if (A == B) PC = ALUOut;
D. Jump: PC = PC[31-28] cat (IR[25-0] << 2);
[Figure: multicycle datapath and control diagram, repeated]
60
4) Memory access or R-type completion
During this step the load/store instruction accesses memory, or the arithmetic-logical instruction writes its result.
A. Memory reference:
MDR = Memory[ALUOut]; (load)
Memory[ALUOut] = B; (store)
B. R-type: Reg[IR[15-11]] = ALUOut;
[Figure: multicycle datapath and control diagram, repeated]
61
5) Memory read completion step:
The load completes by writing the value from memory into a register.
Reg[IR[20-16]] = MDR;
[Figure: multicycle datapath and control diagram, repeated]
62
Summary of the Steps
Actions per step for R-type instructions, memory-reference instructions, branches, and jumps:
• Instruction fetch (all classes):
IR = Memory[PC]
PC = PC + 4
• Instruction decode/register fetch (all classes):
A = Reg[IR[25-21]]
B = Reg[IR[20-16]]
ALUOut = PC + (sign-extend(IR[15-0]) << 2)
• Execution, address computation, branch/jump completion:
- R-type: ALUOut = A op B
- Memory reference: ALUOut = A + sign-extend(IR[15-0])
- Branch: if (A == B) then PC = ALUOut
- Jump: PC = PC[31-28] || (IR[25-0] << 2)
• Memory access or R-type completion:
- R-type: Reg[IR[15-11]] = ALUOut
- Load: MDR = Memory[ALUOut]
- Store: Memory[ALUOut] = B
• Memory read completion:
- Load: Reg[IR[20-16]] = MDR
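Counting the steps in the summary gives the number of cycles each instruction class needs, from which an average CPI follows. The Python sketch below reuses the instruction mix assumed on the earlier "Fixed vs. Variable Cycle Length" slide; applying that mix here is an assumption made for illustration, and the variable names are hypothetical.

```python
# Cycles per instruction class, counted from the step summary:
# load uses all five steps, store and R-type finish in step 4,
# branch and jump finish in step 3.
steps = {"load": 5, "store": 4, "r_type": 4, "branch": 3, "jump": 3}

# Instruction mix taken from the earlier single-cycle comparison (assumed here).
mix = {"load": 0.24, "store": 0.12, "r_type": 0.44, "branch": 0.18, "jump": 0.02}

# Average cycles per instruction for the multicycle design.
cpi = sum(steps[k] * mix[k] for k in steps)
```

This illustrates the earlier note that the cycle period is fixed while the CPI varies with the instruction mix.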
63
Hardwired control unit design
[Figure: finite state diagram with ten states, entered at state 0 (Start)]
0 Instruction Fetch
1 Instruction Decode
2 Address Computation (Load + Store)
3 Memory Read (Load)
4 Write Back
5 Memory Write (Store)
6 Execution (R-type)
7 R-Type Completion
8 Branch Completion (Branch)
9 Jump Completion (Jump)
64
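[Figure: the same state diagram annotated with the control signals set in each state]
State 0 (Instruction Fetch): MemRead, ALUSelA=0, IorD=0, IRWrite, ALUSelB=01, ALUOp=00, PCWrite, PCSource=00
State 1 (Instruction Decode): ALUSelA=0, ALUSelB=11, ALUOp=00, TargetWrite
State 2 (Address Computation): ALUSelA=1, ALUSelB=10, ALUOp=00
State 3 (Memory Read): MemRead, ALUSelA=1, IorD=1, ALUSelB=10, ALUOp=00
State 4 (Write Back): MemRead, ALUSelA=1, IorD=1, RegWrite, MemtoReg=1, RegDst=0, ALUSelB=10, ALUOp=00
State 5 (Memory Write): MemWrite, ALUSelA=1, IorD=1, ALUSelB=10, ALUOp=00
State 6 (Execution): ALUSelA=1, ALUSelB=00, ALUOp=10
State 7 (R-Type Completion): ALUSelA=1, RegDst=1, RegWrite, MemtoReg=0, ALUSelB=0, ALUOp=10
State 8 (Branch Completion): ALUSelA=1, ALUSelB=00, ALUOp=01, PCWriteCond, PCSource=01
State 9 (Jump Completion): PCWrite, PCSource=10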
65
Control unit design:
• The input to the circuit is the opcode, IR[31-26].
• From the previous diagram there are 10 states, so 4 flip-flops (a 4-bit state register) are needed; the current state, together with the inputs, determines the next state and the outputs.
66
• The table on the next slide (derived from the state diagram) is written to simplify the design (instead of using a state table). We can note the following:
• All control outputs are determined directly by the current state, for example:
– PCWrite = 1 at s3s2s1s0 = 0000 or 1001
• Each of the ten next-state signals is determined by the inputs and the ten current states, e.g.
– NextState2 = 1 at s3s2s1s0 = 0001 and (Op = 100011 or 101011)
• The 4 next-state bits NS3-NS0 can be determined accordingly, e.g.
– NS0 = 1 when NextState1, 3, 5, 7 or 9 is true
67
Output       Current states                               Op
PCWrite      state0 + state9
PCWriteCond  state8
IorD         state3 + state5
MemRead      state0 + state3
MemWrite     state5
IRWrite      state0
MemtoReg     state4
PCSource1    state9
PCSource0    state8
ALUOp1       state6
ALUOp0       state8
ALUSrcB1     state1 + state2
ALUSrcB0     state0 + state1
ALUSrcA      state2 + state6 + state8
RegWrite     state4 + state7
RegDst       state7
NextState0   state4 + state5 + state7 + state8 + state9
NextState1   state0
NextState2   state1                                       (Op = 'lw') + (Op = 'sw')
NextState3   state2                                       (Op = 'lw')
NextState4   state3
NextState5   state2                                       (Op = 'sw')
NextState6   state1                                       (Op = 'R-type')
NextState7   state6
NextState8   state1                                       (Op = 'beq')
NextState9   state1                                       (Op = 'j')
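The next-state rows of the table can be exercised with a small Python sketch of the state machine. The function names `next_state` and `trace` are illustrative assumptions; the lw, sw, and beq opcodes are the ones used in the main control truth table, while the opcode value for j (000010) is the standard MIPS encoding and does not appear in the slides.

```python
LW, SW, RTYPE, BEQ, J = 0b100011, 0b101011, 0b000000, 0b000100, 0b000010


def next_state(state, op):
    """Next-state function following the NextState rows of the table."""
    if state == 0:
        return 1                      # NextState1 = state0
    if state == 1:                    # decode: dispatch on the opcode
        if op in (LW, SW):
            return 2
        if op == RTYPE:
            return 6
        if op == BEQ:
            return 8
        if op == J:
            return 9
        raise ValueError("opcode not in the implemented subset")
    if state == 2:                    # address computed: load or store?
        if op == LW:
            return 3
        if op == SW:
            return 5
        raise ValueError("state 2 is only reached by lw or sw")
    if state == 3:
        return 4                      # NextState4 = state3
    if state == 6:
        return 7                      # NextState7 = state6
    if state in (4, 5, 7, 8, 9):
        return 0                      # instruction complete, fetch the next one
    raise ValueError("unknown state")


def trace(op):
    """List the states visited by one instruction, starting at fetch."""
    states = [0]
    while True:
        s = next_state(states[-1], op)
        if s == 0:
            return states
        states.append(s)
```

For instance, `trace(LW)` visits states 0, 1, 2, 3, 4, matching the five steps a load needs.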
68
PLA implementation for control logic part
69
Sequential Control Design:
• Several possible initial representations, sequencing control and logic representations, and control implementations; all may be determined independently.
                     "hardwired control"            "microprogrammed control"
Initial Rep.         Finite State Diagram           Microprogram
Sequencing Control   Explicit Next State Function   Microprogram Counter + Dispatch ROMs
Logic Rep.           Logic Equations                Truth Tables
Implementation       PLA                            ROM
75
References:
• Computer Organization and Design, 3E (John Hennessy & David Patterson)
• Logic and Computer Design Fundamentals, 3E (M. Morris Mano & Charles Kime)
76