cs152 – computer architecture and engineering lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · cs...

51
CS 152 L06 Single Cycle 1 (1) UC Regents Fall 2004 © UCB 2004-09-16 John Lazzaro (www.cs.berkeley.edu/~lazzaro) Dave Patterson (www.cs.berkeley.edu/~patterson) www-inst.eecs.berkeley.edu/~cs152/ CS152 – Computer Architecture and Engineering Lecture 6 – Single Cycle + Design Notebook

Upload: others

Post on 19-Jan-2020

16 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (1) UC Regents Fall 2004 © UCB

2004-09-16

John Lazzaro(www.cs.berkeley.edu/~lazzaro)

Dave Patterson (www.cs.berkeley.edu/~patterson)

www-inst.eecs.berkeley.edu/~cs152/

CS152 – Computer Architecture andEngineering

Lecture 6 – Single Cycle + Design Notebook

Page 2: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (2) UC Regents Fall 2004 © UCB

Review

Xilinx: Physical, yet configurable

Timing key concept: critical path

Page 3: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (3) UC Regents Fall 2004 © UCB

Outline°Single clock cycle per instruction=> Data path resource used at most once per instruction=> Need to replicate some data path resources if need more than once: memory, ALU/adders, …

°Timing, Testing of Single Cycle

°Single Cycle Datapath• 5 generic steps (on next slide)

• See how far we get this lecture; finish next time

°Design Notebook

Page 4: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB

How to Design a Processor: step-by-step°1. Analyze instruction set => datapath requirements

• the meaning of each instruction is given by the register transfers• datapath must include storage element for ISA registers

- possibly more

• datapath must support each register transfer

°2. Select set of datapath components and establish clocking methodology

°3. Assemble datapath meeting the requirements

°4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.

°5. Assemble the control logic

Page 5: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (5) UC Regents Fall 2004 © UCB

The MIPS Instruction Formats° All MIPS instructions are 32 bits long. The three instruction

formats:

• R-type

• I-type

• J-type

° The different fields are:• op: operation of the instruction• rs, rt, rd: the source and destination register specifiers• shamt: shift amount• funct: selects the variant of the operation in the “op” field• address / immediate: address offset or immediate value• target address: target address of the jump instruction

op target address02631

6 bits 26 bits

op rs rt rd shamt funct061116212631

6 bits 6 bits5 bits5 bits5 bits5 bits

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

Page 6: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (6) UC Regents Fall 2004 © UCB

Step 1a: The MIPS-lite Subset for today

° ADD and SUB• addU rd, rs, rt• subU rd, rs, rt

° OR Immediate:• ori rt, rs, imm16

° LOAD and STORE Word• lw rt, rs, imm16• sw rt, rs, imm16

° BRANCH:• beq rs, rt, imm16

op rs rt rd shamt funct061116212631

6 bits 6 bits5 bits5 bits5 bits5 bits

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

Page 7: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (7) UC Regents Fall 2004 © UCB

Lectures vs. Chapter 5 COD 3/e

Lectures

°MIPS-lite subset:• Addu, Subu, LW, SW• BEQ, ORi

Book

°MIPS-lite subset:• Add, Sub, LW, SW• BEQ, OR• AND, SLT, J

°Difficult complaint for textbook author: Lecturing directly from the textbook!

°Good news: These lectures differ from book

Page 8: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (8) UC Regents Fall 2004 © UCB

Using Hardware Description Lang. °All start by fetching the instruction

op | rs | rt | rd | shamt | funct <= MEM[ PC ]

op | rs | rt | Imm16 <= MEM[ PC ]

inst HDL description

ADDU R[rd] <= R[rs] + R[rt]; PC <= PC + 4

SUBU R[rd] <= R[rs] – R[rt]; PC <= PC + 4

ORi R[rt] <= R[rs] | zero_ext(Imm16); PC <= PC + 4

LOAD R[rt] <= MEM[ R[rs] + sign_ext(Imm16)]; PC <= PC + 4

STORE MEM[ R[rs] + sign_ext(Imm16) ] <= R[rt]; PC <= PC + 4

BEQ if ( R[rs] == R[rt] ) PC <= PC + 4 +{sign_ext(Imm16), 2’b00 }

else PC <= PC + 4

Page 9: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (9) UC Regents Fall 2004 © UCB

Step 1: Requirements of the Instruction Set°Memory

• One for instructions, one for data

°Registers (32 x 32bit)• Read RS

• Read RT

• Write RT or RD

°PC

°Sign Extender (for immediate field)

°Add and Sub register or Extended Immediate

°Add 4 or Extended Immediate to PC

Page 10: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (10) UC Regents Fall 2004 © UCB

Step 2: Components of the Datapath

°Combinational Logic Elements

°Storage Elements (“State”)• Clocking methodology

Page 11: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (11) UC Regents Fall 2004 © UCB

Combinational Logic Elements (Basic Building Blocks)

°Adder

°MUX(multi-plexor)

°ALU

32

32

A

B32 Sum

Carry

32

32

A

B32 Result

OP

32A

B 32

Y32

Select

Adder

MU

XA

LU

CarryIn(to add values)

(to chose between values)

(to do add, subtract, or)

Page 12: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (12) UC Regents Fall 2004 © UCB

Storage Element: Register (Basic Building Block)

°Register•Similar to the D Flip Flop except

- N-bit input and output- Write Enable input

•Write Enable:- negated (0): Data Out will not change- asserted (1): Data Out will become Data In

Clk

Data In

Write Enable

N NData Out

Page 13: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (13) UC Regents Fall 2004 © UCB

Storage Element: Register File°Register File consists of 32 registers:

• Two 32-bit output busses:busA and busB

• One 32-bit input bus: busW

°Register is selected by:• RA (number) selects the register to put on busA (data)• RB (number) selects the register to put on busB (data)• RW (number) selects the register to be written

via busW (data) when Write Enable is 1

°Clock input (CLK) • The CLK input is a factor ONLY during write operation• During read operation, Register File behaves as a

combinational logic block:- RA or RB valid => busA or busB valid after “access time.”

Clk

busW

Write Enable

3232

busA

32busB

5 5 5RWRA RB

32 32-bitRegisters

Page 14: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (14) UC Regents Fall 2004 © UCB

Storage Element: Idealized Memory

°Memory (idealized)• One input bus: Data In• One output bus: Data Out

°Memory word is selected by:• Address selects the word to put on Data Out• Write Enable = 1: address selects the memory

word to be written via the Data In bus

°Clock input (CLK) • The CLK input is a factor ONLY during write operation• During read operation, behaves as a combinational logic

block:- Address valid => Data Out valid after “access time.”

Clk

Data In

Write Enable

32 32DataOut

Address

Page 15: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (15) UC Regents Fall 2004 © UCB

Administrivia

°Last mini-lab Friday

°Lab #2 Design Document due Friday• Should already have groups

°Reading: Sections 5.1 to 5.4, 5.8, Appendix B of COD 3e

Page 16: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (16) UC Regents Fall 2004 © UCB

Computers In The News°“Intel shifts focus to multicore chip performance” 9/7/04 S.F. Chronicle

°Intel Corp., which has led the charge in the gigahertz race, is expected to change its tune… Excess heat in PC microprocessors has become an increasingly difficult problem to solve as chipmakers crank up the frequency of the tiny transistors to boost performance. Dual core will be part of Intel's big theme ... which shows Intel is changing direction from focusing on faster and faster gigahertz to other features," he said. Chips with multiple cores are not new. Both IBM and Sun have such processors for larger server computers and AMD last week introduced its dual-core CPU.

Page 17: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (17) UC Regents Fall 2004 © UCB

Review: Clocking Methodology

° All storage elements are clocked by the same clock edge

° Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew

° (CLK-to-Q + Shortest Delay Path - Clock Skew) > Hold Time

Clk

Don’t CareSetup Hold

.

.

.

.

.

.

.

.

.

.

.

.

Setup Hold

Page 18: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (18) UC Regents Fall 2004 © UCB

Step 3: Assemble Datapath meeting our requirements

°HDL (Verilog) Requirements⇒ Datapath Assembly

°Instruction Fetch

°Read Operands and Execute Operation

Page 19: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (19) UC Regents Fall 2004 © UCB

3a: Overview of the Instruction Fetch Unit

°The common operations• Fetch the Instruction: mem[PC]• Update the program counter:

- Sequential Code: PC <= PC + 4 - Branch and Jump: PC <= “something else”

32

Instruction WordAddress

InstructionMemory

PCClk

Next AddressLogic

Page 20: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (20) UC Regents Fall 2004 © UCB

3b: Add & Subtract°R[rd] <= R[rs] op R[rt]

Example: addU rd, rs, rt• Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields• ALUctr and RegWr: control logic after decoding the

instruction

32Result

ALUctr

Clk

busW

RegWr

3232

busA

32busB

5 5 5

Rw Ra Rb

32 32-bitRegisters

Rs RtRd

AL

Uop rs rt rd shamt funct

061116212631

6 bits 6 bits5 bits5 bits5 bits5 bits

Page 21: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (21) UC Regents Fall 2004 © UCB

Register-Register Timing: One complete cycle

32Result

ALUctr

Clk

busW

RegWr

3232

busA

32busB

5 5 5

Rw Ra Rb

32 32-bitRegisters

Rs RtRd

AL

U

Clk

PC

Rs, Rt, Rd,Op, Func

Clk-to-Q

ALUctr

Instruction Memory Access Time

Old Value New ValueDelay through Control Logic

busA, BRegister File Access Time

Old Value New Value

busWALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

Register WriteOccurs Here

RegWr Old Value New Value

Page 22: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (22) UC Regents Fall 2004 © UCB

3c: Logical Operations with Immediate°R[rt] <= R[rs] op ZeroExt[imm16] ]

32

Result

ALUctr

Clk

busW

RegWr

3232

busA

32busB

5 5 5

Rw Ra Rb32 32-bitRegisters

Rs

ZeroE

xt

Mux

RtRdRegDst Mux

3216imm16

ALUSrc

AL

U

11

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits rd?

immediate016 1531

16 bits16 bits0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Rt?

Additions to prior Data-path

Why?

Why?

Why?

R-type and I-type different field for destination

Reg file busB or Immediate

0-extend immed.

Page 23: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (23) UC Regents Fall 2004 © UCB

3d: Load Operations°R[rt] <= Mem[R[rs] + SignExt[imm16]]

Example: lw rt, rs, imm16

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

5 5 5

Rw Ra Rb32 32-bitRegisters

Rs

RtRdRegDst

Extender

Mux

Mux

3216

imm16

ALUSrc

ExtOp

Clk

Data InWrEn

32

Adr

DataMemory

32

AL

U

MemWr Mux

W_Src

??

Rt?Why?

Chose ALU output or Data Memory output

Why?Separate memory for dataWhy?

0 or sign extend

Page 24: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (24) UC Regents Fall 2004 © UCB

3e: Store Operations°Mem[ R[rs] + SignExt[imm16] ] <= R[rt]

Example: sw rt, rs, imm16op rs rt immediate

016212631

6 bits 16 bits5 bits5 bits

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Extender

Mux

Mux

3216imm16

ALUSrcExtOp

Clk

Data InWrEn

32Adr

DataMemory

MemWr

AL

U

32

Mux

W_Src

Why?

Reg file bus B as Data Memory input

Page 25: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (25) UC Regents Fall 2004 © UCB

3f: The Branch Instruction

°beq rs, rt, imm16

• mem[PC] Fetch the instruction from memory

• Equal <= (R[rs] == R[rt]) Calculate the branch condition

• if (Equal) Calculate the next instruction’s address- PC <= PC + 4 + { SignExt(imm16) , 2b00 }

• else- PC <= PC + 4

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

Page 26: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (26) UC Regents Fall 2004 © UCB

Datapath for Branch Operations°beq rs, rt, imm16

Datapath generates condition (equal)

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

32

imm16

PCClk

00

Adder

Mux

Adder

4PCSrc

Clk

busW

RegWr

32

busA

32busB

5 5 5

Rw Ra Rb32 32-bitRegisters

Rs Rt

Equ

al?

Cond

PC E

xt

Inst Address

Why?

Chose PC+4 orPC+4+{SignExtImm, 00} Why?

Test if Reg file busA equals Reg file busB

Page 27: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (27) UC Regents Fall 2004 © UCB

Putting it All Together: A Single Cycle Datapathim

m16

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Extender

Mux

3216imm16

ALUSrcExtOp

Mux

MemtoReg

Clk

Data InWrEn32 Adr

DataMemory

MemWrA

LU

Equal

Instruction<31:0>

0

1

0

1

01

<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRtRs

=

Adder

Adder

PC

Clk

00Mux

4

PCSrc

PC E

xt

Adr

InstMemory

Page 28: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (28) UC Regents Fall 2004 © UCB

Lectures vs. Chapter 5 3/eLectures

°MIPS-lite subset:• AddU, SubU, LW, SW• BEQ, ORI

°Control lines names• MemtoReg, PCSrc,

ALUSrc, RegDst• MemWr, RegWr• ExtOp (zero extend or

sign extend)• ALUctr 3 bits (no NOR)• MemWr==0 => MemRead

Book

°MIPS-lite subset:• Add, Sub, LW, SW• BEQ, OR• AND, SLT, J

°Control lines names• MemtoReg, PCSrc,

ALUSrc, RegDst• MemWrite, RegWrite• No ExtOp since subset

address just sign extend• ALUoperation 4 bits• MemRead & MemWrite

Page 29: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (29) UC Regents Fall 2004 © UCB

Step 5: Implement the control

Next Time!

Page 30: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (30) UC Regents Fall 2004 © UCB

Types of Bugs in Single Cycle Lab 2?

°Which Epoch most likely to uncover bugs for Single Cycle Design?

°Which Epoch would you expect for Pipelined Design?

°Which tests from Single Cycle Design can be reused for Pipelined Design?

°Suggestions for a good your plan?

Page 31: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (31) UC Regents Fall 2004 © UCB

Recap Test Plan: The testing timeline

complete processor

testing

Top-downtesting

Bottom-uptesting

unit testing

processortesting

withself-checks

Which testing types are good for each epoch?

processorassemblycomplete

correctlyexecutes

singleinstructions

correctlyexecutes

shortprograms

Time

Epoch 1 Epoch 2 Epoch 3 Epoch 4unit testing

early

multiunit

testing

latermulti-unit

testing

processortesting

withself-checks

multi-unit testing

unit testing

diagnostics

complete processor

testing

verificationprocessor

testingwith

self-checks

diagnostics

processortesting

withself-checks

multi-unit testing

unit testing

diagnostics

Page 32: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (32) UC Regents Fall 2004 © UCB

Peer Evaluation Question Assumptions

° For the next few slides in this lecture assume that

1. Delay to Access Register File ≅≅≅≅ Delay to do 32-bit add≅≅≅≅ Delay to do 32-bit ALU operation

2. Multiplexor delays 0 (ignore for now)

Page 33: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (33) UC Regents Fall 2004 © UCB

A. PC’s Clk-to-QB. Instruction Memory’s Access TimeC. Register File’s Access TimeD. ALU to Perform a 32-bit OperationE. PC Adder1 adds 4 to PCF. PC Ext sign extends immediateG. PC Adder2 adds imm to Adder1 sumH. Data Memory Access TimeI. Setup Time for PCJ. Setup Time for Register File WriteK. Clock Skew

Peer Instruction: What is critical path for BEQ?im

m16

32

Clk

busW

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

Rd

Extender

Mux

3216imm16

Mux

Clk

Data InWrEn32 Adr

DataMemory

AL

U

Equal

0

1

0

1

01

=

Adder1

Adder2

PC

Clk

00Mux

4

PC E

xt

InstMemory

Wr

1. A, B, C, D, I, K 2. A, B, C, D, G, I, K3. A, B, C, D, E, F, G, I, J, K4. A, B, C, D, E, F, G, H, I, J, K

Page 34: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (34) UC Regents Fall 2004 © UCB

A. PC’s Clk-to-QB. Instruction Memory’s Access TimeC. Register File’s Access TimeD. ALU to Perform a 32-bit OperationE. PC Adder1 adds 4 to PCF. PC Ext sign extends immediateG. PC Adder2 adds imm to Adder1 sumH. Data Memory Access TimeI. Setup Time for PCJ. Setup Time for Register File WriteK. Clock Skew

Peer Instruction: What is critical path for BEQ?im

m16

32

Clk

busW

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

Rd

Extender

Mux

3216imm16

Mux

Clk

Data InWrEn32 Adr

DataMemory

AL

U

Equal

0

1

0

1

01

=

Adder1

Adder2

PC

Clk

00Mux

4

PC E

xt

InstMemory

Wr

1. A, B, C, D, I, K 2. A, B, C, D, G, I, K3. A, B, C, D, E, F, G, I, J, K4. A, B, C, D, E, F, G, H, I, J, K

Page 35: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (35) UC Regents Fall 2004 © UCB

A. PC’s Clk-to-QB. Instruction Memory’s Access TimeC. Register File’s Access TimeD. ALU to Perform a 32-bit OperationE. PC Adder1 adds 4 to PCF. PC Ext sign extends immediateG. PC Adder2 adds imm to Adder1 sumH. Data Memory Access TimeI. Setup Time for PCJ. Setup Time for Register File WriteK. Clock Skew

Peer Instruction: What is critical path for LW?im

m16

32

Clk

busW

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

Rd

Extender

Mux

3216imm16

Mux

Clk

Data InWrEn32 Adr

DataMemory

AL

U

Equal

0

1

0

1

01

=

Adder1

Adder2

PC

Clk

00Mux

4

PC E

xt

InstMemory

Wr

1. A, B, C, D, H, J, K 2. A, B, C, D, H, I, J, K3. A, B, C, D, E, H, I, J, K4. A, B, C, D, E, F, G, H, I, J, K

Page 36: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (36) UC Regents Fall 2004 © UCB

A. PC’s Clk-to-QB. Instruction Memory’s Access TimeC. Register File’s Access TimeD. ALU to Perform a 32-bit OperationE. PC Adder1 adds 4 to PCF. PC Ext sign extends immediateG. PC Adder2 adds imm to Adder1 sumH. Data Memory Access TimeI. Setup Time for PCJ. Setup Time for Register File WriteK. Clock Skew

Peer Instruction: What is critical path for LW?im

m16

32

Clk

busW

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

Rd

Extender

Mux

3216imm16

Mux

Clk

Data InWrEn32 Adr

DataMemory

AL

U

Equal

0

1

0

1

01

=

Adder1

Adder2

PC

Clk

00Mux

4

PC E

xt

InstMemory

Wr

1. A, B, C, D, H, J, K 2. A, B, C, D, H, I, J, K3. A, B, C, D, E, H, I, J, K4. A, B, C, D, E, F, G, H, I, J, K

Page 37: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (37) UC Regents Fall 2004 © UCB

A. PC’s Clk-to-QB. Instruction Memory’s Access TimeC. Register File’s Access TimeD. ALU to Perform a 32-bit OperationE. PC Adder1 adds 4 to PCF. PC Ext sign extends immediateG. PC Adder2 adds imm to Adder1 sumH. Data Memory Access TimeI. Setup Time for PCJ. Setup Time for Register File WriteK. Clock Skew

Peer Instruction: Which has longest critical path?im

m16

32

Clk

busW

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

Rd

Extender

Mux

3216imm16

Mux

Clk

Data InWrEn32 Adr

DataMemory

AL

U

Equal

0

1

0

1

01

=

Adder1

Adder2

PC

Clk

00Mux

4

PC E

xt

InstMemory

Wr

1. ADDU2. BEQ 3. LW4. ORI5. SUBU6. SW

Page 38: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (38) UC Regents Fall 2004 © UCB

A. PC’s Clk-to-QB. Instruction Memory’s Access TimeC. Register File’s Access TimeD. ALU to Perform a 32-bit OperationE. PC Adder1 adds 4 to PCF. PC Ext sign extends immediateG. PC Adder2 adds imm to Adder1 sumH. Data Memory Access TimeI. Setup Time for PCJ. Setup Time for Register File WriteK. Clock Skew

Peer Instruction: Which has longest critical path?im

m16

32

Clk

busW

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

Rd

Extender

Mux

3216imm16

Mux

Clk

Data InWrEn32 Adr

DataMemory

AL

U

Equal

0

1

0

1

01

=

Adder1

Adder2

PC

Clk

00Mux

4

PC E

xt

InstMemory

Wr

1. ADDU2. BEQ 3. LW4. ORI5. SUBU6. SW

Page 39: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (39) UC Regents Fall 2004 © UCB

An Abstract View of the Critical Path°Ideal memory AND register file:

• The CLK input is a factor ONLY during write operation• During read operation, behave as combinational logic:

- Address valid => Output valid after “access time.”Critical Path (Load Operation) =

PC’s Clk-to-Q +Instruction Memory’s Access Time +Register File’s Access Time +ALU to Perform a 32-bit Add +Data Memory Access Time +Setup Time for Register File Write +Clock Skew

Clk

5

Rw Ra Rb32 32-bitRegisters

RdA

LU

Clk

Data In

DataAddress Ideal

DataMemory

Instruction

InstructionAddress

IdealInstruction

Memory

Clk

PC

5Rs

5Rt

16Imm

32

323232

A

B

Nex

t Add

ress

Page 40: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (40) UC Regents Fall 2004 © UCB

An Abstract View of the Implementation

DataOut

Clk

5

Rw Ra Rb32 32-bitRegisters

Rd

AL

U

Clk

Data In

DataAddress Ideal

DataMemory

Instruction

InstructionAddress

IdealInstruction

Memory

Clk

PC

5Rs

5Rt

32

323232

A

B

Nex

t Add

ress

Control

Datapath

Control Signals Conditions

Page 41: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (41) UC Regents Fall 2004 © UCB

Why should you keep a design notebook?

°Keep track of the design decisions and the reasons behind them

• Otherwise, it will be hard to debug and/or refine the design• Write it down so that can remember in long project:

2 weeks ->2 yrs• Others can review notebook to see what happened

°Record insights you have on certain aspect of the design as they come up

°Record of the different design & debug experiments• Memory can fail when very tired

°Industry practice: learn from others mistakes

Page 42: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (42) UC Regents Fall 2004 © UCB

Why do we keep it on-line?°You need to force yourself to take notes!

• Open a window and leave an editor running while you work1) Acts as reminder to take notes2) Makes it easy to take notes

• 1) + 2) => will actually do it

°Take advantage of the window system’s “cut and paste” features

°It is much easier to read your typing than your writing

°Also, paper log books have problems• Limited capacity => end up with many books• May not have right book with you at time vs. networked

screens• Can use computer to search files/index files to find what

looking for

Page 43: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (43) UC Regents Fall 2004 © UCB

How should you do it?°Keep it simple

• DON’T make it so elaborate that you won’t use (fonts, layout, ...)

°Separate the entries by dates• type “date” command in another window and cut&paste

°Start day with problems going to work on today

°Record output of simulation into log with cut&paste; add date

• May help sort out which version of simulation did what

°Record key email with cut&paste

°Record of what works & doesn’t helps team decide what went wrong after you left

°Index: write a one-line summary of what you did at end of each day

Page 44: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (44) UC Regents Fall 2004 © UCB

On-line Notebook Example

°Refer to the handout “Example of On-Line Log Book” on CS 152 home page:

~cs152/handouts/online_notebook_example.html

Page 45: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (45) UC Regents Fall 2004 © UCB

1st page of On-line notebook (Index + Wed. 9/6/95)* Index ==============================================================

Wed Sep 6 00:47:28 PDT 1995 - Created the 32-bit comparator componentThu Sep 7 14:02:21 PDT 1995 - Tested the comparatorMon Sep 11 12:01:45 PDT 1995 - Investigated bug found by Bart in

comp32 and fixed it+ ====================================================================Wed Sep 6 00:47:28 PDT 1995

Goal: Layout the schematic for a 32-bit comparator

I've layed out the schemtatics and made a symbol for the comparator.I named it comp32. The files are

~/wv/proj1/sch/comp32.sch ~/wv/proj1/sch/comp32.sym

Wed Sep 6 02:29:22 PDT 1995- ====================================================================

• Add 1 line index at front of log file at end of each session: date+summary• Start with date, time of day + goal• Make comments during day, summary of work• End with date, time of day (and add 1 line summary at front of file)

Page 46: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (46) UC Regents Fall 2004 © UCB

2nd page of On-line notebook (Thursday 9/7/95)+ ====================================================================Thu Sep 7 14:02:21 PDT 1995

Goal: Test the comparator component

I've written a command file to test comp32. I've placed it in ~/wv/proj1/diagnostics/comp32.cmd.

I ran the command file in viewsim and it looks like the comparator is working fine. I saved the output into a log file called~/wv/proj1/diagnostics/comp32.log

Notified the rest of the group that the comparatoris done.

Thu Sep 7 16:15:32 PDT 1995- ====================================================================

Page 47: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (47) UC Regents Fall 2004 © UCB

3rd page of On-line notebook (Monday 9/11/95)+ ====================================================================Mon Sep 11 12:01:45 PDT 1995

Goal: Investigate bug discovered in comp32 and hopefully fix it

Bart found a bug in my comparator component. He left the followinge-mail.

-------------------From [email protected] Sun Sep 10 01:47:02 1995Received: by wayne.manor (NX5.67e/NX3.0S)

id AA00334; Sun, 10 Sep 95 01:47:01 -0800Date: Wed, 10 Sep 95 01:47:01 -0800From: Bart Simpson <[email protected]>To: [email protected], old_man@gokuraku, hojo@sanctuarySubject: [cs152] bug in comp32Status: R

Hey Bruce,I think there's a bug in your comparator. The comparator seems to think that ffffffff and fffffff7 are equal.

Can you take a look at this?Bart----------------

Page 48: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (48) UC Regents Fall 2004 © UCB

4th page of On-line notebook (9/11/95 contd)

I verified the bug. here's a viewsim of the bug as it appeared..(equal should be 0 instead of 1)

------------------SIM>stepsize 10nsSIM>v a_in A[31:0]SIM>v b_in B[31:0]SIM>w a_in b_in equalSIM>a a_in ffffffff\hSIM>a b_in fffffff7\hSIM>simtime = 10.0ns A_IN=FFFFFFFF\H B_IN=FFFFFFF7\H EQUAL=1 Simulation stopped at 10.0ns.-------------------

Ah. I've discovered the bug. I mislabeled the 4th net inthe comp32 schematic.

I corrected the mistake and re-checked all the otherlabels, just in case.

I re-ran the old diagnostic test file and tested it againstthe bug Bart found. It seems to be working fine. hopefullythere aren’t any more bugs:)

Page 49: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (49) UC Regents Fall 2004 © UCB

5th page of On-line notebook (9/11/95 contd)

On second inspectation of the whole layout, I think I canremove one level of gates in the design and make it go faster.But who cares! the comparator is not in the critical pathright now. the delay through the ALU is dominating the criticalpath. so unless the ALU gets a lot faster, we can live with a less than optimal comparator.

I e-mailed the group that the bug has been fixed

Mon Sep 11 14:03:41 PDT 1995- ====================================================================

• Perhaps later critical path changes; • What was idea to make comparator faster? • Check on-line notebook!

Page 50: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (50) UC Regents Fall 2004 © UCB

Added benefit: cool post-design statisticsSample graph from the Alewife project:

• For the Communications andMemory Management Unit (CMMU)

• These statistics came fromon-line record of bugs

Page 51: CS152 – Computer Architecture and Engineering Lecture 6 ...cs152/fa04/lecnotes/lec3-2.pdf · CS 152 L06 Single Cycle 1 (4) UC Regents Fall 2004 © UCB How to Design a Processor:

CS 152 L06 Single Cycle 1 (51) UC Regents Fall 2004 © UCB

Lecture Summary°5 steps to design a processor

1. Analyze instruction set => datapath requirements2. Select set of datapath components & establish clock methodology3. Assemble datapath meeting the requirements4. Analyze implementation of each instruction to determine setting of control

points that effects the register transfer.5. Assemble the control logic (Next Lecture)

°MIPS makes it easier• Instructions same size; Source registers, immediates always in same place• Operations always on registers/immediates

°Single cycle datapath => CPI=1, CCT => long°On-line Design Notebook

• Open a window and keep an editor running while you work;cut&paste• Former CS 152 students (and TAs) say they use on-line notebook

for programming as well as hardware design; one of most valuable skills

• Refer to the handout as an example