
TDT4255 Computer Design

Review Lecture – First Half

Magnus Jahre

TDT4255 – Computer Design

ABOUT THE EXAM

About exam
• The exam will cover a large part of the curriculum (reading list)
• Exam properties that we seek:
– Comprehensible and unambiguous
– Correct
– Reasonable (e.g. not too easy, not too difficult, not asking about unimportant details but rather trying to focus on principles and understanding, etc.)
– Relevant (same as above)
– Differentiating (NTNU has decided that an 'A' should be an outstanding result, so we need some difficult questions to identify potential A-candidates and to get a reasonable distribution of the students among the possible marks.)
– Unpredictable (We do not think we should give out information, or answer questions, of a kind that lets smart or pushy students find out what the exam will or will not include. We want to influence the students so that they prepare for the exam by trying to maximize their learning of the course material rather than by speculation :-) ).

How to Answer an Exam Question
• Only answer what is asked for
– No points awarded for answers that are beside the point
• Only answer what you are reasonably sure is correct
– Norwegian saying: "It's better to keep your mouth shut and let people think you are stupid than to open your mouth and remove all doubt."
• There is a limited amount of space available to answer the questions
– Prioritize: good priorities indicate good understanding

Example Assignment (1/2)
• Explain the difference between a write-through and a write-back strategy for caches
• Good answer:
– A write-through strategy updates main memory on all cache writes
– A write-back strategy writes back dirty data when the block is evicted from the cache
• Why is this good?
– Answers the question
– Only answers the question
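The two write strategies in the good answer can be sketched as a toy model. This is a minimal sketch, not a real cache: the class `TinyCache`, the helper `count_memory_writes`, the direct-mapped organization with one word per block, and the write-allocate behavior are all assumptions made for illustration. It only counts how many writes reach main memory.

```python
class TinyCache:
    """Hypothetical direct-mapped cache with one word per block.

    Only writes are modeled; the counter tracks writes that reach main memory.
    """

    def __init__(self, num_lines, write_back):
        self.num_lines = num_lines
        self.write_back = write_back
        self.lines = {}          # line index -> (address, dirty flag)
        self.memory_writes = 0

    def write(self, address):
        index = address % self.num_lines
        resident = self.lines.get(index)
        if resident is not None and resident[0] != address:
            self._evict(index)
        if self.write_back:
            # Write-back: only mark the block dirty; memory is updated on eviction.
            self.lines[index] = (address, True)
        else:
            # Write-through: every cache write also updates main memory.
            self.lines[index] = (address, False)
            self.memory_writes += 1

    def _evict(self, index):
        _, dirty = self.lines.pop(index)
        if dirty:
            self.memory_writes += 1


def count_memory_writes(addresses, write_back):
    cache = TinyCache(num_lines=4, write_back=write_back)
    for address in addresses:
        cache.write(address)
    return cache.memory_writes


trace = [0, 0, 0, 4, 4, 1]   # addresses 0 and 4 map to the same line
write_through_count = count_memory_writes(trace, write_back=False)
write_back_count = count_memory_writes(trace, write_back=True)
```

In this trace the write-back cache touches memory only when the dirty block for address 0 is evicted; the blocks for addresses 4 and 1 are still dirty in the cache when the trace ends.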

Example Assignment (2/2)
• Explain the difference between a write-through and a write-back strategy for caches
• Poor answer:
– A write-through strategy updates main memory on all cache writes
– A write-back strategy writes back dirty data when the block is evicted from the cache
– Set associative caches are common in current processors
– Fully associative caches are popular because they give the lowest miss rates
– (the answer continues with any possible irrelevant facts about caches where some are correct and others are wrong or at least imprecise)

Not asked for! Imprecise!

Other Practicalities
• The exam will have no multiple choice
– Trade off: hard to write vs. easy to grade
• MIPS fact sheet will be provided
• I will make last year's exam for TDT4160 available
– Curriculum is very different
– Introductory course: You will get harder questions
– Illustrates my exam style

Chapter 1 Review

Acknowledgement: Slides are adapted from Morgan Kaufmann companion material

Defining Performance
• Which airplane has the best performance?

[Figure: four bar charts comparing the Boeing 777, Boeing 747, BAC/Sud Concorde, and Douglas DC-8-50 by passenger capacity, cruising range (miles), cruising speed (mph), and passengers × mph]

Response Time
• Book definition: Time from issuing a command to its completion
– This is often referred to as the turn-around time
• More common response time definition: Time from issue to first response
• Execution time is the time the processor is busy executing the program
– Turn-around time includes the time the process waits to be executed, execution time does not
– Also: user execution time vs. system execution time

Response Time and Throughput
• Throughput
– Total work done per unit time
• How are response time and throughput affected by
– Replacing the processor with a faster version?
– Adding more processors?

CPI in More Detail
• If different instruction classes take different numbers of cycles

$$\text{Clock Cycles} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \text{Instruction Count}_i\right)$$

• Weighted average CPI

$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}}\right)$$

– The fraction Instruction Count_i / Instruction Count is the relative frequency of class i
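A small worked example may help with the weighted-average CPI. The sketch below assumes a made-up instruction mix; the function names `clock_cycles` and `average_cpi` and the numbers in `mix` are hypothetical and not taken from the slides.

```python
def clock_cycles(classes):
    """Sum of CPI_i * InstructionCount_i over all instruction classes."""
    return sum(cpi * count for cpi, count in classes)


def average_cpi(classes):
    """Total clock cycles divided by total instruction count."""
    total_instructions = sum(count for _, count in classes)
    return clock_cycles(classes) / total_instructions


# Hypothetical mix: (CPI of the class, number of instructions in the class)
mix = [(1, 2), (2, 1), (3, 2)]

total_cycles = clock_cycles(mix)   # 1*2 + 2*1 + 3*2
cpi = average_cpi(mix)             # total cycles / 5 instructions
```

Here the mix needs 10 cycles for 5 instructions, so the average CPI is 2.0, which is the same as weighting each class CPI by its relative frequency (0.4, 0.2, 0.4).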

Appendix D Review

Combinatorial logic
• Combinatorial logic only depends on current inputs
– We don't need a clock!
• There might be inputs that are irrelevant to our circuit
– Don't cares
– Room for optimization

32 Bit ALU
• Exploit the 1 bit ALU abstraction to create a wide ALU
– Called a ripple carry adder
• Ripple carry adders are slow
– Carry propagation through the circuit is the critical path
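The ripple carry idea can be sketched in software by chaining 1-bit full adders. This is only an illustrative model (the function names `full_adder` and `ripple_carry_add` are assumptions); the loop makes the sequential dependency on the carry visible, which is what forms the critical path in hardware.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return total, carry_out


def ripple_carry_add(x, y, width=32):
    """Add two unsigned width-bit values one bit at a time."""
    carry = 0
    result = 0
    for i in range(width):
        # Bit i cannot be computed before the carry from bit i-1 is known.
        sum_bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry
```

For example, `ripple_carry_add(7, 6)` yields the pair (13, 0), and adding 1 to the all-ones 32-bit pattern wraps to 0 with a carry out of 1.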

Carry Lookahead
• Idea: We can use more logic to shorten the critical path of a ripple carry adder
• Each carry bit uses all previous carries and inputs
– We can compute each carry directly by applying the formulas recursively
– But: Logic overhead grows quickly
• Two bit carry lookahead example:

$$c_1 = (b_0 \cdot c_0) + (a_0 \cdot c_0) + (a_0 \cdot b_0)$$

$$c_2 = (b_1 \cdot c_1) + (a_1 \cdot c_1) + (a_1 \cdot b_1)$$

$$c_2 = b_1 \cdot [(b_0 \cdot c_0) + (a_0 \cdot c_0) + (a_0 \cdot b_0)] + a_1 \cdot [(b_0 \cdot c_0) + (a_0 \cdot c_0) + (a_0 \cdot b_0)] + (a_1 \cdot b_1)$$
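As a sanity check on the two-bit lookahead formula, the sketch below compares the rippled carry with a fully expanded sum-of-products for c2 over all 32 input combinations. The function names are hypothetical; the expanded form is just the bracketed expression multiplied out.

```python
from itertools import product


def carry(a, b, c):
    """Carry out of a 1-bit full adder."""
    return (b & c) | (a & c) | (a & b)


def c2_ripple(a0, b0, a1, b1, c0):
    """c2 computed the slow way: c1 first, then c2 from c1."""
    c1 = carry(a0, b0, c0)
    return carry(a1, b1, c1)


def c2_lookahead(a0, b0, a1, b1, c0):
    """c2 as two-level logic that depends only on the inputs and c0."""
    return (
        (b1 & b0 & c0) | (b1 & a0 & c0) | (b1 & a0 & b0)
        | (a1 & b0 & c0) | (a1 & a0 & c0) | (a1 & a0 & b0)
        | (a1 & b1)
    )


all_match = all(
    c2_ripple(*bits) == c2_lookahead(*bits)
    for bits in product((0, 1), repeat=5)
)
```

The expanded form already needs seven product terms for c2, which hints at how quickly the logic overhead grows for wider adders.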

Sequential Systems
• Clocking methodologies
– Edge triggered: State elements are updated on clock transitions
– Level triggered: State elements are updated continuously while the clock is either 1 or 0
– Choose one or the other
– Different methodologies may be appropriate for different production technologies

Register
• Collection of flip-flops or latches that store multi-bit values
• Register files contain multiple registers and access logic

    -- Edge-triggered register: data_out takes the value of data_in_1 on each rising clock edge
    reg: process(clk)
    begin
      if rising_edge(clk) then
        data_out <= data_in_1;
      end if;
    end process reg;

VHDL code is identical to latch/flip-flop except that the signals are vectors and not scalars

Register File Example

[Figure: register file with 2-port read logic and 1-port write logic]

Finite State Machines
• Commonly synchronous
– Changes state on clock tick
• Two types
– Moore: Output depends only on current state
– Mealy: Output depends on current state and inputs

[Figure: state machine diagram, captioned "Moore or Mealy?"]

Almost all electronic systems contain a number of state machines
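To make the two machine types concrete, here is a minimal sketch of a detector for two consecutive 1s, written once in each style. It assumes the usual textbook distinction that a Moore output is a function of the state alone while a Mealy output is a function of the state and the current input; the function names and the state encodings are made up for illustration.

```python
def moore_detect_11(bits):
    """Moore style: three states, output read from the state alone."""
    state = 0  # consecutive 1s seen so far, capped at 2
    outputs = []
    for bit in bits:
        state = min(state + 1, 2) if bit else 0   # next state from state and input
        outputs.append(1 if state == 2 else 0)    # output sampled after the clock tick
    return outputs


def mealy_detect_11(bits):
    """Mealy style: two states, output from state and current input."""
    state = 0  # 1 if the previous input was 1
    outputs = []
    for bit in bits:
        outputs.append(1 if (state == 1 and bit == 1) else 0)
        state = 1 if bit else 0
    return outputs
```

The Mealy version gets by with one state fewer because it can look at the input directly. In this model the Moore output is sampled after the state update, so both functions return the same sequence; in hardware the Moore output would typically appear one clock tick later.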

Chapter 2 Review

Instruction Set Design
• DP1: Simplicity favors regularity
– Regularity makes implementation simpler
– Simplicity enables higher performance at lower cost
• DP2: Smaller is faster
• DP3: Make the common case fast
– Small constants are common
– Immediate operand avoids a load instruction
• DP4: Good design demands good compromises
– Different formats complicate decoding, but allow 32-bit instructions uniformly
– Keep formats as similar as possible

MIPS R-format Instructions

  op       rs       rt       rd       shamt    funct
  6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

• Instruction fields
– op: operation code (opcode)
– rs: first source register number
– rt: second source register number
– rd: destination register number
– shamt: shift amount (00000 for now)
– funct: function code (extends opcode)
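The R-format field layout can be sketched as a small encoder that packs the six fields into one 32-bit word. The function name `encode_r_format` is hypothetical; the example values are the standard MIPS encoding of `add $t0, $s1, $s2` (opcode 0, funct 32, registers 17, 18 and 8).

```python
def encode_r_format(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields (6, 5, 5, 5, 5, 6 bits) into a 32-bit word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


# add $t0, $s1, $s2  ->  rs = $s1 (17), rt = $s2 (18), rd = $t0 (8)
add_word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
```

The resulting word is 0x02324020.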

MIPS I-format Instructions

  op       rs       rt       constant or address
  6 bits   5 bits   5 bits   16 bits

• Immediate arithmetic and load/store instructions
– rt: destination or source register number
– Constant: –2^15 to +2^15 – 1
– Address: offset added to base address in rs
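The 16-bit constant range can be illustrated with a sketch that encodes an I-format word and sign-extends the immediate again. The function names `encode_i_format` and `sign_extend_16` are assumptions for illustration; the example uses `addi $t0, $t0, -1` (opcode 8, register 8).

```python
def encode_i_format(op, rs, rt, constant):
    """Pack I-format fields (6, 5, 5, 16 bits); constant is a signed value."""
    if not -(1 << 15) <= constant <= (1 << 15) - 1:
        raise ValueError("constant does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (constant & 0xFFFF)


def sign_extend_16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


addi_word = encode_i_format(op=8, rs=8, rt=8, constant=-1)
recovered = sign_extend_16(addi_word)
```

The word comes out as 0x2108FFFF, and sign-extending its low 16 bits recovers -1.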

Branch Addressing
• Branch instructions specify
– Opcode, two registers, target address
• Most branch targets are near branch
– Forward or backward

  op       rs       rt       constant or address
  6 bits   5 bits   5 bits   16 bits

• PC-relative addressing
– Target address = PC + offset × 4
– PC already incremented by 4 by this time
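The PC-relative calculation can be sketched as follows. The function name `branch_target` is hypothetical, and the sketch takes the address of the branch instruction itself as `pc`, adding the 4 explicitly to reflect that the PC has already been incremented when the offset is applied.

```python
def branch_target(pc, offset_field):
    """Target of a MIPS branch located at address pc.

    offset_field is the raw 16-bit field from the instruction word.
    """
    offset_field &= 0xFFFF
    # Sign-extend the 16-bit word offset.
    offset = offset_field - 0x10000 if offset_field & 0x8000 else offset_field
    # The offset counts instructions, so multiply by 4 to get bytes.
    return (pc + 4 + offset * 4) & 0xFFFFFFFF
```

For a branch at 0x1000, an offset field of 3 gives 0x1010, and an offset field of 0xFFFF (that is, -1) branches back to 0x1000 itself.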

Jump Addressing
• Jump (j and jal) targets could be anywhere in text segment
– Encode full address in instruction

  op       address
  6 bits   26 bits

• (Pseudo)Direct jump addressing
– Target address = PC31…28 : (address × 4)
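The pseudo-direct concatenation can be sketched with a few bit operations. The function name `jump_target` is hypothetical, and the sketch assumes the upper four bits come from the already incremented PC, passed in as `pc_plus_4`.

```python
def jump_target(pc_plus_4, address_field):
    """Concatenate PC[31..28] with the 26-bit address field times 4."""
    upper = pc_plus_4 & 0xF0000000
    lower = (address_field & 0x03FFFFFF) << 2
    return upper | lower
```

With `pc_plus_4 = 0x1000000C` and an address field of 2, the target is 0x10000008: the jump stays inside the 256 MB region selected by the upper four PC bits.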

Local Data on the Stack
• Local data allocated by callee
– e.g., C automatic variables
• Procedure frame (activation record)
– Used by some compilers to manage stack storage

Memory Layout
• Text: program code
• Static data: global variables
– e.g., static variables in C, constant arrays and strings
– $gp initialized to address allowing ±offsets into this segment
• Dynamic data: heap
– e.g., malloc in C, new in Java
• Stack: automatic storage

Translation and Startup

[Figure: translation and startup flow, annotated "Many compilers produce object modules directly" and "Static linking"]

Chapter 3 Review

Integer Addition
• Example: 7 + 6
• Overflow if result out of range
– Adding +ve and –ve operands, no overflow
– Adding two +ve operands: Overflow if result sign is 1
– Adding two –ve operands: Overflow if result sign is 0
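The overflow rules above can be sketched as a sign-bit check on two's complement bit patterns. The function name `add_with_overflow` and the `width` parameter are assumptions for illustration; operands are passed as unsigned bit patterns of the given width.

```python
def add_with_overflow(a, b, width=32):
    """Add two width-bit two's complement patterns; return (result, overflow)."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    result = (a + b) & mask
    a_negative = bool(a & sign_bit)
    b_negative = bool(b & sign_bit)
    result_negative = bool(result & sign_bit)
    # Overflow only when both operands share a sign and the result's sign differs.
    overflow = (a_negative == b_negative) and (result_negative != a_negative)
    return result, overflow
```

With 32 bits, 7 + 6 gives 13 without overflow. With only 4 bits the same sum produces the pattern 1101, whose sign bit is 1, so overflow is flagged.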

Multiplication
• Start with long-multiplication approach

        1000     multiplicand
      × 1001     multiplier
        ----
        1000
       0000
      0000
     1000
     -------
     1001000     product

• Length of product is the sum of operand lengths

Optimized Multiplier
• Perform steps in parallel: add/shift
• One cycle per partial-product addition
– That's ok, if frequency of multiplications is low
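The add/shift scheme can be sketched in software. This is a minimal model under assumptions: a single product register whose lower half initially holds the multiplier, unsigned operands, and a hypothetical function name `shift_add_multiply`. Each loop iteration stands for one cycle.

```python
def shift_add_multiply(multiplicand, multiplier, width=32):
    """Unsigned shift-add multiplication with a combined product register."""
    mask = (1 << width) - 1
    product = multiplier & mask          # lower half starts as the multiplier
    for _ in range(width):
        if product & 1:
            # Add the multiplicand into the upper half of the product register.
            product += (multiplicand & mask) << width
        product >>= 1                    # shift product (and multiplier) right
    return product
```

Using the 4-bit operands from the long-multiplication example, `shift_add_multiply(0b1000, 0b1001, width=4)` gives 0b1001000, i.e. 8 × 9 = 72.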

Division
• Dividend/Divisor = Quotient
• Check for 0 divisor
• Long division approach
– If divisor ≤ dividend bits: 1 bit in quotient, subtract
– Otherwise: 0 bit in quotient, bring down next dividend bit
• Restoring division
– Do the subtract, and if remainder goes < 0, add divisor back
• Signed division
– Divide using absolute values
– Adjust sign of quotient and remainder as required

                  1001      quotient
divisor 1000 ) 1001010      dividend
              -1000
                  10
                  101
                  1010
                 -1000
                    10      remainder

• n-bit operands yield n-bit quotient and remainder
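Restoring division can be sketched as a bit-serial loop: bring down the next dividend bit, try the subtraction, and add the divisor back if the remainder went negative. The function name `restoring_divide` is hypothetical and the sketch handles unsigned operands only.

```python
def restoring_divide(dividend, divisor, width=32):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # Trial subtraction.
        remainder -= divisor
        if remainder < 0:
            remainder += divisor            # restore, quotient bit is 0
            quotient <<= 1
        else:
            quotient = (quotient << 1) | 1  # keep the result, quotient bit is 1
    return quotient, remainder
```

For the example on the slide, `restoring_divide(0b1001010, 0b1000, width=7)` returns quotient 0b1001 and remainder 0b10 (74 / 8 = 9 remainder 2).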

Representable Floating Point Numbers

IEEE Floating-Point Format

  S     Exponent            Fraction
        single: 8 bits      single: 23 bits
        double: 11 bits     double: 52 bits

$$x = (-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$$

• S: sign bit (0 non-negative, 1 negative)
• Normalize significand: 1.0 ≤ |significand| < 2.0
– Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
– Significand is Fraction with the "1." restored
• Exponent: excess representation: actual exponent + Bias
– Ensures exponent is unsigned
– Single: Bias = 127; Double: Bias = 1023
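The field layout and the value formula can be checked with a short sketch that splits a single-precision number into S, Exponent and Fraction and then rebuilds it. The function names are hypothetical, and the reconstruction handles normalized numbers only (no zeros, denormals, infinities or NaNs).

```python
import struct


def decode_single(value):
    """Return (sign, biased exponent, fraction) of a single-precision float."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


def reconstruct_single(sign, exponent, fraction):
    """Apply (-1)^S * (1 + Fraction) * 2^(Exponent - Bias), Bias = 127."""
    significand = 1 + fraction / 2**23   # restore the hidden "1."
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


fields = decode_single(-0.75)
rebuilt = reconstruct_single(*fields)
```

For -0.75 = -1.1 (binary) × 2^-1 the fields are sign 1, biased exponent 126 and fraction 0x400000, and the formula rebuilds -0.75 exactly.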

Chapter 4 Review

Single Cycle Datapath

R-Type Instruction

Load Instruction

Branch-on-Equal Instruction

Datapath With Jumps Added

Multi-cycle Datapath (1/2)
• Idea: Add registers at strategic points in the datapath
• Activate only needed functional units with control signals

Multi-cycle Datapath (2/2)
• Area savings possible (but not necessary)
– Only one memory
– Only one ALU