EECE476 Lectures 4,5 –ALUs, Add, Multiply, and Floating- Point Chapter 3: Computer Arithmetic The University of British Columbia EECE 476 © 2005 Guy Lemieux

Post on 19-Dec-2015


EECE476

Lectures 4, 5 – ALUs, Add, Multiply, and Floating-Point

Chapter 3: Computer Arithmetic

The University of British Columbia, EECE 476, © 2005 Guy Lemieux


Announcements

• Assignment 1
  – First part posted on web.
    • Do it as practice for tomorrow’s tutorial!
  – Second part coming soon.
    • Do it as practice for the quiz next week!
• Quiz Dates
  – Quiz 1: Thurs, Sept 22nd, based on Assignment 1
  – Quiz 2, etc.: TBD


Reading

• Chapter 3
  – 3.2 signed numbers
  – 3.3 addition and subtraction
  – 3.4 multiplication
  – 3.5 division
  – 3.6 floating-point (read lightly)


Computer Arithmetic

• Objective 1
  – Discover the “logic complexity” of the different types of arithmetic done by a CPU
  – The complexity will have an impact on performance later!
• Objective 2
  – Learn how to build an ALU for your project


The Conclusions

• Add is easy
  – Fast adding is not too bad either…
  – Subtraction: addition’s tricky pal
• Multiply is hard…
  – But you can add many times
• Divide is really hard…
  – Divide and be conquered!
• Anything floating-point is impossible!
  – Well, not quite, but you will get the idea…


Computer Architecture?

• Recall
  – Computer Architecture = ISA + Machine Organization
  – Machine Organization = implementation details!
• Begin to consider coupling
  – ISA ↔ Machine Organization
• Heart of computer: arithmetic calculations
  – Done by ALU: Arithmetic Logic Unit
• Some parts not done by ALU
  – Decision-making, iteration, memory/state…
  – All of these are important as well


MIPS Arithmetic Instructions

• Let’s design an ALU that MIPS can use!
• Many different operations
  – Arithmetic
    • Add, AddU, Sub, SubU
    • AddI, AddIU
    • Mult, MultU, Div, DivU
  – Logical
    • And, Or, Xor, Nor
    • AndI, OrI, XorI
  – Logical/Arithmetic
    • SLT, SLTU
    • SLTI, SLTIU
  – Shifting (Left/Right & Logical/Arithmetic & Const/Variable)
    • SLL, SRL, SRA, SLLV, SRLV, SRAV


MIPS ALU Design

• First: simplify!
  – Throw out “hard” operations
    • Mult, Div
  – Extract & group basic operations
    • Add, Sub
    • And, Or, Nor, Xor
    • SLT
    • Shifting (is this hard?)


MIPS ALU Design

• Second: simplify!
  – Identify common optimizations
    • Sub = variation of Add
    • Nor = variation of Or (why Nor?)
  – Some other CPUs have even more operations
    • Bit set, Bit test, Bit clear, etc.


ALU Design 1

• Easy way…
  – Try to be more creative!

[Figure: ALU block with labels F, Instruction/operation, +/–, and *]


ALU Design 2

• Start with single bit operations
  – All operations share same 2 inputs
  – Small optimizations may be possible
    • E.g., Or and Nor
    • E.g., Add and And (see problem set)
    • Generally, these aren’t too helpful

[Figure: single-bit logic with inputs A and B, control labels Nor and Operation, and output F]


ALU Design 3

• Build up to larger multi-bit operations
  – Bigger & better optimizations are possible
    • E.g., Add and Sub
    • E.g., SLT


Add/AND/OR for ALU: Bit-based

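[Figure, “One Bit”: a one-bit ALU with inputs a, b, and CarryIn; Operation selects among inputs 0, 1, 2 to produce Result; CarryOut is also produced.]

[Figure, “Multiple Bits”: 32 one-bit ALUs (ALU0, ALU1, ALU2, …, ALU31) with inputs a0..a31 and b0..b31 and outputs Result0..Result31, sharing Operation and linked through CarryIn/CarryOut.]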
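The bit-sliced structure on this slide can be sketched in software. This is only an illustration, not the course's reference design: the function names are made up, and the Operation encoding (0 = AND, 1 = OR, 2 = ADD) is an assumption based on the 0/1/2 labels in the figure.

```python
def alu_1bit(a, b, carry_in, operation):
    """One-bit ALU slice. Assumed encoding: 0 = AND, 1 = OR, 2 = ADD."""
    and_out = a & b
    or_out = a | b
    sum_out = a ^ b ^ carry_in                      # full-adder sum
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    result = (and_out, or_out, sum_out)[operation]  # the multiplexer
    return result, carry_out


def ripple_alu(a, b, operation, carry_in=0, width=32):
    """Chain `width` one-bit slices; each CarryOut feeds the next CarryIn."""
    result = 0
    carry = carry_in
    for i in range(width):
        bit, carry = alu_1bit((a >> i) & 1, (b >> i) & 1, carry, operation)
        result |= bit << i
    return result, carry
```

For example, `ripple_alu(13, 11, 2)` adds the two values one bit at a time, which is why this style of adder takes O(n) gate delays.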


Subtract for ALU

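• A – B = A + (~B + 1), where ~B is the bitwise complement of B

[Figure: the one-bit ALU extended with a Binvert control that selects b or its complement (inputs 0, 1); Operation selects among inputs 0, 1, 2 to produce Result; CarryIn and CarryOut as before.]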
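A minimal sketch of the identity above, assuming a 32-bit datapath and assuming that the +1 is supplied by setting the least-significant CarryIn to 1 whenever Binvert is 1. The name `add_sub` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_sub(a, b, binvert):
    """Return (result, carry_out) of a 32-bit add (binvert=0) or subtract (binvert=1)."""
    b_in = (~b & MASK32) if binvert else (b & MASK32)  # Binvert mux
    total = (a & MASK32) + b_in + binvert              # CarryIn = Binvert gives the +1
    return total & MASK32, total >> 32
```

With `binvert=1`, `add_sub(7, 5, 1)` computes 7 + ~5 + 1, so the same adder hardware serves both operations.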


Fast Adders

• We will assume fast adders are available
  – Slow O(n) vs. Fast O(log n)
  – E.g., carry-lookahead, carry-skip, carry-select
• You should know “fast adders” already, but…
  – Not a part of this course
  – Fast adders are NOT on assignments, tests or exam
  – Do NOT use a fast adder in your project
    • FPGAs have their own “fast carry chain” to do adds quickly, and you will merely confuse the tool


Set Less Than

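• One-bit ALU blocks

[Figure, “All other bits”: one-bit ALU with inputs a, b, CarryIn, and Less; Binvert selects b or its complement (inputs 0, 1); Operation selects among inputs 0 to 3 to produce Result; CarryOut is also produced.]

[Figure, “Most-significant bit”: the same block, plus a Set output and an overflow-detection unit producing Overflow.]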


Set Less Than

• Stitching the one-bit ALU blocks together
• Notice ‘Set’ output is sign bit of A–B
  – Used as ‘Less’ input on LSB
  – Other ‘Less’ inputs forced to 0

[Figure: 32-bit ALU built from ALU0..ALU31 with inputs a0..a31 and b0..b31, shared Binvert and Operation, CarryIn/CarryOut linking the stages, ‘Less’ inputs of ALU1..ALU31 tied to 0, ‘Set’ from ALU31 routed to ‘Less’ of ALU0, and outputs Result0..Result31 and Overflow.]
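The Set signal can be sketched as follows, assuming 32-bit two's-complement operands. `slt_simple` follows the slide literally (sign bit of A–B). `slt_corrected` additionally XORs in the overflow flag; that correction is an assumption added here to show why the figure also has an Overflow output, not something the slide states. All names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def _sign_and_overflow(a, b):
    """Compute A - B as A + ~B + 1 and return (sign bit, overflow flag)."""
    a &= MASK32
    b &= MASK32
    diff = (a + (~b & MASK32) + 1) & MASK32
    sa, sb, sd = a >> 31, b >> 31, diff >> 31
    # Subtraction overflows when the operands differ in sign and the
    # result's sign differs from A's sign.
    overflow = (sa ^ sb) & (sa ^ sd)
    return sd, overflow


def slt_simple(a, b):
    """'Set' as described on the slide: the sign bit of A - B."""
    return _sign_and_overflow(a, b)[0]


def slt_corrected(a, b):
    """Sign bit XOR overflow: also right when A - B overflows (assumption)."""
    sign, overflow = _sign_and_overflow(a, b)
    return sign ^ overflow
```

The two versions agree whenever the subtraction does not overflow; they differ for cases such as the most negative integer compared with 1.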


Testing Result for Zero

[Figure: the same 32-bit ALU (ALU0..ALU31 with Binvert, Operation, Less, Set, CarryIn/CarryOut, and Overflow) with an added Zero output computed from Result0..Result31.]


BIG PICTURE

• This course is about computer architecture

• Why care about ALU design details?
  – Our goal is performance
  – Some ALU designs may be faster or slower
• You must understand the impact they have on
  – Clock frequency (cycle time)
  – Instruction set design
  – More advanced things (e.g., impact of multiple ALUs)
  – Ultimately, performance!


Multiply

Shift, Add, Shift, Add, Shift, Add…


Multiplication - Decimal

• More complex than addition
  – Multiple additions (and shifts)
  – More gates/area, slower
• Gradeschool algorithm:

    Multiplicand M      13
    Multiplier   Q    x 11
                        13    <- 13 x 1
                       13     <- 13 x 10
    Product      P     143


Multiplication - Binary

• Same algorithm, different digits

    Multiplicand M        1101  (13)
    Multiplier   Q      x 1011  (11)
                          1101  <- 1  Q0  Partial Product PP0
                         1101   <- 1  Q1  Partial Product PP1
                        0000    <- 0  Q2  Partial Product PP2
                       1101     <- 1  Q3  Partial Product PP3
                      10001111  (143) Product P

• M bits x N bits => M+N bit product
• Binary makes it easy:
  – Bit Qi is zero => PPi is 0
  – Bit Qi is one => PPi is M (shifted i times left)
  – Product is sum of PPs
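The three rules above translate almost directly into code. This is a small sketch for unsigned operands; the function name and the default width of 4 bits (chosen to match the worked example) are assumptions.

```python
def partial_products(m, q, n=4):
    """Return (list of PPi, product) for an unsigned n-bit multiplier q."""
    pps = []
    for i in range(n):
        qi = (q >> i) & 1
        pps.append(m << i if qi else 0)   # PPi is M shifted left i times, or 0
    return pps, sum(pps)                  # product is the sum of the PPs
```

Running it on the slide's example, `partial_products(0b1101, 0b1011)` yields the four partial products 13, 26, 0, and 104, which sum to 143.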


Multiplication – Hardware V0a

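• Array multiplier
• Stage i accumulates PPi (0 or M shifted i) depending on Qi
• Answer P comes out at bottom
• Slow! Big!

[Figure: 4-bit array multiplier; four rows each take M3..M0, controlled by Q0, Q1, Q2, Q3 in turn; the top row starts from 0 0 0 0; product bits P7..P0 come out at the bottom.]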


Multiplication – Hardware V0b

• At each stage, shift M left (x 2)
• Next bit of Q determines whether to add in shifted multiplicand
• Accumulate 2n-bit partial product at each stage
• Each stage identical: need only 1 stage in hardware (use multiple cycles)

[Figure: array of stages with inputs M3..M0 and Q0..Q3, zero fill, and product bits P7..P0.]


Multiplication – Hardware V1

• M: 64b shift register, Q: 32b shift register, P: 64b register
• Initially, ensure high bits of M are zero (M63..M32 = 0)

[Figure: datapath with M: Multiplicand (64 bits, Shift Left), a 64-bit ALU (Add), P: Product (64 bits, Write), Q: Multiplier (32 bits, Shift Right), and a Control block driven by Q0.]

Multiplier = datapath + control


Multiplication – Hardware V1

Notes
• 1 clock cycle per bit (32 total)
• 0’s are left-shifted into M
  – Lower bits of P never change once formed
• Half of bits in M are always zero
  – 64 bit ALU is wasted

Observations lead to refinement:
• Right-shift P instead of left-shifting M


Multiplication Algorithm

• Russian Peasant Algorithm

    PP = M
    P = 0
    while( Q != 0 ) {
        if( Q is odd )      // i.e., if bit 0 of Q is ‘1’
            P = P + PP      // accumulate partial product (PP) in P
        PP = PP * 2         // shift PP left 1 position
        Q = Q / 2           // shift Q right 1 position
    }

• Compare this to the hardware just presented!
  – Each loop iteration takes one clock cycle
  – How many cycles are required?
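A runnable version of the pseudocode, written as a sketch for unsigned Q only. It also counts loop iterations to help with the question about cycles; the function name and the cycle counter are additions for illustration.

```python
def russian_peasant(m, q):
    """Multiply m by an unsigned q; return (product, loop iterations)."""
    if q < 0:
        raise ValueError("this sketch assumes an unsigned multiplier")
    pp, p, cycles = m, 0, 0
    while q != 0:
        if q & 1:        # bit 0 of Q is 1
            p += pp      # accumulate the partial product
        pp <<= 1         # shift PP left 1 position
        q >>= 1          # shift Q right 1 position
        cycles += 1
    return p, cycles
```

The software loop stops as soon as Q reaches zero, so its iteration count depends on the position of Q's highest set bit. The V1 hardware, by contrast, is described as taking one cycle per bit, 32 in total.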


Multiplication – Hardware V2

• M: 32b register, Q: 32b shift register, P: 64b shift register
• Initially, P=0. Only high bits of P (63..32) affected by a write.

[Figure: datapath with M: Multiplicand (32 bits), a 32-bit ALU (Add), P: Product (64 bits, Shift Right, Write), Q: Multiplier (32 bits, Shift Right), and a Control block driven by Q0.]


Multiplication – Hardware V2

• What’s really going on?

[Figure: 4-bit array view of V2; rows controlled by Q0..Q3 each take M3..M0, starting from 0 0 0 0, and produce product bits P7..P0.]


Multiplication – Hardware V2

Notes
• 1 clock cycle per bit (32 total)
• Lower 32 bits of P are initially unused
  – Holds zero, but unused
  – Each cycle, 1 fewer unused bit
• 0’s are right-shifted into Q
  – Initially: 32 bits used in Q
  – Each step: 1 fewer bit needed in Q
  – At end: Q is destroyed

Observations lead to refinement:
• Use lower 32 bits of P to hold Q


Multiplication – Hardware V3

• M: 32b register, P: 64b shift register (lower half represents Q)
• Initially, P=Q. Only high bits of P (63..32) are changed on write.

[Figure: datapath with M: Multiplicand (32 bits), a 32-bit ALU (Add), P: Product (64 bits, Shift Right, Write) whose lower half holds Q: Multiplier, and a Control block driven by Q0.]


Multiplication – Hardware V3

Notes
• P has two halves: high, low

MIPS multiply instruction MultU
• 32 regular MIPS registers
• 2 special MIPS registers: HI, LO
  – Why special? Need to right-shift contents
• HI, LO store results of MultU
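The V3 datapath can be simulated in a few lines. This is a sketch under stated assumptions: operands are unsigned, the carry out of the 32-bit add is kept and shifted into the top of P (modelled here by letting the Python integer briefly hold one extra bit), and the names `multiply_v3` and `multu` are hypothetical.

```python
def multiply_v3(m, q, n=32):
    """Unsigned n-bit multiply with Q stored in the lower half of P."""
    mask = (1 << n) - 1
    m &= mask
    p = q & mask                 # initially P = Q (upper half is zero)
    for _ in range(n):           # one clock cycle per bit
        if p & 1:                # Q0 is the current low bit of P
            p += m << n          # add M into the upper half only
        p >>= 1                  # shift P right; the adder's carry moves in at the top
    return p


def multu(rs, rt):
    """Split the 64-bit product into (HI, LO), as MultU is described above."""
    p = multiply_v3(rs, rt)
    return p >> 32, p & 0xFFFFFFFF
```

After 32 iterations every bit of Q has been shifted out of the lower half and replaced by low-order product bits, which is why no separate Q register is needed.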


Multiplication – Signed Numbers

• Gradeschool algorithm assumes unsigned numbers

    Multiplicand M        1101  (13)
    Multiplier   Q      x 1011  (11)
                          1101  <- 1  Q0  Partial Product 0
                         1101   <- 1  Q1  Partial Product 1
                        0000    <- 0  Q2  Partial Product 2
                       1101     <- 1  Q3  Partial Product 3
                      10001111  (143)

• Signed numbers?
  – Example above reads (–3) * (–5) = (–113), clearly wrong!
  – Requires some adjustments


Multiplication – Signed Numbers

Two Cases For Signed Multiply: P = M*Q
• Case A: M signed, Q unsigned or Q >= 0
  – Add using sign-extension of PP

    Multiplicand M        1101  (–3)
    Multiplier   Q      x 1011  (11)
                      11111101  <- 1  Q0  Partial Product 0
                      1111101   <- 1  Q1  Partial Product 1
                      000000    <- 0  Q2  Partial Product 2
                      11101     <- 1  Q3  Partial Product 3
                      11011111  (–33)


Multiplication – Signed Numbers

Two Cases For Signed Multiply: P = M*Q
• Case B: M signed, Q signed and Q < 0
  – One method:
    • Note that P = M*Q = (–M)*(–Q) = (~M+1)*(~Q+1), where ~ is the bitwise complement
    • Now (~Q+1) is positive, follow Case A
    • How to do this in hardware?
      – Use sign bits to modify M and Q, two extra adds for +1’s
  – Alternate method: Booth encoding
    • Look it up!
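Both cases can be sketched together for n-bit two's-complement bit patterns. The helper names are hypothetical, and the sketch makes one simplifying assumption worth flagging: in Case B it negates M within n bits, so it is not valid when M is the most negative n-bit value (whose negation does not fit in n bits).

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement number."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def multiply_case_a(m, q, n=4):
    """Case A: M is a signed n-bit pattern, Q is unsigned. Returns a 2n-bit pattern."""
    mask2n = (1 << (2 * n)) - 1
    m_ext = to_signed(m, n) & mask2n         # sign-extend M to 2n bits
    p = 0
    for i in range(n):
        if (q >> i) & 1:
            p = (p + (m_ext << i)) & mask2n  # add the sign-extended PPi
    return p


def multiply_signed(m, q, n=4):
    """Case B wrapper: if Q < 0, use (~M+1)*(~Q+1), then follow Case A."""
    mask = (1 << n) - 1
    if (q >> (n - 1)) & 1:                   # sign bit of Q is set
        m = (~m + 1) & mask
        q = (~q + 1) & mask
    return multiply_case_a(m, q, n)
```

With the slide's operands, `multiply_case_a(0b1101, 0b1011)` gives the pattern 11011111 (–33), and `multiply_signed(0b1101, 0b1011)` treats 1011 as –5 and gives 15.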


Divide?

Forget it!

Basically, do the long division thing over multiple clock cycles:

1) Subtract divisor
2) If >= 0, put “1” in answer, do next bit
3) If < 0, put “0” in answer, add divisor back, do next bit
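The three steps above correspond to what is usually called restoring division. Below is a minimal sketch for unsigned operands; the function name, the explicit step that brings down the next dividend bit, and the 32-bit default width are assumptions made for the illustration.

```python
def restoring_divide(dividend, divisor, n=32):
    """Unsigned long division, one quotient bit per iteration."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder, quotient = 0, 0
    for i in range(n - 1, -1, -1):
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor                 # 1) subtract divisor
        if remainder >= 0:
            quotient = (quotient << 1) | 1   # 2) put 1 in the answer
        else:
            remainder += divisor             # 3) add divisor back...
            quotient <<= 1                   #    ...and put 0 in the answer
    return quotient, remainder
```

Each loop iteration would take one clock cycle in hardware, so an n-bit divide needs n cycles in this scheme.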


Floating-Point


Integers and Beyond

• Integers perfectly accurate, no error
  – 32-bit integer: -2,147,483,648 to 2,147,483,647
  – Integers “overflow” or wrap on +1 from 2,147,483,647 to -2,147,483,648
• What about numbers with non-integral parts?
  – Large range in values, possibly large number of significant digits…
  – Rationals
    • 0.5 => can represent as ½
    • 1/3 => 0.33333333333333333333333…
    • 63/127 => 0.4960629921259842519685039370…
  – Irrationals
    • sqrt(2) = 1.41421356237309504880168872420…
    • Transcendentals: pi = 3.14159265358…, e = 2.71828182…
  – Scientific
    • NA = 6.022 x 10^23 Avogadro’s number (atoms in one mole)
    • G = 6.67259 x 10^-11 gravitational constant (F = -GMm/r^2)
    • c = 2.99792458 x 10^8 speed of light (m/s)


Floating Point Numbers

• How to represent non-integral numbers in binary?
  – Many possible ways
    • e.g., store (numerator, denominator) => doesn’t work for irrationals
  – All ways have limitations
    • Cannot represent all real numbers: infinite number of them, finite number of bits!
  – Need a standard on how to interpret the bits
    • e.g., two’s complement for signed integers
  – Benefits of a standard:
    • Software portability: same answer on any machine
    • Data portability: binary data can be sent directly, no conversions
    • Numerical environment: defines level of mathematical precision, allows research into error analysis, avoids future problems


Floating Point Numbers: IEEE754

• A floating-point standard IEEE754
  – Standard published in 1985
    • Started in 1977
    • Primarily work of William Kahan (UofT student)
  – Based largely on development of Intel 8087
    • A floating-point processor designed to work with the 8086
  – Intel’s chip was a model to follow
    • 8087 first commercial product to implement IEEE 754
    • Other companies implemented IEEE 754, looked at Intel’s chip


Binary Representation of Fractional Numbers

• Example

    101.011 = 1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 + 1*2^-2 + 1*2^-3
            = 4 + 0 + 1 + 0 + 1/4 + 1/8
            = 5.375
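The positional expansion in this example can be checked with a short helper. The function name is hypothetical, and the input is assumed to be a string of 0s and 1s with at most one binary point.

```python
def binary_fraction_to_decimal(text):
    """Evaluate a binary string such as '101.011' digit by digit."""
    int_part, _, frac_part = text.partition(".")
    value = 0.0
    for i, bit in enumerate(reversed(int_part)):
        value += int(bit) * 2 ** i        # weights 2^0, 2^1, 2^2, ...
    for i, bit in enumerate(frac_part, start=1):
        value += int(bit) * 2 ** -i       # weights 2^-1, 2^-2, 2^-3, ...
    return value
```

Because every weight is a power of two, short binary fractions like this one are represented exactly in floating point, so the comparison with 5.375 is exact.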


Binary Representation of Floating-Point Numbers

• Recall scientific notation:

    6.022 x 10^23

  – 6.022 is the normalized significant part
  – 10 is the base or radix
  – 23 is the exponent

• Can do the same in binary:

    1.011 x 2^3 = 1.011 x 8 = 1011 (binary) = 11 (base 10)

• Negative numbers?
  – Need to remember the sign of the significant part

• Generally:

    (–1)^S x M x b^e

  Where:
  – S is sign (0 or 1)
  – M is significand
  – b is base/radix
  – e is exponent


IEEE 754 Binary Single-Precision Floating-Point Representation

    (–1)^S x M x b^e

    1.011 x 2^3    S = 0, M = 1.011, b = 2, e = 3

Encoding into bits:
  – Assume b=2 (binary!), no need to store/remember
  – S, one bit: 0
  – M, 24 bits: 1011 0000 0000 0000 0000 0000
    • If normalized, first (leftmost) digit of M is always a ‘1’, never a ‘0’
    • Don’t store the leading ‘1’, instead define M=1.F and store F
  – F, 23 bits: 011 0000 0000 0000 0000 0000
  – Convert e=3 into binary (e may be negative!):
    • Use biased notation, called Excess-N
    • Excess 127 used here: Define E = e+127 = 130
  – E, 8 bits: 1000 0010
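The encoding worked out above can be checked against a real single-precision value using Python's standard `struct` module. This is a sketch; the helper names are hypothetical, and `fields_to_value` handles normalized numbers only.

```python
import struct


def float_to_fields(x):
    """Pack x as an IEEE 754 single and return its (S, E, F) fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31               # 1 sign bit
    e = (bits >> 23) & 0xFF      # 8 exponent bits, excess 127
    f = bits & 0x7FFFFF          # 23 fraction bits (hidden leading 1 not stored)
    return s, e, f


def fields_to_value(s, e, f):
    """Rebuild (-1)^S x 1.F x 2^(E-127) for a normalized number."""
    return (-1) ** s * (1 + f / 2 ** 23) * 2.0 ** (e - 127)
```

For 1.011 x 2^3 = 11.0, `float_to_fields(11.0)` returns S = 0, E = 130, and F = 011 followed by twenty zeros, matching the slide.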


IEEE 754 Binary Floating-Point Representation

Representation of floating point numbers in IEEE 754 standard:

Single precision, 32 bits total

    Field    S (sign)   E (exponent)   F (fraction)
    Bits     1          8              23

  – Exponent: excess 127 binary integer
  – Significand: normalized binary significand w/ hidden integer bit: M = 1.F

Double precision, 64 bits total

    Field    S (sign)   E (exponent)   F (fraction)
    Bits     1          11             52

  – Exponent: excess 1023 binary integer


IEEE 754 Precision

• Single precision
  – Enough for 9 decimal digits of accuracy
• Double precision
  – Enough for 17 decimal digits of accuracy
• Storing floating-point numbers to disk? Two options:
  – A: write binary value (32 bits or 64 bits)
    • IEEE 754 standard allows us to interchange these values!
  – B: write value as decimal digits, e.g., in ASCII
    • Need to write 9 (or 17) decimal digits
    • Need to write sign, exponent as well
    • Reading back in: convert to binary, get same binary value as before
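Option B can be tried out directly. The sketch below writes a value with a chosen number of significant decimal digits and reads it back; the helper names are hypothetical, and single precision is emulated by round-tripping through `struct` since Python floats are doubles.

```python
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def roundtrip_single(x, digits=9):
    """Write a single with `digits` significant digits and read it back."""
    x = to_single(x)
    text = f"{x:.{digits - 1}e}"          # e.g. '1.00000001e-01'
    return to_single(float(text)) == x


def roundtrip_double(x, digits=17):
    """Same check for a double."""
    text = f"{x:.{digits - 1}e}"
    return float(text) == x
```

With too few digits the round trip can fail: `roundtrip_double(0.1 + 0.2, 16)` is false because 16 digits print that value as 0.3, which is a different double.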


IEEE 754 Accuracy

• Not all real values can be represented
  – Inf. # of values between ½ and ¼
  – Inf. # of values between ½ and 3/8
  – Inf. # of values between ½ and 7/16, etc.
• All floating-point numbers are approximations
  – Calculations with approximations introduce errors
  – Reduce size of errors by proper rounding
  – 754: keep extra bits of precision during calculations for rounding
  – Cannot solve all problems: algorithm numerical stability a must!
  – You get same problems on every machine using IEEE754


IEEE 754 Range

• See /usr/include/limits.h on a Unix system
• Single precision:
  – Minimum (smallest positive normalized value): 1.175494351E-38 (FLT_MIN)
  – Maximum: 3.402823466E+38 (FLT_MAX)
• Double precision:
  – Minimum (smallest positive normalized value): 2.2250738585072014E-308 (DBL_MIN)
  – Maximum: 1.7976931348623157E+308 (DBL_MAX)


IEEE 754 Range

• What happens on overflow?
  – Depends, 754 standard defines some special cases
  – Normally, you get value called Infinity
• Tiny numbers? (smaller than smallest normal)
  – Goes to zero? Called underflow
    • This is rather drastic
  – 754 standard defines denormalized numbers
    • Underflow occurs gradually…
    • Underflow/denormals hard to design hardware
      – Not all chips support it
    • Often use software interrupts to handle denormals
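Gradual underflow can be observed from Python, whose `float` is an IEEE 754 double on common platforms (an assumption of this sketch). `sys.float_info.min` and `sys.float_info.max` correspond to DBL_MIN and DBL_MAX; the function name is hypothetical.

```python
import sys

SMALLEST_NORMAL = sys.float_info.min   # corresponds to DBL_MIN
LARGEST = sys.float_info.max           # corresponds to DBL_MAX


def halvings_until_zero(x=SMALLEST_NORMAL):
    """Count how many times x can be halved before it becomes exactly 0.0."""
    steps = 0
    while x > 0.0:
        x /= 2     # below the smallest normal, precision is lost one bit at a time
        steps += 1
    return steps
```

Without denormals the very first halving would already give zero. With them, the value walks down through the 52 fraction bits before it finally underflows.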


IEEE 754 Special Cases

• Infinity, -Infinity
  – Caused by overflow
  – Caused by 1/0 (note: no error produced)
• +0, -0
  – This is claimed to be useful!
• NaN
  – “Not a Number”
  – 0/0
  – Infinity/Infinity, 0*Infinity, Infinity–Infinity, etc.
  – Sqrt(-number)
  – Infectious: NaN + number = NaN, NaN x number = NaN, etc.
• Comparisons (<, >, =, etc.) with Infinity? NaN?
  – Cases all defined by the standard


IEEE 754 Hardware

1. Compare exponents

2. Shift smaller number right

3. Add

4. Normalize

5. Round

[Figure: floating-point adder datapath; two operands, each with Sign, Exponent, and Fraction fields, feed a Big ALU; the result has Sign, Exponent, and Fraction fields.]


Shifters

• Left as an exercise...


BIG PICTURE

• This course is about computer architecture

• Why care about ALU design details?
  – Our goal is performance
  – Some ALU designs may be faster or slower
• You must understand the impact they have on
  – Clock frequency (cycle time)
  – Instruction set design
  – More advanced things (e.g., impact of multiple ALUs)
  – Ultimately, performance!