Page 1: COMPUTER ARITHMETIC Jehan-François Pâris jparis@uh.edu

COMPUTER ARITHMETIC

Jehan-François Pâris
jparis@uh.edu

Chapter Organization

• Representing negative numbers
• Integer addition and subtraction
• Integer multiplication and division
• Floating point operations
• Examples of implementation
  – IBM 360, RISC, x86


A warning

• Binary addition, subtraction, multiplication and division are very easy


ADDITION AND SUBTRACTION

General concept

• Decimal addition

  (carry)   1
            19
          +  7
          ----
            26

• Binary addition

  (carry)    111
            10011
          +   111
          -------
            11010

• 16 + 8 + 2 = 26

Realization

• Simplest solution is a battery of full adders

[Figure: four chained full adders with inputs x3 y3, x2 y2, x1 y1, x0 y0 and outputs o, s3, s2, s1, s0]

Observations

• Adder adds two four-bit values
• Output o indicates if there is an overflow
  – A result that cannot be represented using 4 bits
  – Happens when x + y > 15
• Operation is slowed down by carry propagation
  – Faster solutions exist (not discussed here)
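The battery of full adders can be mimicked in software. The sketch below is only an illustration: the function names `full_adder` and `ripple_add` are made up for this example, and the loop stands in for the carry rippling from one adder to the next.

```python
def full_adder(x, y, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    total = x + y + carry_in
    return total & 1, total >> 1


def ripple_add(x, y, n_bits=4):
    """Add two unsigned n-bit values one bit at a time, LSB first."""
    carry = 0
    result = 0
    for i in range(n_bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry  # the final carry plays the role of output o


print(ripple_add(0b1011, 0b0011))  # (14, 0)
print(ripple_add(0b1011, 0b0111))  # (2, 1): 11 + 7 = 18 does not fit in 4 bits
```

Each loop iteration must wait for the carry of the previous one, which is the carry propagation delay mentioned above.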

Signed and unsigned additions

• Unsigned addition in 4-bit arithmetic

  (carry)    11
            1011
          + 0011
          ------
            1110

• 11 + 3 = 14 (8 + 4 + 2)

• Signed addition in 4-bit arithmetic

  (carry)    11
            1011
          + 0011
          ------
            1110

• -5 + 3 = -2


Signed and unsigned additions

• Same rules apply even though bit strings represent different values

• Sole difference is overflow handling

Overflow handling (I)

• No overflow in signed arithmetic

  (carry)  111
            1110
          + 0011
          ------
            0001

• -2 + 3 = 1 (correct)

• Signed addition in 4-bit arithmetic with overflow

  (carry)   11
            0110
          + 0011
          ------
            1001

• 6 + 3 = -7 (false)

Overflow handling (II)

• In signed arithmetic an overflow happens when
  – The sum of two positive numbers exceeds the maximum positive value that can be represented using n bits: 2^(n-1) - 1
  – The sum of two negative numbers falls below the minimum negative value that can be represented using n bits: -2^(n-1)

Example

• Four-bit arithmetic:
  – Sixteen possible values
  – Positive overflow happens when result > 7
  – Negative overflow happens when result < -8
• Eight-bit arithmetic:
  – 256 possible values
  – Positive overflow happens when result > 127
  – Negative overflow happens when result < -128
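The overflow rule can be checked with a few lines of Python. This is a sketch under simple assumptions: `to_signed` and `add_signed` are hypothetical helper names, operands are ordinary Python integers already within the n-bit range, and the result is wrapped to n bits the way the hardware adder would.

```python
def to_signed(bits, n_bits):
    """Interpret an n-bit pattern as a two's complement value."""
    return bits - (1 << n_bits) if bits & (1 << (n_bits - 1)) else bits


def add_signed(x, y, n_bits=4):
    """Return (wrapped_result, overflow) for a signed n-bit addition."""
    mask = (1 << n_bits) - 1
    wrapped = to_signed((x + y) & mask, n_bits)
    lowest, highest = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    overflow = not (lowest <= x + y <= highest)
    return wrapped, overflow


print(add_signed(-2, 3))        # (1, False)
print(add_signed(6, 3))         # (-7, True): 9 > 7
print(add_signed(-5, -4))       # (7, True): -9 < -8
print(add_signed(100, 28, 8))   # (-128, True): 128 > 127
```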

Overflow handling (III)

• MIPS architecture handles signed and unsigned overflows in a very different fashion:
  – Ignores unsigned overflows
    • Implements modulo 2^n arithmetic
  – Generates an interrupt whenever it detects a signed overflow
    • Lets the OS handle the condition


Why?

• To keep the CPU as simple and regular as possible

An interesting consequence

• Most C compilers ignore overflows
  – C compilers must use unsigned arithmetic for their integer operations
• Fortran compilers expect overflow conditions to be detected
  – Fortran compilers must use signed arithmetic for their integer operations

Subtraction

• Can be implemented by
  – Specific hardware
  – Negating the subtrahend and adding it


Negating a number

• Toggle all bits then add one

In 4-bit arithmetic (I)

Bits   Value   Toggle all bits, add one   Value
0000     0     1111 + 1 = 0000              0
0001     1     1110 + 1 = 1111             -1
0010     2     1101 + 1 = 1110             -2
0011     3     1100 + 1 = 1101             -3
0100     4     1011 + 1 = 1100             -4
0101     5     1010 + 1 = 1011             -5
0110     6     1001 + 1 = 1010             -6
0111     7     1000 + 1 = 1001             -7

In 4-bit arithmetic (II)

Bits   Value   Toggle all bits, add one   Value
1000    -8     0111 + 1 = 1000             ??
1001    -7     0110 + 1 = 0111              7
1010    -6     0101 + 1 = 0110              6
1011    -5     0100 + 1 = 0101              5
1100    -4     0011 + 1 = 0100              4
1101    -3     0010 + 1 = 0011              3
1110    -2     0001 + 1 = 0010              2
1111    -1     0000 + 1 = 0001              1
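The "toggle all bits then add one" rule, and subtraction by negating the subtrahend, can be sketched as follows. The names `negate` and `subtract` are illustrative only, and values are handled as raw n-bit patterns.

```python
def negate(bits, n_bits=4):
    """Two's complement negation: toggle all bits, then add one."""
    mask = (1 << n_bits) - 1
    return ((bits ^ mask) + 1) & mask


def subtract(x, y, n_bits=4):
    """Compute x - y by adding the negated subtrahend."""
    mask = (1 << n_bits) - 1
    return (x + negate(y, n_bits)) & mask


for pattern in (0b0011, 0b1101, 0b0000, 0b1000):
    print(f"{pattern:04b} -> {negate(pattern):04b}")
# 0011 -> 1101
# 1101 -> 0011
# 0000 -> 0000
# 1000 -> 1000   (-8 has no positive counterpart in 4 bits)

print(f"{subtract(0b0110, 0b0011):04b}")  # 0011, that is 6 - 3 = 3
```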


MULTIPLICATION

Decimal multiplication

       37
     x 12
     ----
       74
      370
     ----
      444

• What are the rules?
  – Successively multiply the multiplicand by each digit of the multiplier, starting at the right, and shift each partial result left by one extra position each time but the first
  – Sum all partial results

Binary multiplication

        1101
     x   101
     -------
        1101
       00000
      110100
     -------
     1000001

• What are the rules?
  – Successively multiply the multiplicand by each digit of the multiplier, starting at the right, and shift each partial result left by one extra position each time but the first
  – Sum all partial results
• Binary multiplication is easy!


Binary multiplication table

X 0 1

0 0 0

1 0 1

Algorithm

• Clear contents of 64-bit product register
• For (i = 0; i < 32; i++) {
  – If (LSB of multiplier register == 1)
    • Add contents of multiplicand register to product register
  – Shift right one position multiplier register
  – Shift left one position multiplicand register
• } // for loop
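The loop above translates almost line for line into Python. This is a minimal sketch for unsigned operands; the function name `multiply_v1` and the `n_bits` parameter are assumptions made for the example, and Python integers stand in for the hardware registers.

```python
def multiply_v1(multiplicand, multiplier, n_bits=32):
    """Shift-and-add multiplication of two unsigned n-bit values."""
    product = 0                       # clear the product register
    for _ in range(n_bits):
        if multiplier & 1:            # LSB of multiplier register is 1
            product += multiplicand   # add multiplicand to product
        multiplier >>= 1              # shift multiplier right
        multiplicand <<= 1            # shift multiplicand left
    return product


print(multiply_v1(0b0011, 0b0011, n_bits=4))  # 9
print(multiply_v1(0b1101, 0b0101, n_bits=4))  # 65
```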

Multiplier: First version

[Figure: first multiplier datapath. A 64-bit multiplicand register (shift left) and a 64-bit product register are connected to a 64-bit ALU; the multiplier register (shift right) feeds the control unit. Annotations: shifting the multiplicand left works "as we learned in grade school"; shifting the multiplier right serves "to get the next bit (LSB to MSB)".]

Explanations

• Multiplicand register must be 64 bits wide because the 32-bit multiplicand will be shifted 32 times to the left
  – Requires a 64-bit ALU
• Product register must be 64 bits wide to accommodate the result
• Contents of multiplier register are shifted 32 times to the right so that each bit successively becomes its least significant bit (LSB)

Example (I)

• Multiply 0011 by 0011

Step             Multiplicand  Multiplier  Product
Start            0011          0011        0000
First addition   0011          0011        0011

Example (II)

Step                   Multiplicand  Multiplier  Product
Shift right and left   0110          0001        0011
Second addition        0110          0001        1001

  – 0110 + 0011 = 1001

Example (III)

Step                   Multiplicand  Multiplier  Product
Shift right and left   1100          0000        1001

• Multiplier is all zeroes: we are done

First Optimization

• Must have a 64-bit ALU
  – More complex than a 32-bit ALU
• Solution is not to shift the multiplicand
  – After each cycle, the LSB being added remains unchanged
  – Will save that bit elsewhere and shift the product register one position to the right after each iteration

Binary multiplication

        1101
     x   101
     -------
        1101
       00000
      110100
     -------
     1000001

• Observe that the least significant bit added during each cycle remains unchanged

Algorithm

• Clear contents of 64-bit product register
• For (i = 0; i < 32; i++) {
  – If (LSB of multiplier register == 1)
    • Add contents of multiplicand register to product register
  – Save LSB of product register
  – Shift right one position both multiplier register and product register
• } // for loop

Multiplier: Second version

[Figure: second multiplier datapath. The multiplicand register is connected to a 32-bit ALU; the product register (64 bits) is shifted right and its LSB saved; the multiplier register is shifted right; a control and test unit drives both.]

Decimal Example (I)

• Multiply 27 by 12

Step          Multiplicand  Multiplier  Product  Result
Start         27            12          --       --
First digit   27            12          54       --

Decimal Example (II)

Step                                 Multiplicand  Multiplier  Product  Result
Shift right multiplier and product   27            1           5        4
Second digit                         27            1           32       4

Decimal Example (III)

Step                                 Multiplicand  Multiplier  Product  Result
Shift right multiplier and product   27            0           3        24

• Multiplier equals zero
• Result is obtained by concatenating contents of product and result registers
  – 324

How did it work?

• We learned
  – 27×12 = 27×10 + 27×2 = 27×10 + 54 = 270 + 54
• Algorithm uses another decomposition
  – 27×12 = 27×10 + 27×2 = 27×10 + 50 + 4 = (27×10 + 50) + 4 = 320 + 4

Example (I)

• Multiply 0011 by 0011

Step        Multiplicand  Multiplier  Product  Result
Start       0011          0011        --       --
First bit   0011          0011        0011     --

Example (II)

Step                                 Multiplicand  Multiplier  Product  Result
Shift right multiplier and product   0011          0001        0001     1
Second bit                           0011          0001        0100     1

• Product register contains 0011 + 0001 = 0100

Example (III)

Step                                 Multiplicand  Multiplier  Product  Result
Shift right multiplier and product   0011          0000        0010     01

• Multiplier equals zero
• Result is obtained by concatenating contents of product and result registers
  – 001001 = 1001 = 9

Second Optimization

• Both multiplier and product must be shifted one position to the right after each iteration

• Both are now 32-bit quantities

• Can store both quantities in the product register
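Sharing one register between multiplier and product can be sketched as below. This is an illustration for unsigned operands only; `multiply_v3` is a made-up name, and a single Python integer plays the role of the combined register, with the multiplier initially in its lower half.

```python
def multiply_v3(multiplicand, multiplier, n_bits=32):
    """Multiply using one combined register: product in the upper half,
    remaining multiplier bits in the lower half."""
    register = multiplier                    # upper half starts at zero
    for _ in range(n_bits):
        if register & 1:                     # current multiplier bit
            register += multiplicand << n_bits   # add into the upper half
        register >>= 1                       # shift product and multiplier together
    return register


print(multiply_v3(0b0011, 0b0011, n_bits=4))  # 9
print(multiply_v3(0b1101, 0b0101, n_bits=4))  # 65
```

Each right shift discards a multiplier bit that has already been used and frees one more position for the product, so no separate result register is needed.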

Multiplier: Third version

[Figure: third multiplier datapath. The multiplicand register is connected to a 32-bit ALU; a single combined multiplier + product register is shifted right and saved; a control and test unit drives it.]

Third Optimization

• Multiplication requires 32 additions and 32 shift operations

• Can have two or more partial multiplications
  – One using bits 0-15 of multiplier
  – A second using bits 16-31
  then add together the partial results


Multiplying negative numbers

• Can use the same algorithm as before but we must extend the sign bit of the product

Related MIPS instructions (I)

• Integer multiplication uses a separate pair of registers (hi and lo)

• mult $s0, $s1
  – multiply contents of register $s0 by contents of register $s1 and store results in register pair hi-lo

• multu $s0, $s1
  – same but unsigned

Related MIPS instructions (II)

• mflo $s0
  – Move contents of register lo to register $s0

• mfhi $s0
  – Move contents of register hi to register $s0


DIVISION

Division

• Implemented by successive subtractions
• Result must verify the equality

Dividend = Divisor × Quotient + Remainder

Decimal division (long division)

• What are the rules?
  – Repeatedly try to subtract a multiple of the divisor from the dividend
  – Record the multiple (or zero)
  – At each step, repeat with a lower power of ten
  – Stop when remainder is smaller than divisor

       303
      ----
  7 ) 2126
     -21
     ---
        26
       -21
       ---
         5

Binary division

• What are the rules?
  – Repeatedly try to subtract the divisor multiplied by a power of two from the dividend
  – Mark 1 for success, 0 for failure
  – At each step, shift divisor one position to the right
  – Stop when remainder is smaller than divisor

• Example: 1011 ÷ 11
  – Try 1011 – 1100: fails (X), mark 0
  – Try 1011 – 110 = 101: succeeds, mark 1
  – Try 101 – 11 = 10: succeeds, mark 1
  – Quotient is 011 and remainder is 10

Same division in decimal

• Same rules as in binary

• Example: 11 ÷ 3
  – Try 11 – 12: fails (X), mark 0
  – Try 11 – 6 = 5: succeeds, mark 1
  – Try 5 – 3 = 2: succeeds, mark 1
  – Quotient is 2 + 1 = 3 and remainder is 2

Observations

• Binary division is actually simpler
  – We start with a left-shifted version of divisor
  – We try to subtract it from dividend
    • No need to find out which multiple to subtract
  – We mark 1 for success, 0 for failure
  – We shift divisor one position right after every attempt

How to start the division

• One 64-bit register for successive remainders
  – Initialized with dividend

• One 64-bit register for divisor
  – Start with divisor in upper half

• One 32-bit register for the quotient
  – Initialized to all zeroes

How we proceed (I)

• After each step we shift the divisor to the right one position at a time

[Figure: the divisor moving one position to the right within its 64-bit register at each step]

How we proceed (II)

• After each step we shift the contents of the quotient register one position to the left
  – To make space for the new 0 or 1 being inserted

[Figure: quotient register contents shifting left as each new bit is inserted at the right]

Division Algorithm

• For i in range(0, 33): # from 0 to 32
  – Subtract contents of divisor register from remainder register
  – If remainder ≥ 0:
    • Shift quotient register to the left
    • Set new rightmost bit to 1
  – Else:
    • Undo subtraction
    • Shift quotient register to the left
    • Set new rightmost bit to 0
  – Shift right one position contents of divisor register
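The algorithm above can be tried out directly in Python. This is a sketch for unsigned operands; the function name `divide` and the `n_bits` parameter are assumptions for the example, and Python integers stand in for the remainder, divisor and quotient registers.

```python
def divide(dividend, divisor, n_bits=32):
    """Restoring division of two unsigned n-bit values.
    Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    remainder = dividend          # remainder register starts with the dividend
    divisor <<= n_bits            # divisor starts in the upper half
    quotient = 0
    for _ in range(n_bits + 1):   # 33 steps for 32-bit operands
        remainder -= divisor
        if remainder >= 0:
            quotient = (quotient << 1) | 1   # success: shift in a 1
        else:
            remainder += divisor             # undo the subtraction
            quotient <<= 1                   # failure: shift in a 0
        divisor >>= 1
    return quotient, remainder


print(divide(0b1011, 0b0011, n_bits=4))  # (3, 2)
print(divide(2126, 7))                   # (303, 5)
```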

A simple divider

[Figure: divider datapath. A 64-bit divisor register (shift right) and a 64-bit remainder register are connected to a 64-bit ALU; the quotient register is shifted left; a control and test unit drives all three.]

Signed division

• Easiest solution is to remember the sign of the operands and adjust the sign of the quotient and remainder accordingly

• A little problem:
  – 5 ÷ 2 = 2 and the remainder is 1
  – –5 ÷ 2 = –2 and the remainder is –1

• The sign of the remainder must match the sign of the dividend
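The "remember the signs, divide the magnitudes, adjust afterwards" approach might look like this in Python. It is a sketch that assumes the common convention in which the quotient is truncated toward zero and the remainder takes the sign of the dividend, which agrees with both cases listed above; `signed_divide` is a hypothetical name.

```python
def signed_divide(dividend, divisor):
    """Divide magnitudes, then fix up the signs of quotient and remainder."""
    quotient, remainder = divmod(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient      # operands of opposite signs
    if dividend < 0:
        remainder = -remainder    # remainder follows the dividend
    return quotient, remainder


print(signed_divide(5, 2))    # (2, 1)
print(signed_divide(-5, 2))   # (-2, -1)
print(signed_divide(5, -2))   # (-2, 1)
print(divmod(-5, 2))          # (-3, 1): Python's own operators round down instead
```

In every case Dividend = Divisor × Quotient + Remainder still holds.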

Related MIPS instructions

• Integer division uses the same pair of registers (hi and lo) as integer multiplication

• div $s0, $s1
  – divide contents of register $s0 by contents of register $s1, leave the quotient in register lo and the remainder in register hi

• divu $s0, $s1
  – same but unsigned


TRANSITION SLIDE

• Here end the materials that were on the first fall 2012 midterm

• Here start the materials that will be on the fall 2012 midterm

To be moved to the right place


FLOATING POINT OPERATIONS

Floating point numbers

• Used to represent real numbers
• Very similar to scientific notation
  – 3.5×10^6, 0.82×10^-5, 75×10^6, …
• Both decimal numbers in scientific notation and floating point numbers can be normalized:
  – 3.5×10^6, 8.2×10^-6, 7.5×10^7, …

Fractional binary numbers

• 0.1 is ½, or 0.5 in base ten

• 0.01 is ¼, or 0.25 in base ten

• 0.11 is ½ + ¼ = ¾, or 0.75 in base ten

• 1.1 is 1½, or 1.5 in base ten

• 10.01 is 2 + ¼, or 2.25 in base ten

• 11.11 is ______ or _____

Normalizing binary numbers

• 0.1 becomes 1.0×2^-1

• 0.01 becomes 1.0×2^-2

• 0.11 becomes 1.1×2^-1

• 1.1 is already normalized and equal to 1.1×2^0

• 10.01 becomes 1.001×2^1

• 11.11 becomes 1.______×2^_____

Representation

• Sign + exponent + coefficient

• IEEE Standard 754
  – 1 + 8 + 23 = 32 bits (single precision)
  – 1 + 11 + 52 = 64 bits (double precision)

[Figure: field layout, from left to right: S | Exp | Coefficient]

The sign bit

• 0 indicates a positive number
• 1 indicates a negative number

The exponent (I)

• Field size
  – 8 bits for single precision
  – 11 bits for double precision

• With 8 bits, we can represent exponents between -126 and +127
  – All-zeroes value is reserved for the zeroes and denormalized numbers
  – All-ones value is reserved for the infinities and NaNs (Not a Number)

The exponent (II)

• Exponents are represented using a biased notation
  – Stored value = actual exponent + bias

• For 8-bit exponents, bias is 127
  – Stored value of 1 corresponds to –126
  – Stored value of 254 corresponds to +127
  – 0 and 255 are reserved for special values

The exponent (III)

• Biased notation simplifies comparisons:
  – If two normalized floating point numbers of the same sign have different exponents, the one with the bigger exponent is the bigger of the two in magnitude

Special values (I)

• Signed zeroes:
  – IEEE 754 distinguishes between +0 and –0
  – Represented by
    • Sign bit: 0 or 1
    • Biased exponent: all zeroes
    • Coefficient: all zeroes

Special values (II)

• Denormalized numbers:
  – Numbers whose coefficient cannot be normalized
    • Smaller than 2^-126
  – Will have a coefficient with leading zeroes and exponent field equal to zero
    • Reduces the number of significant digits
    • Lowers accuracy

Special values (III)

• Infinities:
  – +∞ and –∞
  – Represented by
    • Sign bit: 0 or 1
    • Biased exponent: all ones
    • Coefficient: all zeroes

Special values (IV)

• NaN:
  – For Not a Number
  – Often result from illegal divisions:
    • 0/0, ∞/∞, ∞/–∞, –∞/∞, and –∞/–∞
  – Represented by
    • Sign bit: 0 or 1
    • Biased exponent: all ones
    • Coefficient: non zero

The coefficient

• Also known as fraction or significand
• Most significant bit is always one
  – Implicit and not represented

• Example: 0 01…1 000…0
  – Biased exponent is 127 in base ten
  – True coefficient is implicit one followed by all zeroes

Decoding a floating point number

• Sign indicated by first bit
• Subtract 127 from biased exponent <be> to obtain power of two: <be> – 127
• Use coefficient to construct a normalized binary value with a binary point: 1.<coefficient>
• Number being represented is 1.<coefficient> × 2^(<be> – 127)
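The decoding steps can be sketched for a 32-bit single-precision word as follows. The function name `decode_single` is made up for the example, the word is passed as an unsigned integer, and the special values (biased exponent 0 or 255) are deliberately left out.

```python
def decode_single(word):
    """Decode a normalized IEEE 754 single-precision bit pattern."""
    sign = word >> 31
    biased_exponent = (word >> 23) & 0xFF
    coefficient = word & 0x7FFFFF            # the 23 stored bits
    if biased_exponent in (0, 255):
        raise ValueError("zero, denormalized number, infinity or NaN")
    # 1.<coefficient> x 2^(<be> - 127)
    value = (1 + coefficient / 2**23) * 2.0 ** (biased_exponent - 127)
    return -value if sign else value


print(decode_single(0x3F800000))  # 1.0
print(decode_single(0x40400000))  # 3.0
print(decode_single(0xBF600000))  # -0.875
```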

First example

0 01…1 000…0

• Sign bit is zero: number is positive
• Biased exponent is 127: power of two is zero
• Normalized binary value is 1.0000000
• Number is 1×2^0 = 1

Second example

0 10…0 1000…0

• Sign bit is zero: number is positive
• Biased exponent is 128: power of two is 1
• Normalized binary value is 1.1000000
• Number is 1.1×2^1 = 11 in binary = 3 in base ten

Third example

1 01…10 11000…0

• Sign bit is 1: number is negative
• Biased exponent is 126: power of two is –1
• Normalized binary value is 1.1100000
• Number is –1.11×2^-1 = –0.111 in binary = –7/8 in base ten

Can we do it now?

0 129 (base ten) 101000…0

• Sign bit is 0: number is ___________
• Biased exponent is 129: power of two is _______
• Normalized binary value is 1.__________
• Number is _________________________

Encoding a floating point number

• Use sign to pick sign bit
• Normalize the number:
  – Convert it to form 1.<more bits> × 2^<exp>
• Add 127 to exponent <exp> to obtain biased exponent <be>
• Coefficient <coeff> is equal to fractional part <more bits> of number
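The encoding steps can be sketched the same way. This is an illustration only: `encode_single` is a hypothetical name, `math.frexp` is used to do the normalization, extra coefficient bits are simply truncated rather than rounded, and zeroes, denormalized numbers, infinities and NaNs are not handled.

```python
import math


def encode_single(x):
    """Build the 32-bit single-precision pattern of a normalized number."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("special values are not handled in this sketch")
    sign = 1 if x < 0 else 0
    mantissa, exp = math.frexp(abs(x))   # abs(x) = mantissa * 2**exp, 0.5 <= mantissa < 1
    significand = mantissa * 2           # now in the form 1.<more bits>
    exponent = exp - 1
    if not -126 <= exponent <= 127:
        raise ValueError("exponent out of range for a normalized single")
    biased_exponent = exponent + 127
    coefficient = int((significand - 1) * 2**23)   # truncates extra bits
    return (sign << 31) | (biased_exponent << 23) | coefficient


print(f"{encode_single(7):032b}")     # 01000000111000000000000000000000
print(f"{encode_single(0.5):032b}")   # 00111111000000000000000000000000
print(f"{encode_single(-2):032b}")    # 11000000000000000000000000000000
```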

First example

• Represent 7:
  – Convert to binary: 111
  – Normalize: 1.11×2^2
  – Sign bit is 0
  – Biased exponent is 127 + 2 = 10000001 in binary
  – Coefficient is 1100…0

0 10…01 11000…0

Second example

• Represent 1/2:
  – Convert to binary: 0.1
  – Normalize: 1.0×2^-1
  – Sign bit is 0
  – Biased exponent is 127 – 1 = 01111110 in binary
  – Coefficient is 00…0

0 01…10 000…0

Third example

• Represent –2:
  – Convert to binary: 10
  – Normalize: 1.0×2^1
  – Sign bit is 1
  – Biased exponent is 127 + 1 = 10000000 in binary
  – Coefficient is 00…0

1 10…0 000…0

Fourth example

• Represent 9/4:
  – Convert to binary: 1001×2^-2 = 10.01
  – Normalize: 1.001×2^1
  – Sign bit is 0
  – Biased exponent is 127 + 1 = 10000000 in binary
  – Coefficient is 0010…0

0 10…0 0010…0

Can we do it now?

• Represent 6.25:
  – Convert to binary: ________
  – Normalize: 1.______×2^_______
  – Sign bit is _____
  – Biased exponent is 127 + ___ = ______ in base ten
  – Coefficient is _________

Range

• Can represent numbers between 1.00…0×2^-126 and 1.11…1×2^127
  – Say between 2^-126 and 2^128

• Observing that 2^10 ≈ 10^3, we divide the exponents by 10 and multiply them by 3 to obtain the interval expressed in powers of 10
  – Approximate range is 10^-38 to 10^38

Accuracy

• We have 24 significant bits
  – Theoretical precision of 1/2^24, that is, roughly 1/10^7
• Cannot add correctly billions or trillions
• Actual situation is worse if we do too many computations
  – 1,000,000 – 999,999.4875 = ???
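The effect of having only 24 significant bits can be observed from Python, whose own floats are double precision. The helper `to_float32` below is an assumption of this sketch: it rounds a value to single precision by packing it into four bytes with the standard `struct` module and unpacking it again.

```python
import struct


def to_float32(x):
    """Round a double-precision value to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


a = to_float32(1_000_000.0)
b = to_float32(999_999.4875)
print(b)                    # 999999.5: the nearest single-precision value
print(to_float32(a - b))    # 0.5, while the exact difference is 0.5125
print(to_float32(16_777_216.0 + 1.0))   # 16777216.0: 2^24 + 1 is not representable
```

Near one million, consecutive single-precision values are 1/16 apart, so 999,999.4875 cannot be stored exactly.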


Guard bits

• Do all arithmetic operations with two additional bits to reduce rounding errors

Double precision arithmetic (I)

• Use 64-bit double words
• Allows us to have
  – One bit for sign
  – Eleven bits for exponent
    • 2,048 possible values
  – Fifty-two bits for coefficient
    • Plus the implicit leading bit

Double precision arithmetic (II)

• Exponents are still represented using a biased notation
  – Stored value = actual exponent + bias

• For 11-bit exponents, bias is 1023
  – Stored value of 1 corresponds to –1,022
  – Stored value of 2,046 corresponds to +1,023
  – Stored values of 0 and 2,047 are reserved for special cases

Double precision arithmetic (III)

• Can now represent numbers between 1.00…0×2^-1022 and 1.11…1×2^1023
  – Say between 2^-1022 and 2^1024
  – Approximate range is 10^-307 to 10^307
    • In reality, more like 10^-308 to 10^308

Double precision arithmetic (IV)

• We now have 53 significant bits
  – Theoretical precision of 1/2^53, that is, roughly 1/10^16
• Can now add correctly billions or trillions

If that is not enough, …

• Can use 128-bit quad words
• Allows us to have
  – One bit for sign
  – Fifteen bits for exponent
    • From –16382 to +16383
  – One hundred twelve bits for coefficient
    • Plus the implicit leading bit

Decimal floating point addition (I)

• 5.25×10^3 + 1.22×10^2 = ?
• Denormalize number with smaller exponent: 5.25×10^3 + 0.122×10^3
• Add the numbers: 5.25×10^3 + 0.122×10^3 = 5.372×10^3
• Result is normalized

Decimal floating point addition (II)

• 9.25×10^3 + 8.22×10^2 = ?
• Denormalize number with smaller exponent: 9.25×10^3 + 0.822×10^3
• Add the numbers: 9.25×10^3 + 0.822×10^3 = 10.072×10^3
• Normalize the result: 10.072×10^3 = 1.0072×10^4

Binary floating point addition (I)

• Say 1001 + 10 or 1.001×2^3 + 1.0×2^1
• Denormalize number with smaller exponent: 1.001×2^3 + 0.01×2^3
• Add the numbers: 1.001×2^3 + 0.01×2^3 = 1.011×2^3
• Result is normalized

Binary floating point addition (II)

• Say 101 + 11 or 1.01×2^2 + 1.1×2^1
• Denormalize number with smaller exponent: 1.01×2^2 + 0.11×2^2
• Add the numbers: 1.01×2^2 + 0.11×2^2 = 10.00×2^2
• Normalize the result: 10.00×2^2 = 1.000×2^3

Binary floating point subtraction

• Say 101 – 11 or 1.01×2^2 – 1.1×2^1
• Denormalize number with smaller exponent: 1.01×2^2 – 0.11×2^2
• Perform the subtraction: 1.01×2^2 – 0.11×2^2 = 0.10×2^2
• Normalize the result: 0.10×2^2 = 1.0×2^1
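The denormalize, add, renormalize sequence can be sketched with integer significands. Everything here is an assumption made for illustration: a number is a pair (significand, exponent), the significand is an integer holding `frac_bits` bits after the binary point, the first operand must have the larger exponent, bits shifted out during alignment are simply dropped, and negative results are not handled.

```python
def fp_add(a, b, frac_bits, subtract=False):
    """Add (or subtract) two (significand, exponent) pairs and normalize."""
    (sig_a, exp_a), (sig_b, exp_b) = a, b
    if exp_a < exp_b:
        raise ValueError("sketch expects the first operand to have the larger exponent")
    sig_b >>= exp_a - exp_b                 # denormalize the smaller number
    sig = sig_a - sig_b if subtract else sig_a + sig_b
    if sig < 0:
        raise ValueError("negative results are not handled in this sketch")
    exp = exp_a
    one = 1 << frac_bits                    # integer representing 1.0
    while sig >= 2 * one:                   # e.g. 10.00 becomes 1.000
        sig >>= 1
        exp += 1
    while 0 < sig < one:                    # e.g. 0.10 becomes 1.0
        sig <<= 1
        exp -= 1
    return sig, exp


# 1.001 x 2^3 + 1.0 x 2^1, three fraction bits
print(fp_add((0b1001, 3), (0b1000, 1), frac_bits=3))   # (11, 3), i.e. 1.011 x 2^3
# 1.01 x 2^2 + 1.1 x 2^1, two fraction bits
print(fp_add((0b101, 2), (0b110, 1), frac_bits=2))     # (4, 3), i.e. 1.00 x 2^3
# 1.01 x 2^2 - 1.1 x 2^1
print(fp_add((0b101, 2), (0b110, 1), frac_bits=2, subtract=True))  # (4, 1), i.e. 1.00 x 2^1
```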

Decimal floating point multiplication

• Exponent of product is the sum of the exponents of multiplicand and multiplier
• Coefficient of product is the product of the coefficients of multiplicand and multiplier
• Compute sign using usual rules of arithmetic
• May have to renormalize the product

Decimal floating point multiplication: example

• 6×10^3 × 2.5×10^2 = ?
• Exponent of product is: 3 + 2 = 5
• Multiply the coefficients: 6 × 2.5 = 15
• Result will be positive
• Normalize the result: 15×10^5 = 1.5×10^6

Binary floating point multiplication

• Exponent of product is the sum of the exponents of multiplicand and multiplier
• Coefficient of product is the product of the coefficients of multiplicand and multiplier
• Compute sign using usual rules of arithmetic
• May have to renormalize the product

Binary floating point multiplication: example

• Say 110 × 11 or 1.1×2^2 × 1.1×2^1
• Exponent of product is: 2 + 1 = 3
• Multiply the coefficients: 1.1 × 1.1 = 10.01
• Result will be positive
• Normalize the result: 10.01×2^3 = 1.001×2^4

FP division

• Very tricky
• One good solution is to multiply the dividend by the inverse of the divisor

A trap

• Addition is not necessarily associative:
  – Consider –9×10^37 + 9×10^37 + 4×10^-37
• Observe that
  – (–9×10^37 + 9×10^37) + 4×10^-37 = 4×10^-37
  while
  – –9×10^37 + (9×10^37 + 4×10^-37) = 0
  due to the limited accuracy of FP numbers
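The same trap shows up even with Python's double-precision floats, since 4×10^-37 is far too small to change 9×10^37. A minimal check, with variable names chosen only for this example:

```python
big, small = 9e37, 4e-37

left = (-big + big) + small    # the two big values cancel first
right = -big + (big + small)   # small is absorbed by big before the cancellation

print(left)            # 4e-37
print(right)           # 0.0
print(left == right)   # False
```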


IMPLEMENTATIONS

The floating-point unit (I)

• Floating-point instructions were an optional feature
  – User had to buy a separate floating-point unit, aka floating-point coprocessor
• Before the Intel 80486, all Intel x86 architectures had the option to install a separate floating-point chip (8087, 80287, 80387)


The floating-point unit (II)

• Default solution was to simulate the missing floating-point instructions through assembly routines

• As a result, many processor architectures use separate banks of registers for integer arithmetic and floating point arithmetic

The floating-point unit (III)

• Some older architectures implemented
  – Single-precision operations in hardware through the FPU
  – Double-precision operations by software
• Made double-precision operations much costlier than single-precision operations


IBM 360 FP INSTRUCTIONS

Overview

• FPU offers a very familiar user interface
  – Eight general purpose FP registers
    • Distinct from the integer registers
  – Two-operand instructions in both RR and RX formats
• Includes single-precision and double-precision versions of addition, subtraction, multiplication and division

Examples of RR instructions

• AFR f1, f2: add contents of floating-point register f2 into f1
• ADR f1, f2: add contents of double-precision register f2 into f1
• LFR f1, f2: load contents of floating-point register f2 into f1
• Also had load positive, load negative, load complement instructions for floating-point and double-precision operands

Examples of RX instructions

• AF r1, d(r2): add contents of word at address d + contents(r2) into register r1
• AD r1, d(r2) …


MIPS FP INSTRUCTIONS

Overview

• Thirty-two specialized single-precision registers: $f0, $f1, … $f31
• Each pair of single-precision registers forms a double-precision register
• *.s instructions apply to single precision format
• *.d instructions apply to double precision format
• Most instructions are in the R format

R-format instructions (I)

• add.s f1, f2, f3: f1 = f2 + f3 (single precision)
• add.d f2, f4, f6: (f2, f2+1) = (f4, f4+1) + (f6, f6+1) (double precision applies to register pairs)
• sub.s f1, f2, f3: f1 = f2 – f3 (single precision)
• sub.d f2, f4, f6 (double precision)
• mul.s f1, f2, f3: f1 = f2×f3 (single precision)
• mul.d f2, f4, f6 (double precision)

R-format instructions (II)

• div.s f1, f2, f3: f1 = f2 / f3 (single precision)
• div.d f2, f4, f6 (double precision)
• c.x.s f1, f2: FP condition = (f1 x f2) ? 1 : 0, where x can be equal, not equal, less than, less than or equal, greater than, greater than or equal
• c.x.d f2, f4 (double precision)

I-format instructions (I)

• bc1t a: jump to address computed by adding 4×a to the current value of the PC if the FP condition is true
• bc1f a: jump to address computed by adding 4×a to the current value of the PC if the FP condition is false

I-format instructions (II)

• lwc1 f1, a(r1): load floating-point word at address a + contents(r1) into f1
• ldc1 f2, a(r1) (double precision)
• swc1 f1, a(r1): store floating-point value in f1 into word at address a + contents(r1)
• sdc1 f2, a(r1) (double precision)

The "c" in the opcodes stands for coprocessor!


x86 FP INSTRUCTIONS

Overview

• Original x86 FP coprocessor had a stack architecture
• Stack registers were 80 bits wide, as were all internal registers
  – Better accuracy
• Provided single and double precision operations

Stack operations (I)

• Three types of operations:
  – Loads push an operand on the top of the stack
  – Arithmetic and comparison operations find two operands on the top of the stack and replace them by the result of the operation
  – Stores move the top of stack register into memory

Example

• a = b + c
  – Load b on top of stack
  – Load c on top of stack
  – Add c to b
  – Store result into a

[Figure: stack contents after each step: b; then c above b; then b + c; then empty]

Stack operations (II)

• Instruction set also allowed
  – Operations on top of stack register and the ith register below
  – Immediate operands
  – Operations on top of stack register and a memory location
• Poor performance of FP unit architecture motivated an extension to the x86 instruction set

Intel SSE2 FP Architecture (I)

• SSE2 Extension (2001) provided 8 floating point registers
  – Could hold either single precision or double precision values
  – Number extended to 16 by AMD, followed by Intel

Intel SSE2 FP Architecture (II)

• Registers are now 128 bits wide
  – Can hold
    • One quad precision value
    • Two double precision values
    • Four single precision values
• Can perform same operation in parallel on all single/double precision values stored in the same register

Wow!


REVIEW QUESTIONS


Review questions

• How would you represent 0.5 in double precision?

• How would you convert this double-precision value into a single precision format?

• When doing accounting, we could do all the computations in cents using integer arithmetic. What would we win? What would we lose?

Solutions

• How would you represent 0.5 in double precision?
  – Normalized representation: 1.0×2^-1
  – Sign: 0
  – Biased exponent: 1023 – 1 = 1022
  – Coefficient: All zeroes
    • Because the 1 is implicit

Solutions

• How would you convert this double-precision value into a single precision format?
  – Same normalized representation: 1.0×2^-1
  – Same sign: 0
  – New biased exponent: 127 – 1 = 126
  – Same coefficient: All zeroes
    • Because the 1 is implicit

Solutions

• When doing accounting, we could do all the computations in cents using integer arithmetic. What would we win? What would we lose?
  – Big plus:
    • The results would be exact
  – Big minus:
    • Could not handle numbers bigger than $20,000,000 in 32-bit signed arithmetic

Why $20,000,000?

• 32-bit unsigned arithmetic can represent numbers from 0 to 2^32 – 1
• 32-bit signed arithmetic can represent numbers from –2^31 to 2^31 – 1
  – Roughly from –2,000,000,000 to 2,000,000,000
• Must divide by 100 as we were using cents!
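Both sides of the trade-off can be checked in a few lines. This is only an illustration: `INT32_MAX` is a name introduced here for the largest 32-bit signed value, and Python's own integers are unbounded, so the limit is computed rather than enforced.

```python
INT32_MAX = 2**31 - 1

# Binary floating point cannot represent most amounts in cents exactly
print(0.10 + 0.20 == 0.30)    # False

# The same sum in integer cents is exact
print(10 + 20 == 30)          # True

# Largest amount that fits in a 32-bit signed count of cents
dollars, cents = divmod(INT32_MAX, 100)
print(f"${dollars:,}.{cents:02d}")   # $21,474,836.47
```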


TRANSITION SLIDE

• Here end the materials that were on the first fall 2012 midterm