IKI10230 Pengantar Organisasi Komputer (Introduction to Computer Organization)
Chapter 6: Arithmetic

7 & 14 May 2003
Bobby Nazief ([email protected]), Qonita Shahab ([email protected])
Course materials: http://www.cs.ui.ac.id/kuliah/iki10230/
Sources: 1. Hamacher. Computer Organization, ed-5. 2. Lecture materials from CS61C/2000 & CS152/1997, UCB.

Author: rhea

Post on 15-Jan-2016

*
So far, unsigned numbers
Obvious solution for signed numbers: use the leftmost bit as the sign
0 => +, 1 => -
This representation is called sign and magnitude
*
Arithmetic circuit is more complicated
Special steps are needed depending on whether the signs are the same or not
Also, there are two zeros (+0 and -0)
Sign and magnitude was abandoned
Another try: negate a number by inverting its bits; this is called one’s complement
Note: positive numbers have leading 0s, negative numbers have leading 1s.
What is -00000 ?
How many negative numbers are there?
*
The obvious solution didn’t work, so find another
What is the result for unsigned numbers if you try to subtract a large number from a small one?
It would try to borrow from the string of leading 0s,
so the result would have a string of leading 1s
With no obvious better alternative, pick the representation that makes the hardware simple: leading 0s positive, leading 1s negative
000000...xxx is >=0, 111111...xxx is < 0
This representation called two’s complement
*
$2^{N-1}$ non-negatives
$2^{N-1}$ negatives
One zero; 1st bit => >=0 or <0, called the sign bit
but one negative with no positive counterpart: -2,147,483,648ten
*
Two’s Complement Formula
Can represent positive and negative numbers in terms of the bit value times a power of 2:
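$d_{31} \times (-2^{31}) + d_{30} \times 2^{30} + \dots + d_2 \times 2^2 + d_1 \times 2^1 + d_0 \times 2^0$
Example
1111 1111 1111 1111 1111 1111 1111 1100two
$= 1 \times (-2^{31}) + 1 \times 2^{30} + 1 \times 2^{29} + \dots + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$
$= -2^{31} + 2^{30} + 2^{29} + \dots + 2^2 + 0 + 0$
= -2,147,483,648ten + 2,147,483,644ten
= -4ten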
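The weighted-sum formula can be checked with a short sketch. The function name `twos_complement_value` is hypothetical (it is not part of the lecture); the code only assumes that the leftmost bit has weight $-2^{N-1}$ and every other bit has its usual positive weight.

```python
def twos_complement_value(bits: str) -> int:
    """Evaluate an N-bit two's complement pattern with the weighted-sum formula."""
    digits = [int(b) for b in bits.replace(" ", "")]
    n = len(digits)
    # The most significant bit carries weight -2^(n-1).
    total = digits[0] * -(2 ** (n - 1))
    # Every remaining bit d_i carries the positive weight 2^i.
    for position, d in enumerate(digits[1:], start=1):
        total += d * 2 ** (n - 1 - position)
    return total


print(twos_complement_value("1111 1111 1111 1111 1111 1111 1111 1100"))  # -4
```

The same function works for any width, for example the 4-bit patterns used later for the adder examples.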
*
Two’s complement shortcut: Negation
Invert every 0 to 1 and every 1 to 0, then add 1 to the result
Sum of number and its one’s complement must be 111...111two
111...111two= -1ten
Let x’ mean the inverted representation of x
Then x + x’ = -1  =>  x + x’ + 1 = 0  =>  x’ + 1 = -x
Example: -4 to +4 to -4
x : 1111 1111 1111 1111 1111 1111 1111 1100two
x’: 0000 0000 0000 0000 0000 0000 0000 0011two
+1: 0000 0000 0000 0000 0000 0000 0000 0100two
()’: 1111 1111 1111 1111 1111 1111 1111 1011two
+1: 1111 1111 1111 1111 1111 1111 1111 1100two
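A minimal sketch of the negation shortcut, assuming 32-bit words stored in ordinary Python integers. The names `MASK32` and `negate32` are illustrative only.

```python
MASK32 = 0xFFFFFFFF  # keeps every result within 32 bits


def negate32(x: int) -> int:
    """Two's complement negation: invert every bit, then add 1."""
    inverted = x ^ MASK32          # x' (every 0 becomes 1 and every 1 becomes 0)
    return (inverted + 1) & MASK32


minus4 = 0xFFFFFFFC                # 1111 ... 1100two
plus4 = negate32(minus4)           # 0000 ... 0100two
print(f"{plus4:032b}")
print(f"{negate32(plus4):032b}")   # back to 1111 ... 1100two
```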
*
Two’s comp. shortcut: Sign extension
Convert 2’s complement number using n bits to more than n bits
Simply replicate the most significant bit (sign bit) of smaller to fill new bits
2’s comp. positive number has infinite 0s
2’s comp. negative number has infinite 1s
Bit representation hides leading bits;
sign extension restores some of them
16-bit -4ten to 32-bit:
1111 1111 1111 1100two
1111 1111 1111 1111 1111 1111 1111 1100two
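Sign extension can be sketched as follows. The function `sign_extend` and its parameter names are assumptions made for illustration; bit patterns are held in non-negative Python integers.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen a two's complement pattern by replicating its sign bit."""
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        # Fill the new upper bits (from_bits .. to_bits-1) with 1s.
        upper_ones = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper_ones
    return value


print(f"{sign_extend(0xFFFC, 16, 32):032b}")  # 16-bit -4 widened to 32 bits
print(f"{sign_extend(0x0004, 16, 32):032b}")  # 16-bit +4 keeps its leading 0s
```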
*
The outputs are $Sum_i$, $CarryOut_i$
Note: $CarryIn_{i+1} = CarryOut_i$
*
CarryOut = A·B + A·CarryIn + B·CarryIn
implement gates for Sum
implement gates for CarryOut
[Figure: 1-bit full adder with inputs A, B, CarryIn and outputs Sum, CarryOut]
Critical path of an n-bit ripple-carry adder is n*CP
CP = 2 gate-delays (Cout = AB + ACin + BCin)
[Figure: 4-bit ripple-carry adder built from four 1-bit full adders (FA); stage i has inputs Ai, Bi, CarryIni and outputs Sumi, CarryOuti, with CarryOuti feeding CarryIni+1]
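The chain of full adders can be simulated in a few lines. This is only a sketch: the function names are hypothetical, bits are passed as lists with index 0 as the least significant bit, and Sum is taken as A XOR B XOR CarryIn (the usual full-adder sum, consistent with the full-adder truth table).

```python
def full_adder(a: int, b: int, carry_in: int):
    """1-bit full adder: returns (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return total, carry_out


def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """Add two equal-length bit lists (index 0 = LSB); the carry ripples upward."""
    carry = carry_in
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # CarryIn of stage i+1 = CarryOut of stage i
        sums.append(s)
    return sums, carry


# 0101 (+5) + 0010 (+2), written LSB first
print(ripple_carry_add([1, 0, 1, 0], [0, 1, 0, 0]))
```

Each stage has to wait for the carry from the stage below it, which is why the critical path grows with n.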
[Figure: three adder cells, each with inputs A, B and outputs S (sum), G (generate), P (propagate)]
C2 = G1 + G0 · P1 + C0 · P0 · P1
C3 = G2 + G1 · P2 + G0 · P1 · P2 + C0 · P0 · P1 · P2
G = G3 + P3·G2 + P3·P2·G1 + P3·P2·P1·G0
C4 = . . .
[Table: C-out as a function of A, B and Cin]
Names: suppose G0 is 1 => carry no matter what else => generates a carry
suppose G0 =0 and P0=1 => carry IFF C0 is a 1 => propagates a carry
Like dominoes
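The lookahead equations can be evaluated directly, without rippling. In this sketch the definitions G_i = A_i AND B_i and P_i = A_i OR B_i are assumptions (the slide text only names the signals), and the expression for C4 simply continues the pattern of C1 to C3. The names `to_bits` and `lookahead_carries` are illustrative.

```python
def to_bits(x: int, n: int = 4):
    """Split x into a list of n bits, index 0 = least significant bit."""
    return [(x >> i) & 1 for i in range(n)]


def lookahead_carries(a_bits, b_bits, c0=0):
    """Compute C1..C4 of a 4-bit adder from generate/propagate signals."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate (assumed definition)
    p = [a | b for a, b in zip(a_bits, b_bits)]   # propagate (assumed definition)
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])
    # C4 follows the same pattern (an extrapolation, not copied from the slides).
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3]) | (g[0] & p[1] & p[2] & p[3])
          | (c0 & p[0] & p[1] & p[2] & p[3]))
    return [c1, c2, c3, c4]


print(lookahead_carries(to_bits(0b1011), to_bits(0b1110)))
```

Every carry depends only on the G, P and C0 inputs, so all of them are available after the same small number of gate delays.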
*
All carries can be obtained in 3 gate-delays:
1 needed to develop all Pi and Gi
2 needed in the AND-OR circuit
All sums can be obtained in 6 gate-delays:
3 needed to obtain carries
1 needed to invert carry
2 needed in the AND-OR circuit of Sum’s circuit
Independent of the number of bits (n)
4-bit Adder:
16-bit Adder:
*
C1 = G0 + C0 · P0
C2 = G1 + G0 · P1 + C0 · P0 · P1
C3 = G2 + G1 · P2 + G0 · P1 · P2 + C0 · P0 · P1 · P2
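C4 = . . .
[Figure: 4-bit adders connected through a carry-lookahead (CLA) unit; the unit takes C0 and the per-block signals G0, P0 and produces the block signals G, P]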
Result will be correct, provided there’s no overflow
Addition:
    0 1 0 1  (+5)          0 1 0 1  (+5)
  + 0 0 1 0  (+2)        + 1 0 1 0  (-6)
    0 1 1 1  (+7)          1 1 1 1  (-1)

    1 0 1 1  (-5)          0 1 1 1  (+7)
  + 1 1 1 0  (-2)        + 1 1 0 1  (-3)
  1 1 0 0 1  (-7)        1 0 1 0 0  (+4)

Subtraction:
    0 0 1 0  (+2)          0 0 1 0
  - 0 1 0 0  (+4)        + 1 1 0 0  (-4)
                           1 1 1 0  (-2)

    1 1 1 0  (-2)          1 1 1 0
  - 1 0 1 1  (-5)        + 0 1 0 1  (+5)
                         1 0 0 1 1  (+3)
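The subtraction examples replace A - B by A + (two's complement of B) and drop the carry out of the top bit. A minimal 4-bit sketch, with the hypothetical helper name `sub4`:

```python
def sub4(a: int, b: int) -> int:
    """4-bit subtraction done as addition of the two's complement of b."""
    mask = 0xF
    neg_b = ((b ^ mask) + 1) & mask   # invert the bits of b, then add 1
    return (a + neg_b) & mask         # the carry out of bit 3 is discarded


print(f"{sub4(0b0010, 0b0100):04b}")  # (+2) - (+4)
print(f"{sub4(0b1110, 0b1011):04b}")  # (-2) - (-5)
```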
*
7 + 3 = 10 but ...
-4 - 5 = -9 but ...
    0 1 1 1  (7)           1 1 0 0  (-4)
  + 0 0 1 1  (3)         + 1 0 1 1  (-5)
    1 0 1 0  (-6)          0 1 1 1  (7)
So far so good, but life is not always perfect.
Consider the case 7 plus 3: you should get 10.
But if you perform the binary arithmetic on our 4-bit adder you will get 1010, which is negative 6.
Similarly, if you try to add negative 4 and negative 5 together, you should get negative 9.
But the binary arithmetic will give you 0111, which is 7.
So what went wrong? The problem is overflow.
The numbers you get are simply too big, in the positive 10 case, and too small, in the negative 9 case, to be represented by four bits.
Overflow Detection
Overflow: the result is too large (or too small) to represent properly
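Example: -8 <= 4-bit binary number <= 7
When adding operands with different signs, overflow cannot occur!
Overflow occurs when adding:
  two positive numbers and the sum is negative
  two negative numbers and the sum is positive
On your own: Prove you can detect overflow by:
  Carry into MSB ≠ Carry out of MSB
[Figure: the two worked examples with their carries, 0111 (7) + 0011 (3) = 1010 (-6) and 1100 (-4) + 1011 (-5) = 0111 (7)]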
Recall from some earlier slides that the biggest positive number you can represent using 4 bits is 7 and the smallest negative number you can represent is negative 8.
So any time your addition results in a number bigger than 7 or less than negative 8, you have an overflow.
Keep in mind that whenever you add two numbers that have different signs, that is, a negative number and a positive number, overflow can NOT occur.
Overflow occurs when you add two positive numbers together and the sum has a negative sign, or when you add two negative numbers together and the sum has a positive sign.
If you spend some time, you can convince yourself that if the carry into the most significant bit is NOT the same as the carry coming out of the MSB, you have an overflow.
Carry into MSB ≠ Carry out of MSB
For an N-bit adder: Overflow = CarryIn[N - 1] XOR CarryOut[N - 1]
[Figure: 4-bit ripple-carry adder (four 1-bit FAs producing Result0 to Result3); CarryIn3 and CarryOut3 feed an XOR gate whose output is Overflow]
XOR truth table:
X  Y  | X XOR Y
0  0  | 0
0  1  | 1
1  0  | 1
1  1  | 0
Recall the XOR gate implements the not equal function: that is, its output is 1 only if the inputs have different values.
Therefore all we need to do is connect the carry into the most significant bit and the carry out of the most significant bit to the XOR gate.
Then the output of the XOR gate will give us the Overflow signal.
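The XOR rule can be tried on the examples from the notes. A sketch under stated assumptions: operands are n-bit patterns held in Python integers, and `add_with_overflow` is a hypothetical helper name.

```python
def add_with_overflow(a: int, b: int, n: int = 4):
    """Add two n-bit patterns; return (n-bit result, overflow flag)."""
    mask = (1 << n) - 1
    low_mask = (1 << (n - 1)) - 1
    # Carry into the MSB comes from adding the lower n-1 bits.
    carry_into_msb = ((a & low_mask) + (b & low_mask)) >> (n - 1)
    total = (a & mask) + (b & mask)
    carry_out_of_msb = total >> n
    overflow = carry_into_msb ^ carry_out_of_msb
    return total & mask, overflow


print(add_with_overflow(0b0111, 0b0011))  # 7 + 3
print(add_with_overflow(0b1100, 0b1011))  # -4 + -5
print(add_with_overflow(0b0111, 0b1101))  # 7 + -3, different signs
```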
CC Flags will be set/cleared by arithmetic operations:
N (negative): 1 if result is negative (MSB = 1), otherwise 0
C (carry): 1 if carry-out(borrow) is generated, otherwise 0
V (overflow): 1 if overflow occurs, otherwise 0
Z (zero): 1 if result is zero, otherwise 0
    0 1 0 1  (+5)          0 1 1 1  (+7)
  + 1 0 1 0  (-6)        + 1 1 0 1  (-3)
    1 1 1 1  (-1)        1 0 1 0 0  (+4)

    0 1 0 1  (+5)          0 0 1 1  (+3)
  + 0 1 0 0  (+4)        + 1 1 0 1  (-3)
    1 0 0 1  (-7?)       1 0 0 0 0  (0)
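The four flags can be computed for each of these additions. This is a sketch only: `add_flags` is a hypothetical name, operands are 4-bit patterns by default, and V is derived from the operand and result signs (two operands of the same sign giving a result of the other sign).

```python
def add_flags(a: int, b: int, n: int = 4):
    """Add two n-bit patterns and return (result, flags) with N, C, V, Z."""
    mask = (1 << n) - 1
    total = (a & mask) + (b & mask)
    result = total & mask

    def sign(x: int) -> int:
        return (x >> (n - 1)) & 1

    flags = {
        "N": sign(result),                                        # MSB of the result
        "C": total >> n,                                          # carry out of the MSB
        "V": int(sign(a) == sign(b) and sign(result) != sign(a)), # signed overflow
        "Z": int(result == 0),
    }
    return result, flags


for a, b in [(0b0101, 0b1010), (0b0111, 0b1101), (0b0101, 0b0100), (0b0011, 0b1101)]:
    result, flags = add_flags(a, b)
    print(f"{a:04b} + {b:04b} = {result:04b}", flags)
```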
*
Multiplicand 1101 (13)
Multiplier 1011 (11)
Product 10001111 (143)
Binary makes it easy:
0 => place 0 ( 0 x multiplicand)
1 => place a copy ( 1 x multiplicand)
*
[Figure: 4 x 4 array multiplier; four rows of multiplicand bits A3..A0, each row gated by one multiplier bit B0..B3, summed into product bits P0..P7]
at each stage shift A left ( x 2)
use next bit of B to determine whether to add in shifted multiplicand
accumulate 2n bit partial product at each stage
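The three steps above can be sketched for unsigned operands. The function name `shift_add_multiply` and the default width of 4 bits are assumptions for illustration.

```python
def shift_add_multiply(a: int, b: int, n: int = 4) -> int:
    """Unsigned n-bit by n-bit multiplication giving a 2n-bit product."""
    product = 0
    shifted_a = a                     # multiplicand, shifted left once per stage
    for i in range(n):
        if (b >> i) & 1:              # next bit of B decides whether to add
            product += shifted_a
        shifted_a <<= 1               # shift A left (x 2)
    return product & ((1 << (2 * n)) - 1)


print(f"{shift_add_multiply(0b1101, 0b1011):08b}")  # 13 x 11
```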
[Figure: shift-and-add multiplier; multiplier bits B0..B3 select shifted copies of the multiplicand A3..A0, which are accumulated into product bits P0..P7]
Proceed as above
Works well for both Negative & Positive Multipliers
Example 2 x 6 = 0010 x 0110:
        0010
      x 0110
    +   0000    shift (0 in multiplier)
    +  0010     add (1 in multiplier)
    + 0010      add (1 in multiplier)
    +0000       shift (0 in multiplier)
    00001100
FA with add or subtract gets same result in more than one way:
6 = -2 + 8
0110 = -00010 + 01000 = 11110 + 01000
For example:
        0010
      x 0110
    00000000    shift (0 in multiplier)
  + 1111110     sub (first 1 in multiplier)
  + 000000      shift (middle of string of 1s)
  + 00010       add (prior step had last 1)
    00001100
*
Current bit  Bit to the right  Explanation           Example      Op
1            0                 Begins run of 1s      0001111000   sub
1            1                 Middle of run of 1s   0001111000   none
0            1                 End of run of 1s      0001111000   add
0            0                 Middle of run of 0s   0001111000   none
Originally for Speed (when shift was faster than add)
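A sketch of Booth's algorithm that follows the table: look at the current bit and the bit to its right, subtract at the start of a run of 1s, add at the end of a run, then shift the product register right with sign extension. The name `booth_multiply` and the register layout (upper n bits, multiplier, one extra bit) are illustrative assumptions. The sketch assumes the multiplicand is not the most negative n-bit value, since its negation does not fit in n bits.

```python
def booth_multiply(m: int, r: int, n: int = 4) -> int:
    """Signed multiply of n-bit multiplicand m and multiplier r using Booth recoding."""
    mask = (1 << n) - 1
    total_bits = 2 * n + 1
    p = (r & mask) << 1                      # [upper n bits = 0][multiplier][extra bit = 0]
    for _ in range(n):
        pair = p & 0b11                      # current bit and the bit to its right
        upper = p >> (n + 1)
        if pair == 0b10:                     # begins a run of 1s: subtract
            upper = (upper - m) & mask
        elif pair == 0b01:                   # ends a run of 1s: add
            upper = (upper + m) & mask
        p = (upper << (n + 1)) | (p & ((1 << (n + 1)) - 1))
        sign = p >> (total_bits - 1)
        p = (p >> 1) | (sign << (total_bits - 1))   # arithmetic shift right
    product = (p >> 1) & ((1 << (2 * n)) - 1)       # drop the extra bit
    if product >> (2 * n - 1):                      # reinterpret as a signed 2n-bit value
        product -= 1 << (2 * n)
    return product


print(booth_multiply(2, 7))    # 14
print(booth_multiply(2, -3))   # -6
```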
*
Booths Example (2 x 7)
Operation          Multiplicand   Product        next?
0. initial value   0010           0000 0111 0    10 -> sub
1a. P = P - m      1110         + 1110
                                  1110 0111 0    shift P (sign ext)
1b.                0010           1111 0011 1    11 -> nop, shift
2.                 0010           1111 1001 1    11 -> nop, shift
3.                 0010           1111 1100 1    01 -> add
4a.                0010         + 0010
                                  0001 1100 1    shift
4b.                0010           0000 1110 0    done
*
Booths Example (2 x -3)
Operation          Multiplicand   Product        next?
0. initial value   0010           0000 1101 0    10 -> sub
1a. P = P - m      1110         + 1110
                                  1110 1101 0    shift P (sign ext)
1b.                0010           1111 0110 1    01 -> add
                                + 0010
2a.                               0001 0110 1    shift P
2b.                0010           0000 1011 0    10 -> sub
                                + 1110
3a.                0010           1110 1011 0    shift
3b.                0010           1111 0101 1    11 -> nop
4a.                               1111 0101 1    shift
4b.                0010           1111 1010 1    done
*
10 Remainder (or Modulo result)
See how big a number can be subtracted, creating quotient bit on each step
Binary => 1 * divisor or 0 * divisor
Dividend = Quotient x Divisor + Remainder
=> | Dividend | = | Quotient | + | Divisor |   (sizes in bits)
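The bit-by-bit procedure can be sketched for unsigned operands. The function name `restoring_divide` is hypothetical, and the operands 1001010two / 1000two are an assumed example, chosen because they leave the remainder 10two.

```python
def restoring_divide(dividend: int, divisor: int, n: int):
    """Unsigned division of an n-bit dividend; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    quotient = 0
    remainder = 0
    for i in range(n - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:      # 1 * divisor can be subtracted
            remainder -= divisor
            quotient |= 1             # quotient bit is 1
        # otherwise 0 * divisor is subtracted and the quotient bit stays 0
    return quotient, remainder


q, r = restoring_divide(0b1001010, 0b1000, 7)
print(f"quotient = {q:b}, remainder = {r:b}")
```

The result satisfies Dividend = Quotient x Divisor + Remainder.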
Unsigned integers:
0 to $2^N - 1$
Signed integers (two's complement):
$-2^{N-1}$ to $2^{N-1} - 1$
0.00000001ten ($1.0_{ten} \times 10^{-8}$)
*
(exactly one digit to left of decimal point)
Alternatives to representing 1/1,000,000,000
Normalized: $1.0 \times 10^{-9}$
$6.02 \times 10^{23}$
Computer arithmetic that supports it is called floating point, because it represents numbers where the binary point is not fixed, as it is for integers
Declare such a variable in C as float
$1.0_{two} \times 2^{-1}$
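S represents Sign
Represents numbers as small as $2.0 \times 10^{-38}$ to as large as $2.0 \times 10^{38}$
Bit layout: bit 31 = S (1 bit) | bits 30-23 = Exponent (8 bits) | bits 22-0 = Significand (23 bits)
What if result too large? (> $2.0 \times 10^{38}$)
Overflow!
What if result too small? (> 0, < $2.0 \times 10^{-38}$)
Underflow!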
Underflow => Negative exponent larger than represented in 8-bit Exponent field
How to reduce chances of overflow or underflow?
*
Next Multiple of Word Size (64 bits)
Double Precision (vs. Single Precision)
C variable declared as double
Represents numbers almost as small as
$2.0 \times 10^{-308}$ to almost as large as $2.0 \times 10^{308}$
But primary advantage is greater accuracy
due to larger significand
Single Precision, DP similar
Significand:
To pack more bits, leading 1 implicit for normalized numbers
1 + 23 bits single, 1 + 52 bits double
always true: 0 <= Significand < 1 (for normalized numbers)
*
IEEE 754 Floating Point Standard (2/4)
Kahan wanted FP numbers to be used even if no FP hardware; e.g., sort records with FP numbers using integer compares
Could break FP number into 3 parts: compare signs, then compare exponents, then compare significands
Wanted it to be faster, single compare if possible, especially if positive numbers
Then want order:
Sign bit first (negative < positive)
Exponent next, so big exponent => bigger #
Significand last: exponents same => bigger #
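The effect of this field order can be observed with Python's standard `struct` module: for positive numbers, sorting by the 32-bit integer bit pattern gives the same order as sorting by value. The helper name `float_bits` is an assumption made for this sketch.

```python
import struct


def float_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


values = [2.0, 0.5, 1.5, 1024.0, 0.75]
by_value = sorted(values)
by_bits = sorted(values, key=float_bits)   # integer compare on the raw bits
print(by_value == by_bits)
print([f"{float_bits(v):08x}" for v in by_value])
```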
*
Negative Exponent?
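2’s comp? $1.0 \times 2^{-1}$ v. $1.0 \times 2^{+1}$ (1/2 v. 2)
  1/2:  0 | 1111 1111 | 000 0000 0000 0000 0000 0000
  2:    0 | 0000 0001 | 000 0000 0000 0000 0000 0000
This notation using integer compare of
1/2 v. 2 makes 1/2 > 2!
Instead, pick notation where 0000 0001 is most negative, and 1111 1111 is most positive
$1.0 \times 2^{-1}$ v. $1.0 \times 2^{+1}$ (1/2 v. 2)
  1/2:  0 | 0111 1110 | 000 0000 0000 0000 0000 0000
  2:    0 | 1000 0000 | 000 0000 0000 0000 0000 0000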
Called Biased Notation, where the bias is the number subtracted to get the real number
IEEE 754 uses a bias of 127 for single precision
Subtract 127 from the Exponent field to get the actual value of the exponent
1023 is the bias for double precision
Summary (single precision):
Bit layout: bit 31 = S | bits 30-23 = Exponent | bits 22-0 = Significand
$(-1)^S \times (1 + Significand) \times 2^{(Exponent - 127)}$
Double precision identical, except with exponent bias of 1023
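The single-precision formula can be applied to real bit patterns using the standard `struct` module. This is a sketch for normalized numbers only (Exponent field between 1 and 254); `decode_single` is a hypothetical helper name.

```python
import struct


def decode_single(x: float):
    """Split x into S, Exponent and fraction fields and rebuild its value."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31                      # bit 31
    exponent = (bits >> 23) & 0xFF      # bits 30-23, biased by 127
    fraction = bits & 0x7FFFFF          # bits 22-0
    significand = fraction / 2 ** 23    # value of the 23 stored bits
    value = (-1) ** s * (1 + significand) * 2.0 ** (exponent - 127)
    return s, exponent, fraction, value


print(decode_single(0.5))    # Exponent field 126 means 2^-1
print(decode_single(-6.5))   # -1.625 x 2^2
```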
Exponent   Significand   Object
0          0             0
0          nonzero       denorm
1-254      anything      +/- floating-point number
255        0             +/- infinity
255        nonzero       NaN
Infinity and NaNs
+/- infinity:
result of operation overflows, i.e., is larger than the largest number that
can be represented
overflow is not the same as divide by zero (raises a different exception)
It may make sense to do further computations with infinity
e.g., X/0 > Y may be a valid comparison
NaN:
Not a number, but not infinity (e.g., sqrt(-4))
invalid operation exception (unless operation is = or ≠)
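Infinity and NaN can be observed with Python floats, which are IEEE 754 doubles on common platforms. Note that Python itself raises exceptions for `x / 0` and `math.sqrt(-4)` instead of returning these values, so this sketch produces them through overflow and through infinity minus infinity.

```python
import math

overflowed = 1e308 * 10                   # larger than the largest double: +infinity
print(overflowed)                         # inf
print(overflowed > 1e308)                 # comparisons with infinity still make sense

not_a_number = overflowed - overflowed    # infinity - infinity has no meaningful value
print(math.isnan(not_a_number))
print(not_a_number == not_a_number)       # NaN is unequal even to itself
```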
*
Can’t just add significands
How do we do it?
De-normalize to match exponents
Add significands to get the resulting one
Keep the same exponent
Normalize (possibly changing exponent)
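The steps can be sketched with a toy format: a number is a pair (sig, exp) meaning sig x 2^exp, where a normalized sig is a p-bit integer with its top bit set. The format, the name `fp_add`, and the restriction to positive numbers are all assumptions for illustration; bits shifted out are simply dropped, with no guard or round bits.

```python
def fp_add(a, b, p: int = 4):
    """Add two positive toy floating-point numbers given as (sig, exp) pairs."""
    (sa, ea), (sb, eb) = a, b
    # 1. De-normalize: shift the operand with the smaller exponent to the right.
    if ea < eb:
        (sa, ea), (sb, eb) = (sb, eb), (sa, ea)
    sb >>= ea - eb
    # 2. Add the significands and keep the (larger) exponent.
    sig, exp = sa + sb, ea
    # 3. Normalize, possibly changing the exponent.
    while sig >= (1 << p):
        sig >>= 1
        exp += 1
    while sig and sig < (1 << (p - 1)):
        sig <<= 1
        exp -= 1
    return sig, exp


# 1.000two x 2^1 (= 2.0) plus 1.100two x 2^0 (= 1.5), with p = 4
print(fp_add((0b1000, -2), (0b1100, -3)))   # (0b1110, -2), i.e. 1.110two x 2^1 = 3.5
```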
*
De-normalize to match exponents
Extra Bits for rounding
Guard Digits: digits to the right of the first p digits of significand to guard against loss of digits – can later be shifted left into first P places during normalization.
Addition: carry-out shifted in
Multiplication: carry and guard, Division requires guard
"Floating Point numbers are like piles of sand; every time you move one you lose a little sand, but you pick up a little dirt."
How many extra bits?
Addition:
*
Rounding Digits
normalized result, but some non-zero digits to the right of the
significand --> the number should be rounded
E.g., B = 10, p = 3 (fields: sign, exponent, significand):
    0 2 1.69   = 1.6900 x 10^2
  - 0 0 7.85   = 0.0785 x 10^2
    0 2 1.61   = 1.6115 x 10^2
one round digit must be carried to the right of the guard digit so that
after a normalizing left shift, the result can be rounded, according
to the value of the round digit
IEEE Standard:
round towards plus infinity
round towards minus infinity
round to nearest:
  round digit < B/2 then truncate
  round digit > B/2 then round up (add 1 to ULP: unit in last place)
  round digit = B/2 then round to nearest even digit
it can be shown that this strategy minimizes the mean error
introduced by rounding
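These rounding rules can be tried in base 10 with Python's standard `decimal` module, whose `ROUND_HALF_EVEN`, `ROUND_CEILING` and `ROUND_FLOOR` modes correspond to round to nearest even, round towards plus infinity and round towards minus infinity. The helper `round_to_3_digits` is a hypothetical name, and the inputs are assumed to have one digit before the decimal point (p = 3).

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR, ROUND_HALF_EVEN


def round_to_3_digits(x: str, mode: str) -> Decimal:
    """Round a significand of the form d.ddd... to p = 3 digits."""
    return Decimal(x).quantize(Decimal("0.01"), rounding=mode)


print(round_to_3_digits("1.6115", ROUND_HALF_EVEN))  # round digit 1 < 5: truncate
print(round_to_3_digits("1.615", ROUND_HALF_EVEN))   # exactly half: go to the even digit
print(round_to_3_digits("1.625", ROUND_HALF_EVEN))   # exactly half: already even
print(round_to_3_digits("1.6115", ROUND_CEILING))    # towards plus infinity
print(round_to_3_digits("1.6115", ROUND_FLOOR))      # towards minus infinity
```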
Sticky Bit
Additional bit to the right of the round digit to better fine tune rounding
    d0 . d1 d2 d3 . . . dp-1   0 0 0
  +  0 .  0  0  X . . .  X     X X S
                               X X S
Sticky bit: set to 1 if any 1 bits fall off
the end of the round digit
    d0 . d1 d2 d3 . . . dp-1   0 0 0
  -  0 .  0  0  X . . .  X     X X 0
                               X X 0
Normal operations in +,-,*,/ require one carry/borrow bit + one guard digit
One round digit needed for correct rounding
Sticky bit needed when round digit is B/2 for max accuracy
Rounding to nearest has mean error = 0 if uniform distribution of digits
are assumed
[Figure: number line with ticks at 0, $2^{-bias}$, $2^{1-bias}$, $2^{2-bias}$; the gap between 0 and $2^{-bias}$ is filled by denormalized (denorm) numbers]
The gap between 0 and the next representable number is much larger
than the gaps between nearby representable numbers.
IEEE standard uses denormalized numbers to fill in the gap, making the
distances between numbers near 0 more alike.
NOTE: PDP-11, VAX cannot represent subnormal numbers. These
machines underflow to zero instead.
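The filled-in gap can be inspected by building single-precision values directly from bit patterns with the standard `struct` module. The helper name `bits_to_single` is an assumption for this sketch.

```python
import struct


def bits_to_single(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision number."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


smallest_normal = bits_to_single(0x00800000)     # Exponent field 1, fraction 0
smallest_denormal = bits_to_single(0x00000001)   # Exponent field 0, fraction 1
next_denormal = bits_to_single(0x00000002)

print(smallest_normal)                      # 2^-126
print(smallest_denormal)                    # 2^-149
print(next_denormal - smallest_denormal)    # denormals are evenly spaced
```

Without denormalized numbers, the first value above zero would be the smallest normalized number, which is far larger than the spacing between the values just above it.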
Full adder truth table:
A  B  CarryIn | CarryOut  Sum
0  0  0       | 0         0
0  0  1       | 0         1
0  1  0       | 0         1
0  1  1       | 1         0
1  0  0       | 0         1
1  0  1       | 1         0
1  1  0       | 1         0
1  1  1       | 1         1
[Figure: a run of 1s in the multiplier, labeled: beginning of run, middle of run, end of run]