Introduction to Computer Organization and Architecture Lecture 10 By Juthawut Chantharamalee http://dusithost.dusit.ac.th/ ~juthawut_cha/home.htm

Upload: calvin-hunt

Post on 31-Dec-2015


Introduction to Computer Organization and Architecture

Lecture 10
By Juthawut Chantharamalee
http://dusithost.dusit.ac.th/~juthawut_cha/home.htm

Outline
  Adders
  Comparators
  Shifters
  Multipliers
  Dividers
  Floating Point Numbers

Binary Representations of Numbers
To find negative numbers:
  Sign and magnitude: msb = ‘1’
  1’s complement: complement each bit to change sign
  2’s complement: $2^n$ - positive number

b2 b1 b0   Unsigned   Sign and Magnitude   1’s Complement   2’s Complement
0  1  1    3          +3                   +3               +3
0  1  0    2          +2                   +2               +2
0  0  1    1          +1                   +1               +1
0  0  0    0          +0                   +0               +0
1  0  0    4          -0                   -3               -4
1  0  1    5          -1                   -2               -3
1  1  0    6          -2                   -1               -2
1  1  1    7          -3                   -0               -1
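The four interpretations in the table can be checked with a short Python sketch. The function name `decode` and the dictionary keys are illustrative choices, not part of the lecture. Python integers have no negative zero, so the "-0" entries come out as 0.

```python
def decode(bits: int, n: int = 3) -> dict:
    """Interpret an n-bit pattern under the four conventions of the table."""
    all_ones = (1 << n) - 1
    msb = (bits >> (n - 1)) & 1
    magnitude = bits & (all_ones >> 1)      # the bits below the sign bit
    return {
        "unsigned": bits,
        "sign_magnitude": -magnitude if msb else magnitude,
        "ones_complement": -(bits ^ all_ones) if msb else bits,
        "twos_complement": bits - (1 << n) if msb else bits,
    }


for pattern in (0b011, 0b100, 0b111):
    print(f"{pattern:03b}", decode(pattern))
```

For example, the pattern 100 decodes to 4, -0, -3 and -4 in the four columns, matching the table row.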


Single-Bit Addition

Half Adder
A B   Cout S
0 0   0    0
0 1   0    1
1 0   0    1
1 1   1    0

Full Adder
A B C   Cout S
0 0 0   0    0
0 0 1   0    1
0 1 0   0    1
0 1 1   1    0
1 0 0   0    1
1 0 1   1    0
1 1 0   1    0
1 1 1   1    1

[Figure: half adder (inputs A, B; outputs S, Cout) and full adder (inputs A, B, C; outputs S, Cout)]
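The two truth tables can be reproduced with a few gate-level expressions. The sketch below is one common construction (a full adder built from two half adders and an OR gate); the lecture does not say which gate-level design it has in mind, so treat the structure as an assumption.

```python
def half_adder(a: int, b: int) -> tuple:
    """Return (cout, s) for two input bits."""
    return a & b, a ^ b


def full_adder(a: int, b: int, c: int) -> tuple:
    """Return (cout, s) for three input bits, using two half adders."""
    c1, partial = half_adder(a, b)
    c2, s = half_adder(partial, c)
    return c1 | c2, s


# Print the full-adder truth table in the same column order as the slide.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            cout, s = full_adder(a, b, c)
            print(a, b, c, cout, s)
```

In every row, 2*Cout + S equals the number of inputs that are 1.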

Carry-Ripple Adder
Simplest design: cascade full adders
  Critical path goes from Cin to Cout
  Design full adder to have fast carry delay

[Figure: four cascaded full adders with inputs A4..A1, B4..B1 and Cin, internal carries C1, C2, C3, sum outputs S4..S1, and Cout]
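A cascade of full adders can be modeled bit by bit. In this minimal sketch the bit lists are least significant bit first, and the function name `ripple_carry_add` and its return shape are illustrative assumptions.

```python
def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first).

    Returns (sum_bits, carries), where carries[0] is Cin and carries[-1] is Cout.
    """
    carries = [cin]
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        c = carries[-1]                      # carry into this stage
        sum_bits.append(a ^ b ^ c)           # full-adder sum
        carries.append((a & b) | (a & c) | (b & c))  # full-adder carry out
    return sum_bits, carries


# 1111 + 0000 with Cin = 1: the carry ripples through every stage.
print(ripple_carry_add([1, 1, 1, 1], [0, 0, 0, 0], cin=1))
```

Each stage has to wait for the carry of the stage before it, which is why the critical path runs from Cin to Cout.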

Carry Propagate Adders
N-bit adder called CPA
  Each sum bit depends on all previous carries
  How do we compute all these carries quickly?

[Figure: CPA block with inputs A_N...1, B_N...1 and Cin, outputs S_N...1 and Cout]

Example with Cin = 1 (carries listed from Cout down to Cin):
  carries   11111
  A4...1     1111
  B4...1    +0000
  S4...1     0000

Example with Cin = 0:
  carries   00000
  A4...1     1111
  B4...1    +0000
  S4...1     1111

Propagate and Generate - Define
An n-bit adder is just a combinational circuit
  $s_i = a_i \oplus b_i \oplus c_i = a_i b_i' c_i' + a_i' b_i c_i' + a_i' b_i' c_i + a_i b_i c_i$
  $c_{i+1} = \mathrm{MAJ}(a_i, b_i, c_i) = a_i b_i + a_i c_i + b_i c_i$
Want to write $s_i$ in “sum of products” form
  $c_{i+1} = g_i + p_i c_i$, where $g_i = a_i b_i$, $p_i = a_i + b_i$
  If $g_i$ is true, then $c_{i+1}$ is true, thus a carry is generated
  If $p_i$ is true and $c_i$ is true, $c_i$ is propagated
Note that the $p_i$ equation can also be written as $p_i = a_i \oplus b_i$, since $g_i = 1$ when $a_i = b_i = 1$ (i.e. generate occurs, so propagate is a “don’t care”)
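The claim that either definition of the propagate signal (OR or XOR) gives the same carry can be checked exhaustively. This is a small verification sketch; the function names are illustrative.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Carry out written as MAJ(a, b, c)."""
    return (a & b) | (a & c) | (b & c)


def carry_from_pg(a: int, b: int, c: int, use_xor: bool) -> int:
    """Carry out written as g + p*c, with p = a XOR b or p = a OR b."""
    g = a & b
    p = (a ^ b) if use_xor else (a | b)
    return g | (p & c)


agree = all(
    carry_from_pg(a, b, c, use_xor) == majority(a, b, c)
    for a, b, c in product((0, 1), repeat=3)
    for use_xor in (False, True)
)
print(agree)
```

The two forms of p differ only when a = b = 1, and in that case g = 1 already forces the carry.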

Propagate and Generate - Blocks
In the general case, for any j with $i \le j$, $j+1 \le k$:
  $c_{k+1} = G_{ik} + P_{ik} c_i$
  $G_{ik} = G_{j+1,k} + P_{j+1,k} G_{ij}$
  $P_{ik} = P_{ij} P_{j+1,k}$
$G_{ik}$ equation in words: a carry is generated out of the block consisting of bits i through k inclusive if it is generated in the high-order part of the block (j+1, k), or if it is generated in the low-order part of the block (i, j) and then propagated through the high part

$G_{0:0} = C_{in}$, $P_{0:0} = 0$

Propagate and Generate - Lookahead
Recursively apply to eliminate carry terms
  $c_{i+1} = g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + \dots + p_i p_{i-1} \cdots p_1 g_0 + p_i p_{i-1} \cdots p_1 p_0 c_0$
This is a carry-lookahead adder
  Note large fan-in of OR gate and last AND gate
  Too big! Build p’s and g’s in steps
  $c_1 = g_0 + p_0 c_0$
  $c_2 = G_{01} + P_{01} c_0$
  where $G_{01} = g_1 + p_1 g_0$, $P_{01} = p_1 p_0$
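Building the block signals in steps can be sketched as follows. The helper `combine` merges a low-order block with the adjacent high-order block, and every carry is then written directly in terms of c0. The names and the LSB-first bit lists are assumptions for illustration. The loop computes the block signals one after another, whereas hardware would evaluate them in parallel.

```python
def combine(low, high):
    """Merge (G, P) of a low-order block with the adjacent high-order block."""
    g_lo, p_lo = low
    g_hi, p_hi = high
    return g_hi | (p_hi & g_lo), p_lo & p_hi


def lookahead_carries(a_bits, b_bits, c0=0):
    """Return [c0, c1, ..., cn], each carry computed as G_0i + P_0i * c0."""
    carries = [c0]
    block = None
    for a, b in zip(a_bits, b_bits):
        bit_pg = (a & b, a | b)                       # (g_i, p_i)
        block = bit_pg if block is None else combine(block, bit_pg)
        big_g, big_p = block                          # (G_0i, P_0i)
        carries.append(big_g | (big_p & c0))
    return carries


# 0111 + 0001 (LSB first): the carries are c0..c4.
print(lookahead_carries([1, 1, 1, 0], [1, 0, 0, 0]))
```

No carry in the returned list depends on another carry, only on the block signals and c0.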

PG Logic
[Figure: 4-bit adder drawn in three stages. 1: Bitwise PG logic forms P1..P4 and G1..G4 from A1..A4 and B1..B4, with G0, P0 taken from Cin. 2: Block PG logic forms G00, G01, G02, G03, which give the carries C0..C3. 3: Sum logic produces S1..S4. Cout = C4.]

Carry-Ripple Revisited
  $G_{03} = G_3 + P_3 G_{02}$
[Figure: the three-stage PG adder (inputs A1..A4, B1..B4, Cin; bitwise P and G signals; block signals G00, G01, G02, G03; carries C0..C3; outputs S1..S4 and Cout = C4), with each block generate formed from the previous one in a ripple chain.]

Carry-Skip Adder
Carry-ripple is slow through all N stages
Carry-skip allows carry to skip over groups of n bits
  Decision based on n-bit propagate signal

[Figure: 16-bit carry-skip adder made of four 4-bit adders (bits 4:1, 8:5, 12:9, 16:13). Each group produces a group propagate signal (P4:1, P8:5, P12:9, P16:13) that controls a 2:1 multiplexer selecting the carry passed on as C4, C8, C12 and Cout.]

Carry-Lookahead Adder
Carry-lookahead adder computes G0i for many bits in parallel.
Uses higher-valency cells with more than two inputs.

[Figure: 16-bit carry-lookahead adder made of four 4-bit adders (bits 4:1, 8:5, 12:9, 16:13). Each group produces group generate and propagate signals (G4:1 P4:1, G8:5 P8:5, G12:9 P12:9, G16:13 P16:13), from which C4, C8, C12 and Cout are formed.]

Carry-Select Adder
Trick for critical paths dependent on late input X
  Precompute two possible outputs for X = 0, 1
  Select proper output when X arrives
Carry-select adder precomputes n-bit sums
  For both possible carries into n-bit group

[Figure: 16-bit carry-select adder. Bits 4:1 use a single adder fed by Cin. Each of the groups 8:5, 12:9 and 16:13 has two adders, one with carry-in 0 and one with carry-in 1, and multiplexers that pick the sum and carry once C4, C8 or C12 arrives. Outputs S4:1, S8:5, S12:9, S16:13 and Cout.]
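The precompute-then-select idea can be modeled with ordinary integers. In this sketch the group width n = 4 and total width 16 follow the figure, while the function names are illustrative. For simplicity every group, including the first, computes both candidates; the figure uses a single adder for the lowest group.

```python
def add_group(a: int, b: int, cin: int, n: int):
    """n-bit addition of one group: returns (n-bit sum, carry out)."""
    total = a + b + cin
    return total & ((1 << n) - 1), total >> n


def carry_select_add(a: int, b: int, cin: int = 0, width: int = 16, n: int = 4):
    """Add two width-bit numbers group by group; returns (sum, cout)."""
    mask = (1 << n) - 1
    result, carry = 0, cin
    for shift in range(0, width, n):
        ga, gb = (a >> shift) & mask, (b >> shift) & mask
        # Both candidates depend only on the operand bits, not on the carry.
        s0, c0 = add_group(ga, gb, 0, n)
        s1, c1 = add_group(ga, gb, 1, n)
        # The late-arriving carry only steers the multiplexer.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << shift
    return result, carry


print(carry_select_add(0x0FFF, 0x0001))
```

The cost is roughly twice the adder hardware per group in exchange for taking the group addition off the carry path.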

Comparators
  0’s detector: A = 00…000
  1’s detector: A = 11…111
  Equality comparator: A = B
  Magnitude comparator: A < B

1’s & 0’s Detectors
  1’s detector: N-input AND gate
  0’s detector: NOTs + 1’s detector (N-input NOR)

[Figure: an 8-input allones detector on A0..A7, drawn both as a tree of gates and as a chain, and a 4-input allzeros detector on A0..A3]

Equality Comparator
  Check if each bit is equal (XNOR, aka equality gate)
  1’s detect on bitwise equality

[Figure: four XNOR gates on A[0]B[0] through A[3]B[3] feeding a 1’s detector whose output is A = B]

Magnitude Comparator
  Compute B-A and look at sign
  B-A = B + ~A + 1
  For unsigned numbers, carry out is sign bit

Signed vs. Unsigned
For signed numbers, comparison is harder
  C: carry out
  Z: zero (all bits of A-B are 0)
  N: negative (MSB of result)
  V: overflow (inputs had different signs, output sign ≠ sign of B)
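The flags can be computed from B + ~A + 1 on n-bit patterns. In the sketch below the derived relations (unsigned A <= B when C = 1, signed A > B when N differs from V) are the usual flag conventions and are stated here as assumptions rather than taken from the slide; the function names are illustrative.

```python
def compare_flags(a: int, b: int, n: int = 4):
    """Compute B - A as B + ~A + 1 on n-bit patterns; return (C, Z, N, V)."""
    mask = (1 << n) - 1
    total = (b & mask) + (~a & mask) + 1
    result = total & mask
    c = total >> n                          # carry out
    z = int(result == 0)                    # all result bits are 0
    neg = result >> (n - 1)                 # MSB of the result
    sign_a = (a >> (n - 1)) & 1
    sign_b = (b >> (n - 1)) & 1
    v = int(sign_a != sign_b and neg != sign_b)
    return c, z, neg, v


def to_signed(x: int, n: int = 4) -> int:
    """Read an n-bit pattern as a 2's complement number."""
    return x - (1 << n) if x >> (n - 1) else x


# A = 0111 (+7), B = 1000 (8 unsigned, -8 signed)
print(compare_flags(0b0111, 0b1000))
```

For this pair the unsigned comparison says A <= B (C = 1), while the signed comparison says A > B (N differs from V).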


Shifters
Logical Shift: shifts number left or right and fills with 0’s
  1011 LSR 1 = 0101
  1011 LSL 1 = 0110
Arithmetic Shift: shifts number left or right; right shift sign extends
  1011 ASR 1 = 1101
  1011 ASL 1 = 0110
Rotate: shifts number left or right and fills with lost bits
  1011 ROR 1 = 1101
  1011 ROL 1 = 0111
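The six shifts can be written directly on 4-bit patterns. This is a minimal sketch for a fixed width N = 4 and shift amounts 0 <= k < N; the function names mirror the mnemonics on the slide.

```python
N = 4
MASK = (1 << N) - 1


def lsr(x: int, k: int) -> int:
    return (x & MASK) >> k


def lsl(x: int, k: int) -> int:
    return (x << k) & MASK


def asr(x: int, k: int) -> int:
    sign = (x >> (N - 1)) & 1
    fill = ((MASK << (N - k)) & MASK) if sign else 0   # k copies of the sign bit
    return ((x & MASK) >> k) | fill


def asl(x: int, k: int) -> int:
    return lsl(x, k)        # same bit pattern as a logical left shift


def ror(x: int, k: int) -> int:
    return ((x >> k) | (x << (N - k))) & MASK


def rol(x: int, k: int) -> int:
    return ((x << k) | (x >> (N - k))) & MASK


for op in (lsr, lsl, asr, asl, ror, rol):
    print(op.__name__.upper(), format(op(0b1011, 1), "04b"))
```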

Funnel Shifter
A funnel shifter can do all six types of shifts
Selects N-bit field Y from 2N-bit input
  Shift by k bits ($0 \le k < N$)

[Figure: 2N-bit input formed from B and C, with bit positions 2N-1, N-1 and 0 marked; Y is the N-bit field from bit offset to bit offset + N-1]

Funnel Shifter Operation

Computing N-k requires an adder
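A minimal Python sketch of funnel-shifter operation follows. It assumes the 2N-bit input is the concatenation B:C with B in the high half, and it assumes the (B, C, offset) selections listed in `shift`, which are the conventional choices rather than something stated in the text here. All names are illustrative.

```python
N = 4
MASK = (1 << N) - 1


def funnel(b: int, c: int, offset: int) -> int:
    """Select the N-bit field starting at bit `offset` of the 2N-bit word B:C."""
    z = ((b & MASK) << N) | (c & MASK)
    return (z >> offset) & MASK


def sign_fill(a: int) -> int:
    """N copies of the sign bit of a."""
    return MASK if (a >> (N - 1)) & 1 else 0


def shift(a: int, k: int, kind: str) -> int:
    """Shift or rotate a by k bits (0 <= k < N) using one funnel selection."""
    selections = {                      # assumed (B, C, offset) per operation
        "LSR": (0, a, k),
        "ASR": (sign_fill(a), a, k),
        "ROR": (a, a, k),
        "LSL": (a, 0, N - k),           # left shifts need N - k
        "ASL": (a, 0, N - k),
        "ROL": (a, a, N - k),
    }
    return funnel(*selections[kind])


for kind in ("LSR", "LSL", "ASR", "ASL", "ROR", "ROL"):
    print(kind, format(shift(0b1011, 1, kind), "04b"))
```

Under these assumptions right shifts use the offset k directly, while left shifts use N - k, which is where the subtraction comes from.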


Simplified Funnel Shifter Optimize down to 2N-1 bit input


Funnel Shifter Design 1
N N-input multiplexers
  Use 1-of-N hot select signals for shift amount

[Figure: 4-bit funnel shifter. Inputs Z0..Z6 feed four 4-input multiplexers with outputs Y0..Y3. The shift amount k[1:0] and the left signal pass through inverters and a decoder to produce the one-hot selects s0..s3.]

Funnel Shifter Design 2
Log N stages of 2-input muxes
  No select decoding needed

[Figure: 4-bit funnel shifter. Inputs Z0..Z6 pass through two stages of 2-input multiplexers controlled by k1 and k0 (with the left signal) to produce Y0..Y3.]


Multi-input Adders
Suppose we want to add k N-bit words
  Ex: 0001 + 0111 + 1101 + 0010 = 10111
Straightforward solution: k-1 N-input CPAs
  Large and slow

[Figure: chain of three CPAs: 0001 + 0111 feeds the second adder, adding 1101 gives 10101, and adding 0010 gives 10111]

Carry Save Addition
A full adder sums 3 inputs and produces 2 outputs
  Carry output has twice weight of sum output
N full adders in parallel are called carry save adder
  Produce N sums and N carry outs

[Figure: four independent full adders, each with inputs Xi, Yi, Zi and outputs Si, Ci (i = 1..4), and the equivalent n-bit CSA block with inputs X_N...1, Y_N...1, Z_N...1 and outputs S_N...1, C_N...1]


CSA Application
Use k-2 stages of CSAs
  Keep result in carry-save redundant form
Final CPA computes actual result

Example: 0001 + 0111 + 1101 + 0010

4-bit CSA
  X     0001
  Y     0111
  Z    +1101
  S     1011
  C    0101_

5-bit CSA
  X    0101_
  Y     1011
  Z    +0010
  S    00011
  C   01010_

CPA
  A   01010_
  B   +00011
  S    10111
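The worked example can be replayed with a word-level model of the carry save adder. The function names and the use of Python integers for whole words are illustrative assumptions; the left shift of the carry word models the "twice the weight" of the carry outputs.

```python
def carry_save(x: int, y: int, z: int):
    """One CSA stage: returns (S, C) with C already shifted to its weight."""
    s = x ^ y ^ z                                   # bitwise sum, no carry ripple
    c = ((x & y) | (x & z) | (y & z)) << 1          # bitwise majority, weight 2
    return s, c


def add_words(words):
    """Add k words with k-2 CSA stages followed by one carry-propagate add."""
    words = list(words)
    while len(words) > 2:
        s, c = carry_save(words[0], words[1], words[2])
        words = [c, s] + words[3:]
    return sum(words)                               # the final CPA


s1, c1 = carry_save(0b0001, 0b0111, 0b1101)
s2, c2 = carry_save(c1, s1, 0b0010)
print(format(s1, "04b"), format(c1, "05b"), format(s2, "05b"), format(c2, "06b"))
print(format(add_words([0b0001, 0b0111, 0b1101, 0b0010]), "05b"))
```

Only the last addition propagates carries; every CSA stage has the delay of a single full adder regardless of word length.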


Multiplication
Example:

      1100 : 12 (decimal)   multiplicand
  x   0101 :  5 (decimal)   multiplier
      ----
      1100                  partial products
     0000
    1100
   0000
  --------
  00111100 : 60 (decimal)   product

M x N-bit multiplication
  Produce N M-bit partial products
  Sum these to produce M+N-bit product
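The partial products of the example can be generated with one shift per multiplier bit. This is a small sketch; the function names and argument order (multiplicand first) are illustrative.

```python
def partial_products(multiplicand: int, multiplier: int, n: int):
    """Row i is the multiplicand shifted left by i if multiplier bit i is 1, else 0."""
    return [
        (multiplicand << i) if (multiplier >> i) & 1 else 0
        for i in range(n)
    ]


def multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Sum the N partial products of an N-bit multiplier."""
    return sum(partial_products(multiplicand, multiplier, n))


rows = partial_products(0b1100, 0b0101, 4)
print([format(r, "08b") for r in rows])
print(format(multiply(0b1100, 0b0101, 4), "08b"))
```

Here the rows are already shifted to their final weight, so the product is their plain sum.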

General Form
Multiplicand: $Y = (y_{M-1}, y_{M-2}, \dots, y_1, y_0)$
Multiplier: $X = (x_{N-1}, x_{N-2}, \dots, x_1, x_0)$
Product:
  $P = \left( \sum_{j=0}^{M-1} y_j 2^j \right) \left( \sum_{i=0}^{N-1} x_i 2^i \right) = \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} x_i y_j 2^{i+j}$

                              y5   y4   y3   y2   y1   y0     multiplicand
                              x5   x4   x3   x2   x1   x0     multiplier
                              x0y5 x0y4 x0y3 x0y2 x0y1 x0y0   partial products
                         x1y5 x1y4 x1y3 x1y2 x1y1 x1y0
                    x2y5 x2y4 x2y3 x2y2 x2y1 x2y0
               x3y5 x3y4 x3y3 x3y2 x3y1 x3y0
          x4y5 x4y4 x4y3 x4y2 x4y1 x4y0
     x5y5 x5y4 x5y3 x5y2 x5y1 x5y0
p11  p10  p9   p8   p7   p6   p5   p4   p3   p2   p1   p0     product

Dot Diagram
Each dot represents a bit

[Figure: dot diagram of the partial products, one row per multiplier bit from x0 to x15]

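Array Multiplier
[Figure: 4 x 4 array multiplier with multiplicand bits y0..y3, multiplier bits x0..x3 and product bits p0..p7. The CSA array is built from cells with inputs A, B, Sin, Cin and outputs Sout, Cout, followed by a CPA row built from cells with inputs A, B, Cin and outputs Sout, Cout. The critical path is marked.]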

Rectangular Array
Squash array to fit rectangular floorplan

[Figure: the 4 x 4 array multiplier redrawn as a rectangle, with inputs y0..y3 and x0..x3, product bits p0..p3 leaving along one side and p4..p7 along the bottom]

Fewer Partial Products
Array multiplier requires N partial products
If we looked at groups of r bits, we could form N/r partial products
  Faster and smaller?
  Called radix-$2^r$ encoding
Ex: r = 2: look at pairs of bits
  Form partial products of 0, Y, 2Y, 3Y
  First three are easy, but 3Y requires adder

Booth Encoding
Instead of 3Y, try -Y, then increment next partial product to add 4Y
Similarly, for 2Y, try -2Y + 4Y in next partial product
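The recoding can be sketched as the usual radix-4 (modified) Booth scheme, in which each digit is computed from three neighbouring multiplier bits as x(i-1) + x(i) - 2*x(i+1). That formula, the zero padding below bit 0 and above the top bit for an unsigned multiplier, and the function names are assumptions here, since the text above only describes the idea.

```python
def booth_radix4_digits(x: int, n: int = 16):
    """Recode an unsigned n-bit multiplier into digits in {-2, -1, 0, 1, 2}.

    Digits are returned least significant first; digit j has weight 4**j.
    """
    padded = x << 1                      # append x(-1) = 0 below bit 0
    digits = []
    for i in range(0, n + 1, 2):
        triple = (padded >> i) & 0b111   # bits x(i+1), x(i), x(i-1)
        digits.append((triple & 1) + ((triple >> 1) & 1) - 2 * (triple >> 2))
    return digits


def booth_multiply(y: int, x: int, n: int = 16) -> int:
    """Sum the Booth partial products d * Y, each weighted by 4**j."""
    return sum(d * (y << (2 * j)) for j, d in enumerate(booth_radix4_digits(x, n)))


print(booth_radix4_digits(3, n=4))
print(booth_multiply(12, 5, n=4))
```

For the multiplier 3 the low digit is -1 and the next digit is +1, which is the "-Y now, +4Y in the next partial product" substitution. A 16-bit unsigned multiplier yields nine digits under this padding.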

Booth Hardware
Booth encoder generates control lines for each PP
  Booth selectors choose PP bits

Sign Extension
Partial products can be negative
  Require sign extension, which is cumbersome
  High fanout on most significant bit

[Figure: dot diagram of the partial products PP0..PP8 for multiplier bits x0..x15, padded with x-1 = 0 and x16 = x17 = 0. Each partial product is extended to the left with copies of its sign bit s.]

Simplified Sign Ext.
Sign bits are either all 0’s or all 1’s
  Note that all 0’s is all 1’s + 1 in proper column
  Use this to reduce loading on MSB

[Figure: dot diagram of PP0..PP8 in which each run of sign-extension bits is replaced by a run of 1’s, with s bits added in the proper columns]

Even Simpler Sign Ext.
No need to add all the 1’s in hardware
  Precompute the answer!

[Figure: dot diagram of PP0..PP8 in which only a few s and 1 bits remain to the left of each partial product]

Division - Restoring
Do the following n times:
  Shift A and Q left one bit
  Subtract M from A, put answer in A
  If the sign of A is 1, set q0 to 0 and add M back to A
  If the sign of A is 0, set q0 to 1

[Figure: circuit arrangement for division. Register A (bits a_n ... a_0) and the dividend register Q (bits q_{n-1} ... q_0) shift left together. The divisor M (bits m_{n-1} ... m_0, extended with a leading 0) feeds an (n+1)-bit adder. A control sequencer drives Add/Subtract and quotient setting.]
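The restoring steps translate almost line by line into code. In this sketch Python's signed integers stand in for the (n+1)-bit register A, so "the sign of A is 1" becomes `a < 0`; the function name and argument order are illustrative.

```python
def restoring_divide(dividend: int, divisor: int, n: int):
    """Divide an n-bit dividend by a nonzero divisor; returns (quotient, remainder)."""
    a, q, m = 0, dividend, divisor
    mask = (1 << n) - 1
    for _ in range(n):
        # Shift A and Q left one bit as a single register pair.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        a -= m                  # subtract M from A
        if a < 0:               # sign of A is 1: q0 stays 0, restore A
            a += m
        else:                   # sign of A is 0: set q0 to 1
            q |= 1
    return q, a


print(restoring_divide(0b1000, 0b0011, 4))
```

After n iterations Q holds the quotient and A the remainder.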

Division - Restoring Example
Divide 1000 by 11 (M = 00011)

                       A       Q
Initially              00000   1000
First cycle
  Shift                00001   000_
  Subtract             11110
  Set q0, restore      00001   0000
Second cycle
  Shift                00010   000_
  Subtract             11111
  Set q0, restore      00010   0000
Third cycle
  Shift                00100   000_
  Subtract             00001
  Set q0               00001   0001
Fourth cycle
  Shift                00010   001_
  Subtract             11111
  Set q0, restore      00010   0010

Remainder = 00010 (in A), Quotient = 0010 (in Q)

Division - Nonrestoring
Do the following n times:
  If the sign of A is 0, shift A and Q left and subtract M from A
  Else shift A and Q left and add M to A
  Now if the sign of A is 0, set q0 to 1
  Else set q0 to 0
After the n cycles:
  If the sign of A is 1, add M to A

[Figure: circuit arrangement for division. Register A (bits a_n ... a_0) and the dividend register Q (bits q_{n-1} ... q_0) shift left together. The divisor M (bits m_{n-1} ... m_0, extended with a leading 0) feeds an (n+1)-bit adder. A control sequencer drives Add/Subtract and quotient setting.]
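The nonrestoring steps can be sketched the same way. Python's signed integers stand in for the (n+1)-bit register A, so the sign test becomes a comparison with zero; the function name and argument order are illustrative.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int):
    """Divide an n-bit dividend by a nonzero divisor; returns (quotient, remainder)."""
    a, q, m = 0, dividend, divisor
    mask = (1 << n) - 1
    for _ in range(n):
        was_negative = a < 0
        # Shift A and Q left one bit as a single register pair.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        # Add or subtract depending on the sign of A before the shift.
        a = a + m if was_negative else a - m
        if a >= 0:              # sign of A is 0: set q0 to 1
            q |= 1
    if a < 0:                   # final correction of the remainder
        a += m
    return q, a


print(nonrestoring_divide(0b1000, 0b0011, 4))
```

Each cycle performs exactly one addition or subtraction, and a single corrective addition is needed at the end only if A finishes negative.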

Division - Nonrestoring Example
Divide 1000 by 11 (M = 00011)

                       A       Q
Initially              00000   1000
First cycle
  Shift                00001   000_
  Subtract             11110
  Set q0               11110   0000
Second cycle
  Shift                11100   000_
  Add                  11111
  Set q0               11111   0000
Third cycle
  Shift                11110   000_
  Add                  00001
  Set q0               00001   0001
Fourth cycle
  Shift                00010   001_
  Subtract             11111
  Set q0               11111   0010
Restore remainder
  Add                  00010

Remainder = 00010 (in A), Quotient = 0010 (in Q)

Floating Point - Single Precision
IEEE-754, 854
Decimal point can “move” - hence it’s “floating”
Floating point is useful for scientific calculations
Can represent
  Very large integers and
  Very small fractions
  ~$10^{\pm 38}$

(a) Single precision (32 bits)
  S    Sign of number: 0 signifies +, 1 signifies -
  E    8-bit signed exponent in excess-127 representation
  M    23-bit mantissa fraction
  Value represented = $\pm 1.M \times 2^{E-127}$

(b) Example of a single-precision number
  S = 0, E = 00101000, M = 001010...0
  Value represented = $+1.001010 \ldots 0 \times 2^{-87}$
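The field layout can be checked against a machine's own single-precision decoding. The sketch below handles only normalized numbers (stored exponent between 1 and 254); the function name is illustrative, and the comparison uses Python's standard `struct` module.

```python
import struct


def decode_single(bits: int):
    """Split a 32-bit pattern into (S, E, M, value) for a normalized number."""
    s = bits >> 31
    e = (bits >> 23) & 0xFF              # excess-127 exponent field
    m = bits & 0x7FFFFF                  # 23-bit mantissa fraction
    value = (-1) ** s * (1 + m / 2 ** 23) * 2.0 ** (e - 127)
    return s, e, m, value


# The example pattern: S = 0, exponent field 00101000, fraction 001010 then zeros.
bits = (0 << 31) | (0b00101000 << 23) | (0b001010 << 17)
s, e, m, value = decode_single(bits)
print(s, e - 127, value)

# Cross-check against the hardware interpretation of the same 32 bits.
reference = struct.unpack(">f", struct.pack(">I", bits))[0]
print(value == reference)
```

The exponent field 00101000 is 40, and 40 - 127 = -87, which is the power of two in the example.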

Floating Point - Double Precision
Double precision can represent ~$10^{\pm 308}$

Double precision (64 bits)
  S    Sign
  E    11-bit excess-1023 exponent
  M    52-bit mantissa fraction
  Value represented = $\pm 1.M \times 2^{E-1023}$

Floating Point
The IEEE Standard requires these operations, at a minimum
  Add
  Subtract
  Multiply
  Divide
  Remainder
  Square Root
  Decimal/Binary Conversion
Special Values
  E’     M      Value
  0      0      +/- 0
  255    0      +/- ∞
  0      ≠ 0    ±0.M × 2^-126
  255    ≠ 0    Not a Number (NaN)
Exceptions
  Underflow, Overflow, divide by 0, inexact, invalid

FP Arithmetic Operations
Add/Subtract
  Shift mantissa of smaller exponent number right by the difference in exponents
  Set the exponent of the result = the larger exponent
  Add/Sub mantissas, get sign
  Normalize
Multiply/Divide
  Add/Sub exponents, Subtract/Add 127
  Multiply/Divide mantissas, determine sign
  Normalize

FP Guard Bits and Truncation
Guard bits
  Extra bits during intermediate steps to yield maximum accuracy in the final result
  They need to be removed when generating the final result
Chopping
  Simply remove guard bits
Von Neumann rounding
  If all guard bits 0, chop, else 1
Rounding
  Add 1 to LSB if guard MSB = 1
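The three truncation methods can be compared on a mantissa held as an integer with g guard bits at the bottom. This is a minimal sketch; the function names are illustrative, "else 1" is read as forcing the least significant retained bit to 1, and the carry that rounding can produce out of the top bit (which would require renormalization) is not handled.

```python
def chop(x: int, g: int) -> int:
    """Chopping: simply drop the g guard bits."""
    return x >> g


def von_neumann(x: int, g: int) -> int:
    """Von Neumann rounding: chop if all guard bits are 0, else force the LSB to 1."""
    kept = x >> g
    guard = x & ((1 << g) - 1)
    return kept if guard == 0 else kept | 1


def round_nearest(x: int, g: int) -> int:
    """Rounding: add 1 to the LSB if the most significant guard bit is 1."""
    return (x >> g) + ((x >> (g - 1)) & 1)


x = 0b0110101                      # retained bits 0110, guard bits 101
for method in (chop, von_neumann, round_nearest):
    print(method.__name__, format(method(x, 3), "04b"))
```

With guard bits 101 chopping gives 0110, while the other two methods both give 0111.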

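FP Add-Subtract Unit
[Figure: floating-point addition-subtraction unit for 32-bit operands A: S_A, E_A, M_A and B: S_B, E_B, M_B, producing the 32-bit result R = A + B with fields S_R, E_R, M_R. An 8-bit subtractor forms n = E_A - E_B and its sign. SWAP routes the mantissa of the number with the smaller exponent to the SHIFTER, which shifts it n bits to the right, and the mantissa of the number with the larger exponent to the mantissa adder/subtractor. A combinational CONTROL network takes S_A, S_B and Add/Sub and determines Add/Subtract and the result sign. A MUX selects the larger of E_A and E_B as E. The magnitude M passes through a leading zeros detector that yields X, then Normalize and round; a second 8-bit subtractor forms E - X.]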

The End Lecture 10