cs 35101 computer architecture spring 2006 week 6/7
CS 35101 Computer Architecture
Spring 2006
Week 6/7
Paul Durand (www.cs.kent.edu/~durand)
Course url: www.cs.kent.edu/~durand/cs35101.htm
Heads Up
Week 6 & 7 material
- Digital Logic Design
- Processor organization / description
- MIPS arithmetic operations (PH 3.1, 3.2, 3.3)
Reminders
- Midterm #1 – Thursday, February 23rd
Next week's material
- MIPS arithmetic operations
- Reading assignment – PH 3.4 through 3.5
To make the architect’s crucial task even conceivable, it is necessary to separate the architecture, the definition of the product as perceivable by the user, from its implementation. Architecture versus implementation defines a clean boundary between parts of the design task, and there is plenty of work on each side of it.
The Mythical Man-Month, Brooks, pg. 256
Review: MIPS ISA

Category         Instr                       Op Code   Example              Meaning
Arithmetic       add                         0 and 32  add  $s1, $s2, $s3   $s1 = $s2 + $s3
(R & I format)   subtract                    0 and 34  sub  $s1, $s2, $s3   $s1 = $s2 - $s3
                 add immediate               8         addi $s1, $s2, 6     $s1 = $s2 + 6
                 or immediate                13        ori  $s1, $s2, 6     $s1 = $s2 | 6
Data Transfer    load word                   35        lw   $s1, 24($s2)    $s1 = Memory($s2+24)
(I format)       store word                  43        sw   $s1, 24($s2)    Memory($s2+24) = $s1
                 load byte                   32        lb   $s1, 25($s2)    $s1 = Memory($s2+25)
                 store byte                  40        sb   $s1, 25($s2)    Memory($s2+25) = $s1
                 load upper imm              15        lui  $s1, 6          $s1 = 6 * 2^16
Cond. Branch     br on equal                 4         beq  $s1, $s2, L     if ($s1==$s2) go to L
(I & R format)   br on not equal             5         bne  $s1, $s2, L     if ($s1 !=$s2) go to L
                 set on less than            0 and 42  slt  $s1, $s2, $s3   if ($s2<$s3) $s1=1 else $s1=0
                 set on less than immediate  10        slti $s1, $s2, 6     if ($s2<6) $s1=1 else $s1=0
Uncond. Jump     jump                        2         j    2500            go to 10000
(J & R format)   jump register               0 and 8   jr   $t1             go to $t1
                 jump and link               3         jal  2500            go to 10000; $ra=PC+4
Review: MIPS Organization, so far
[Figure: block diagram of the MIPS organization so far.
 Processor: a Register File of 32 registers ($zero - $ra), 32 bits each, with 5-bit src1 addr, src2 addr, and dst addr inputs, a 32-bit write data input, and 32-bit src1 data and src2 data outputs; a 32-bit ALU; a PC with a Fetch (PC = PC+4) / Decode / Exec cycle and 32-bit adders for PC + 4 and the br offset.
 Memory: 2^30 words of 32 bits, with read/write addr, read data, and write data ports; word addresses (binary) 0…0000, 0…0100, 0…1000, 0…1100, ..., 1…1100; byte addresses are big Endian.]
Processor Organization
Processor control needs to have the
- Ability to input instructions from memory
- Logic to control instruction sequencing and to issue signals that control the way information flows between the datapath components and the operations performed by them
Processor datapath needs to have the
- Ability to load data from and store data to memory
- Interconnected components - functional units (e.g., ALU) and storage units (e.g., Register File) - for executing the ISA
Need a way to describe the organization
- High level (block diagram) description
- Schematic (gate level) description
- Textual (simulation/synthesis level) description
Levels of Description of a Digital System
(moving down the list: Less Abstract, More Accurate, Slower Simulation)
- Architectural: models programmer's view at a high level; written in your favorite programming language
- Functional/Behavioral: more detailed model, like the block diagram view
- Register Transfer: model is in terms of datapath FUs, registers, busses; register xfer operations are clock phase accurate
- Logic: model is in terms of logic gates; delay information can be specified for gates; digital waveforms
- Circuit: model is in terms of circuits (electrical behavior); accurate analog waveforms
Special languages + simulation systems for describing the inherent parallel activity in hardware (VHDL and Verilog)
Schematic capture + logic simulation package like LogicWorks
Why Simulate First?
Physical breadboarding
- discrete components/lower scale integration precedes actual construction of the prototype
- verification of the initial design
- No longer possible as designs reach higher levels of integration!
Simulation before construction - aka functional verification
- high level constructs mean faster to design and test
- can play “what if” more easily
- limited performance (can’t usually simulate all possible input transitions) and accuracy (can’t usually model wiring delays accurately), however
Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design. Neither function alone nor simplicity alone defines a good design.
The Mythical Man-Month, Brooks, pg. 43
Arithmetic
Where we've been: Abstractions:
- Instruction Set Architecture (ISA)
- Assembly and machine language
What's up ahead: Implementing the architecture
[Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and zero and ovf outputs]
Number Representation
Bits are just bits (have no inherent meaning)
- conventions define the relationships between bits and numbers
Binary numbers (base 2) - integers
    0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 . . .
- in decimal from 0 to 2^n - 1 for n bits
Of course, it gets more complicated
- storage locations (e.g., register file words) are finite, so have to worry about overflow (i.e., when the number is too big to fit into 32 bits)
- have to be able to represent negative numbers, e.g., how do we specify -8 in
    addi $sp, $sp, -8    # $sp = $sp - 8
- in real systems have to provide for more than just integers, e.g., fractions and real numbers (and floating point)
Possible Representations

Sign Mag.      Two's Comp.    One's Comp.
               1000 = -8
1111 = -7      1001 = -7      1000 = -7
1110 = -6      1010 = -6      1001 = -6
1101 = -5      1011 = -5      1010 = -5
1100 = -4      1100 = -4      1011 = -4
1011 = -3      1101 = -3      1100 = -3
1010 = -2      1110 = -2      1101 = -2
1001 = -1      1111 = -1      1110 = -1
1000 = -0                     1111 = -0
0000 = +0      0000 = 0       0000 = +0
0001 = +1      0001 = +1      0001 = +1
0010 = +2      0010 = +2      0010 = +2
0011 = +3      0011 = +3      0011 = +3
0100 = +4      0100 = +4      0100 = +4
0101 = +5      0101 = +5      0101 = +5
0110 = +6      0110 = +6      0110 = +6
0111 = +7      0111 = +7      0111 = +7

Issues:
- balance
- number of zeros
- ease of operations
Which one is best? Why?
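One way to compare the three columns is to decode the same 4-bit pattern under each convention. The sketch below is only illustrative: the function names and the default width `n=4` are assumptions chosen to match the table, not anything defined in the course material.

```python
def sign_magnitude(bits: int, n: int = 4) -> int:
    """Top bit is the sign; the remaining n-1 bits are the magnitude."""
    sign = bits >> (n - 1)
    magnitude = bits & ((1 << (n - 1)) - 1)
    return -magnitude if sign else magnitude


def ones_complement(bits: int, n: int = 4) -> int:
    """A negative value is the bitwise inverse of its magnitude."""
    mask = (1 << n) - 1
    if bits >> (n - 1):
        return -(~bits & mask)
    return bits


def twos_complement(bits: int, n: int = 4) -> int:
    """A negative value is the pattern minus 2^n."""
    if bits >> (n - 1):
        return bits - (1 << n)
    return bits


if __name__ == "__main__":
    # Print every 4-bit pattern under all three conventions.
    for pattern in range(16):
        print(f"{pattern:04b}",
              sign_magnitude(pattern),
              twos_complement(pattern),
              ones_complement(pattern))
```

Python has no integer -0, so both "negative zero" patterns decode to 0. That is why sign magnitude and one's complement cover only 15 distinct values with 16 patterns, while two's complement covers 16.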
MIPS Representations
32-bit signed numbers (2’s complement):
0000 0000 0000 0000 0000 0000 0000 0000 (two) = 0 (ten)
0000 0000 0000 0000 0000 0000 0000 0001 (two) = + 1 (ten)
0000 0000 0000 0000 0000 0000 0000 0010 (two) = + 2 (ten)
...
0111 1111 1111 1111 1111 1111 1111 1110 (two) = + 2,147,483,646 (ten)
0111 1111 1111 1111 1111 1111 1111 1111 (two) = + 2,147,483,647 (ten)    maxint
1000 0000 0000 0000 0000 0000 0000 0000 (two) = – 2,147,483,648 (ten)    minint
1000 0000 0000 0000 0000 0000 0000 0001 (two) = – 2,147,483,647 (ten)
1000 0000 0000 0000 0000 0000 0000 0010 (two) = – 2,147,483,646 (ten)
...
1111 1111 1111 1111 1111 1111 1111 1101 (two) = – 3 (ten)
1111 1111 1111 1111 1111 1111 1111 1110 (two) = – 2 (ten)
1111 1111 1111 1111 1111 1111 1111 1111 (two) = – 1 (ten)
What if the bit string represented addresses?
- need operations that also deal with only positive (unsigned) integers
Review: Signed Binary Representation

2's comp   decimal
1000       -8        -2^3
1001       -7        -(2^3 - 1)
1010       -6
1011       -5
1100       -4
1101       -3
1110       -2
1111       -1
0000        0
0001        1
0010        2
0011        3
0100        4
0101        5
0110        6
0111        7        2^3 - 1

Negation example: complement all the bits of 0101 to get 1010, then add a 1 to get 1011
Two's Complement Operations
Negating a two's complement number: complement all the bits and add a 1
- remember: “negate” and “invert” are quite different!
Converting n-bit numbers into numbers with more than n bits:
- MIPS 16-bit immediate gets converted to 32 bits for arithmetic
- copy the most significant bit (the sign bit) into the other bits
    0010 -> 0000 0010
    1010 -> 1111 1010
- sign extension versus zero extend (lb vs. lbu)
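The two operations on this slide, negation and sign extension, can be tried out on small bit patterns. The following is a minimal sketch; the helper names `negate`, `sign_extend`, and `zero_extend` are hypothetical, and bit patterns are held in ordinary non-negative Python integers.

```python
def negate(value: int, n: int) -> int:
    """Two's complement negation: complement all n bits, then add 1."""
    mask = (1 << n) - 1
    return (~value + 1) & mask


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Copy the sign bit of a from_bits-wide pattern into the new upper bits."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value >> (from_bits - 1):                     # sign bit is set
        value |= ((1 << to_bits) - 1) & ~low_mask    # fill upper bits with 1s
    return value


def zero_extend(value: int, from_bits: int) -> int:
    """Upper bits are simply left as 0s."""
    return value & ((1 << from_bits) - 1)


if __name__ == "__main__":
    print(f"{negate(0b0101, 4):04b}")                # 1011
    print(f"{sign_extend(0b0010, 4, 8):08b}")        # 00000010
    print(f"{sign_extend(0b1010, 4, 8):08b}")        # 11111010
    # The -8 immediate of addi $sp, $sp, -8 as a 16-bit field, then widened.
    imm16 = negate(8, 16)
    print(f"{imm16:#06x} -> {sign_extend(imm16, 16, 32):#010x}")
```

Negating the most negative pattern (1000 for 4 bits) returns the same pattern, because +8 has no 4-bit two's complement encoding.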
Goal: Design an ALU for the MIPS ISA
- Must support the Arithmetic/Logic operations of the ISA
- Tradeoffs of cost and speed based on frequency of occurrence, hardware budget
MIPS Arithmetic and Logic Instructions
Signed arithmetic generates overflow, but no carry out

R-type:  op (bits 31-26) | Rs (25-21) | Rt (20-16) | Rd (15-11) | funct (5-0)
I-type:  op (bits 31-26) | Rs (25-21) | Rt (20-16) | Immed 16 (15-0)

Type    op      funct
ADDI    001000  xx
ADDIU   001001  xx
SLTI    001010  xx
SLTIU   001011  xx
ANDI    001100  xx
ORI     001101  xx
XORI    001110  xx
LUI     001111  xx

Type    op      funct
ADD     000000  100000
ADDU    000000  100001
SUB     000000  100010
SUBU    000000  100011
AND     000000  100100
OR      000000  100101
XOR     000000  100110
NOR     000000  100111

Type    op      funct
        000000  101000
        000000  101001
SLT     000000  101010
SLTU    000000  101011
        000000  101100
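To see how the fields and the op/funct tables combine into a 32-bit instruction word, here is a small hedged sketch. The function names are hypothetical. The register numbers ($s1 = 17, $s2 = 18, $s3 = 19) come from the standard MIPS register convention, not from this slide. Bits 10-6 of the R-type word, which the slide leaves unlabeled, are left at zero.

```python
def encode_r(rs: int, rt: int, rd: int, funct: int, op: int = 0) -> int:
    """Pack an R-type word: op | Rs | Rt | Rd | (bits 10-6 left 0) | funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | funct


def encode_i(op: int, rs: int, rt: int, immed: int) -> int:
    """Pack an I-type word: op | Rs | Rt | 16-bit immediate."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immed & 0xFFFF)


S1, S2, S3 = 17, 18, 19  # assumed: standard MIPS register numbers

if __name__ == "__main__":
    # add $s1, $s2, $s3  (op 000000, funct 100000; Rd is the destination)
    print(f"{encode_r(rs=S2, rt=S3, rd=S1, funct=0b100000):#010x}")
    # addi $s1, $s2, 6   (op 001000; Rt is the destination)
    print(f"{encode_i(op=0b001000, rs=S2, rt=S1, immed=6):#010x}")
```

Masking the immediate with `0xFFFF` means a negative Python integer such as -8 is stored as its 16-bit two's complement pattern.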
Design Trick: Divide & Conquer
Break the problem into simpler problems, solve them and glue together the solution
Example: assume the immediates have been taken care of before the ALU
- now down to 10 operations
- can encode in 4 bits
    00 add
    01 addu
    02 sub
    03 subu
    04 and
    05 or
    06 xor
    07 nor
    12 slt
    13 sltu
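A behavioral model of these ten operations might look like the sketch below. It rests on one assumption about the slide: the codes are read as octal, so that 12 and 13 are the 4-bit values 1010 and 1011, matching the low four bits of the SLT and SLTU funct fields. The function names are hypothetical, and overflow signalling is left out, so add/addu and sub/subu behave identically here.

```python
MASK32 = 0xFFFFFFFF


def to_signed(word: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word


def alu(op: int, a: int, b: int) -> int:
    """Return the 32-bit result of the selected operation (no overflow flag)."""
    a &= MASK32
    b &= MASK32
    if op in (0o00, 0o01):      # add, addu
        return (a + b) & MASK32
    if op in (0o02, 0o03):      # sub, subu
        return (a - b) & MASK32
    if op == 0o04:              # and
        return a & b
    if op == 0o05:              # or
        return a | b
    if op == 0o06:              # xor
        return a ^ b
    if op == 0o07:              # nor
        return ~(a | b) & MASK32
    if op == 0o12:              # slt: signed comparison
        return int(to_signed(a) < to_signed(b))
    if op == 0o13:              # sltu: unsigned comparison
        return int(a < b)
    raise ValueError(f"unsupported ALU operation code: {op:#o}")


if __name__ == "__main__":
    minus_one = 0xFFFFFFFF
    print(alu(0o12, minus_one, 1))  # signed:   -1 < 1
    print(alu(0o13, minus_one, 1))  # unsigned: 4294967295 < 1
```

The last two calls show why separate slt and sltu operations are needed: the same bit pattern compares differently depending on whether it is treated as signed or unsigned.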
Addition & Subtraction
Just like in grade school (carry/borrow 1s)
      0111      0111      0110
    + 0110    - 0110    - 0101
      ----      ----      ----
      1101      0001      0001
Two's complement operations easy
- subtraction using addition of negative numbers
      0111      0111
    - 0110    + 1010
      ----      ----
      0001    1 0001
Overflow (result too large for finite computer word):
- e.g., adding two n-bit numbers does not yield an n-bit number
      0111
    + 0001
      ----
      1000
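The worked examples above can be reproduced with a few lines of fixed-width arithmetic. This is a minimal sketch with hypothetical helper names; each function returns the n-bit result together with the bit carried out of the top position.

```python
def add_n(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Add two n-bit patterns; return (n-bit sum, carry out)."""
    mask = (1 << n) - 1
    total = (a & mask) + (b & mask)
    return total & mask, total >> n


def sub_n(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Subtract by adding the two's complement of b."""
    mask = (1 << n) - 1
    neg_b = (~b + 1) & mask
    return add_n(a, neg_b, n)


if __name__ == "__main__":
    for label, (result, carry) in [
        ("0111 + 0110", add_n(0b0111, 0b0110)),
        ("0111 - 0110", sub_n(0b0111, 0b0110)),
        ("0110 - 0101", sub_n(0b0110, 0b0101)),
        ("0111 + 0001", add_n(0b0111, 0b0001)),
    ]:
        print(f"{label} = {result:04b} (carry out {carry})")
```

In the subtraction cases the carry out of 1 is simply discarded. In the last case the 4-bit result 1000 reads as -8 in two's complement even though 7 + 1 = 8, which is the overflow the slide describes.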
Building a 1-bit Binary Adder
[Figure: 1-bit Full Adder with inputs A, B, and carry_in and outputs S and carry_out]
S = A xor B xor carry_in
carry_out = (A and B) or (A and carry_in) or (B and carry_in)   (majority function)
How can we use it to build a 32-bit adder?
How can we modify it easily to build an adder/subtractor?
A B carry_in carry_out S
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
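The two equations can be checked against the truth table by direct evaluation. The sketch below is illustrative; `full_adder` is a hypothetical name, and it returns the pair (S, carry_out).

```python
from itertools import product


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """1-bit full adder: returns (S, carry_out) for single-bit inputs."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)  # majority function
    return s, carry_out


if __name__ == "__main__":
    print("A B carry_in carry_out S")
    for a, b, carry_in in product((0, 1), repeat=3):
        s, carry_out = full_adder(a, b, carry_in)
        print(a, b, carry_in, carry_out, s)
```

Because S and carry_out together are the 2-bit binary count of how many inputs are 1, the whole table can also be verified with `2 * carry_out + s == a + b + carry_in`.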
Building 32-bit Adder
[Figure: a chain of 32 1-bit FAs. The FA for bit i takes Ai, Bi, and carry ci and produces Si and the next carry; c0 = carry_in enters the FA for bit 0 and c32 = carry_out leaves the FA for bit 31.]
Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect . . .
Ripple Carry Adder (RCA)
- advantage: simple logic, so small (low cost)
- disadvantage: slow and lots of glitching (so lots of energy consumption)
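The chaining described here can be modeled in software by looping over bit positions and passing each carry-out to the next stage. This is a behavioral sketch only: the names are hypothetical, and a sequential loop says nothing about real gate delays or glitching.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """1-bit full adder: returns (S, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def ripple_carry_add(a: int, b: int, carry_in: int = 0, n: int = 32) -> tuple[int, int]:
    """Add two n-bit patterns one bit at a time; return (sum, final carry_out)."""
    result = 0
    carry = carry_in                       # c0
    for i in range(n):                     # bit 0 is the least significant
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry                   # carry is now c_n


if __name__ == "__main__":
    print(ripple_carry_add(7, 6))
    print(ripple_carry_add(0xFFFFFFFF, 1))
```

The second call shows the worst case for the carry chain: the carry generated at bit 0 has to pass through all 32 stages before the result settles, which is the reason the slide calls the RCA slow.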
Building 32-bit Adder/Subtractor
Remember 2’s complement is just
- complement all the bits
- add a 1 in the least significant bit
    A    0111      0111
    B  - 0110    + 1010
[Figure: the 32-bit ripple-carry adder (inputs A0..A31 and B0..B31, outputs S0..S31, carries c0 = carry_in through c32 = carry_out) with an add/subt control line (0 = add, 1 = subt). Each B input is conditioned by the control: the FA sees B0 if control = 0, !B0 if control = 1.]
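A common way to realize this, assumed here because the figure itself did not survive extraction, is to XOR every B bit with the control line and to feed the same control line in as c0. The XOR supplies "complement all the bits" and the carry-in supplies "add a 1 in the least significant bit". The function name below is hypothetical.

```python
def add_sub(a: int, b: int, control: int, n: int = 32) -> tuple[int, int]:
    """control = 0 computes a + b; control = 1 computes a - b.

    Returns (n-bit result, carry_out). Assumes the control line drives both
    the B-input inverters and the carry-in of bit 0.
    """
    result = 0
    carry = control                          # c0 = control adds the extra 1
    for i in range(n):
        a_bit = (a >> i) & 1
        b_bit = ((b >> i) & 1) ^ control     # B if control = 0, !B if control = 1
        s = a_bit ^ b_bit ^ carry
        carry = (a_bit & b_bit) | (a_bit & carry) | (b_bit & carry)
        result |= s << i
    return result, carry


if __name__ == "__main__":
    result, carry = add_sub(0b0111, 0b0110, control=1, n=4)
    print(f"0111 - 0110 = {result:04b} (carry out {carry})")
```

With this arrangement the same adder hardware handles both operations, and the only extra cost per bit is one XOR gate.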
Overflow Detection and Effects
Overflow: the result is too large to represent in the number of bits allocated
When adding operands with different signs, overflow cannot occur!
Overflow occurs when
- adding two positives yields a negative
- or, adding two negatives gives a positive
- or, subtract a negative from a positive gives a negative
- or, subtract a positive from a negative gives a positive
On overflow, an exception (interrupt) occurs
- Control jumps to predefined address for exception
- Interrupted address (address of instruction causing the overflow) is saved for possible resumption
Don't always want to detect (interrupt on) overflow
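The four overflow cases reduce to two sign-bit tests, one for addition and one for subtraction. The sketch below is a hedged illustration with hypothetical function names; it only reports whether overflow would occur and does not model the exception mechanism.

```python
def add_overflows(a: int, b: int, n: int = 32) -> bool:
    """True if a + b overflows n-bit two's complement.

    Overflow needs operands with the same sign and a result whose sign differs.
    """
    mask, sign = (1 << n) - 1, 1 << (n - 1)
    a &= mask
    b &= mask
    total = (a + b) & mask
    return bool(~(a ^ b) & (a ^ total) & sign)


def sub_overflows(a: int, b: int, n: int = 32) -> bool:
    """True if a - b overflows n-bit two's complement.

    Overflow needs operands with different signs and a result whose sign
    differs from a's.
    """
    mask, sign = (1 << n) - 1, 1 << (n - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return bool((a ^ b) & (a ^ diff) & sign)


if __name__ == "__main__":
    print(add_overflows(0b0111, 0b0001, n=4))  # two positives give a negative
    print(add_overflows(0b0111, 0b1010, n=4))  # different signs: cannot overflow
    print(sub_overflows(0b1000, 0b0001, n=4))  # negative minus positive gives a positive
```

An instruction such as add would raise the exception when a check like this is true, while addu would produce the same bits and skip the check.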
New MIPS Instructions

Category         Instr                           Op Code   Example               Meaning
Arithmetic       add unsigned                    0 and 33  addu  $s1, $s2, $s3   $s1 = $s2 + $s3
(R & I format)   subt unsigned                   0 and 35  subu  $s1, $s2, $s3   $s1 = $s2 - $s3
                 add imm. unsigned               9         addiu $s1, $s2, 6     $s1 = $s2 + 6
Data Transfer    load byte unsigned              36        lbu   $s1, 25($s2)    $s1 = Memory($s2+25)
Cond. Branch     set on less than unsigned       0 and 43  sltu  $s1, $s2, $s3   if ($s2<$s3) $s1=1 else $s1=0
(I & R format)   set on less than imm. unsigned  11        sltiu $s1, $s2, 6     if ($s2<6) $s1=1 else $s1=0

- Sign extend - addiu, sltiu
- Zero extend - lbu
- No overflow detected - addu, subu, addiu, sltu, sltiu
Conclusion
We can build an ALU to support the MIPS ISA
- we can efficiently perform subtraction using two’s complement
- we can replicate a 1-bit ALU to produce a 32-bit ALU
Important points about hardware
- all of the gates are always working (concurrent)
- the speed of a gate is affected by the number of inputs to the gate (fan-in) and the number of gates that the output is connected to (fan-out)
- the speed of a circuit is affected by the number of gates in series (on the “critical path” or the “number of levels of logic”)
Our primary focus: comprehension, however,
- Clever changes to organization can improve performance (similar to using better algorithms in software)