CAO Notes by Girdhar Gopal Gautam (3g)
7/31/2019 Cao-notess by Girdhar Gopal Gautam 3g
Gopal Sharma MVN Institute of Technology & Management
Branch: ECE (5th Sem)
Session: 2012
Computer Architecture and Organization
Mrs. Rama Pathak
Submitted by: Girdhar Gopal Gautam ([email protected])
Lecture 1:
Digital Logic, Boolean Algebra, Logic Gates, Truth Table
Here we deal with the basic digital circuits of our computer: what hardware components we are using, how these hardware components are related and interact with each other, and how this hardware is accessed or seen by the user.
This gives rise to the classification of our computer study into:
Computer Design: This is concerned with the hardware design of the computer. Here the designer decides on the specifications of the computer system.
Computer Organization: This is concerned with the way the hardware components operate and the way they are connected to form the computer system.
Computer Architecture: This is concerned with the structure and behavior of the computer as seen by the user. It includes the information formats, the instruction set and the addressing modes for accessing memory.
In our course we will be dealing with computer architecture and organization.
Before starting with computer architecture and organization, let us discuss the components which make up the hardware, or the organization, of the computer, which is composed of digital circuits handled by the digital computer.
Digital Computers: imply that the computer deals with digital information.
Digital information: is represented by binary digits (0 and 1).
Gates: blocks of hardware that produce 1 or 0 when the input logic requirements are satisfied.
Functions of gates can be described by: Truth Table, Boolean Function, Karnaugh Map.
Table 1.1 lists the various logic gates: a gate takes binary digital input signals and produces a binary digital output signal.
Boolean algebra
Algebra with binary (Boolean) variables and logic operations. Boolean algebra is useful in the analysis and synthesis of digital logic circuits:
- Input and output signals can be represented by Boolean variables, and
- The function of a digital logic circuit can be represented by logic operations, i.e., Boolean function(s).
- From a Boolean function, a logic diagram can be constructed using AND, OR, and NOT (inverter) gates.
Note: We can have many circuits for the same Boolean expression, for example different gate arrangements realizing the same function.
Truth Table: The most elementary specification of the function of a digital logic circuit is the truth table.
A truth table is a table that describes the output values for all the combinations of the input values, called MINTERMS.
n input variables -> 2^n minterms
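The idea can be sketched in Python: the snippet below (an illustration, not part of the original notes; the example function F is our own choice) enumerates all 2^n input combinations of a Boolean function, i.e. its minterms.

```python
from itertools import product

def truth_table(f, n):
    """List every combination of n input bits (the minterms)
    together with the value of the Boolean function f."""
    return [(bits, f(*bits)) for bits in product((0, 1), repeat=n)]

# Example function (our own choice): F(x, y, z) = xy + x'z
F = lambda x, y, z: (x & y) | ((1 - x) & z)

table = truth_table(F, 3)
for bits, out in table:
    print(*bits, "->", out)
print("rows:", len(table))   # 3 inputs -> 2**3 = 8 minterms
```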
Summary:
Computer Design: what hardware components we need.
Computer Organization: how these hardware components interact.
Computer Architecture: how these are connected with the user.
Logic Gates: blocks of hardware giving a result of 0 or 1. Of the 8 basic logic gates, 3 (AND, OR and NOT) are fundamental.
Boolean Algebra: the representation of input and output signals in the form of expressions.
Truth Table: table that describes the output values for all the combinations of the input values.
Lecture 2:
Combinational Logic Blocks: Multiplexers, Adders, Encoders, Decoders
Combinational circuits are circuits without memory, where the outputs are obtained from the inputs only. An n-input m-output combinational circuit takes n inputs and produces m outputs.
A multiplexer is a combinational circuit which selects one of many inputs depending on the selection criteria. The number of selection inputs depends on the number of inputs in the manner 2^x = y: if y is the number of inputs then x is the number of selection lines. Thus if we have 4 input lines, we use 2 selection lines (as 2^2 = 4), and so on. This is called a 4:1 multiplexer or 4*1 multiplexer.
This has been explained in the diagram as:
Fig: a combinational circuit with n inputs and m outputs.
Adders: Half Adder, Full Adder
Half Adder: adds 2 bits and gives out carry and sum as result.
4-to-1 Multiplexer, with inputs I0-I3, select lines S1 S0 and output Y:

Select S1 S0 | Output Y
     0  0    |   I0
     0  1    |   I1
     1  0    |   I2
     1  1    |   I3
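The select table above can be sketched in Python (a minimal behavioral illustration; the function name mux4 is our own):

```python
def mux4(i0, i1, i2, i3, s1, s0):
    """4-to-1 multiplexer: the select lines (s1, s0), read as a
    2-bit binary number, pick one of the four inputs."""
    return [i0, i1, i2, i3][(s1 << 1) | s0]

# With select = (0, 1) the output follows input I1:
print(mux4(0, 1, 0, 0, 0, 1))  # -> 1
```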
Half adder truth table and equations:

x y | c s
0 0 | 0 0
0 1 | 0 1
1 0 | 0 1
1 1 | 1 0

c = xy
s = x'y + xy' = x ⊕ y

Full Adder: adds 2 bits with carry in and gives carry out and sum as result.

x y Cin | Cout S
0 0  0  |  0   0
0 0  1  |  0   1
0 1  0  |  0   1
0 1  1  |  1   0
1 0  0  |  0   1
1 0  1  |  1   0
1 1  0  |  1   0
1 1  1  |  1   1

Cout = xy + x·Cin + y·Cin
     = xy + (x ⊕ y)·Cin
S = x'y'Cin + x'yCin' + xy'Cin' + xyCin
  = x ⊕ y ⊕ Cin
  = (x ⊕ y) ⊕ Cin
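The adder equations above can be checked with a small Python sketch (illustrative only; the full adder is built from two half adders and an OR gate):

```python
def half_adder(x, y):
    """Half adder: sum = x XOR y, carry = x AND y."""
    return x ^ y, x & y

def full_adder(x, y, cin):
    """Full adder built from two half adders and an OR gate."""
    s1, c1 = half_adder(x, y)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2

# Reproduce the full-adder truth table:
for x in (0, 1):
    for y in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(x, y, cin)
            print(x, y, cin, "->", cout, s)
```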
Decoder: A decoder takes n inputs and gives 2^n outputs. That is, we get 8 outputs for 3 inputs, called a 3*8 decoder. We also have 2*4 decoders, 4*16 decoders and so on.
We implement the decoder with the help of NAND gates; using NAND gates makes it more economical.
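A decoder's behavior (n inputs, 2^n outputs, exactly one line active) can be sketched as follows; this is a behavioral illustration in Python, not the NAND-gate implementation discussed above:

```python
def decoder(inputs):
    """n-to-2**n decoder: exactly one output line goes high, namely
    the one whose index equals the binary value of the inputs."""
    n = len(inputs)
    index = 0
    for bit in inputs:          # most-significant bit first
        index = (index << 1) | bit
    return [1 if i == index else 0 for i in range(2 ** n)]

print(decoder([1, 0, 1]))  # 3-to-8 decoder: output line 5 is active
```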
Summary:
Combinational circuits: where the outputs are obtained from the inputs only. Various combinational circuits are:
o Multiplexers: the number of selection inputs depends on the number of inputs in the manner 2^x = y.
o Half Adder: adds 2 bits and gives the result as carry and sum.
o Full Adder: adds 2 bits with carry in and gives the result as carry out and sum.
o Encoder: takes 2^n inputs and gives n outputs.
o Decoder: takes n inputs and gives 2^n outputs.
Important Questions derived from this:
Q1. What is the difference between a multiplexer and a decoder?
Q2. Draw a 4*1 decoder with the help of AND gates.
Lecture 3:
Sequential Logic Blocks: Latches, Flip Flops, Registers, Counters
Sequential logic blocks: logic blocks whose output logic value depends on the input values and the state of the block.
Here we have the concept of memory, which was not applicable to combinational circuits.
The various sequential blocks or circuits are:
Latches: A latch is a kind of bistable multivibrator, an electronic circuit which has two stable states and thereby can store one bit of information. Today the word is mainly used for simple transparent storage elements, while slightly more advanced non-transparent (or clocked) devices are described as flip-flops. Informally, as this distinction is quite new, the two words are sometimes used interchangeably.
S-R latch:
To overcome the restricted combination, one can add gates to the inputs that would convert (S,R) = (1,1) to one of the non-restricted combinations. That can be:
Q = 1, i.e. (1,0), referred to as an S-latch;
Q = 0, i.e. (0,1), referred to as an R-latch;
keep state, i.e. (0,0), referred to as an E-latch.
D-LATCH
Forbidden input values are forced not to occur by using an inverter between the inputs.
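The S-R latch and the D latch above can be simulated at the gate level with a short Python sketch (illustrative; the fixed number of settling passes is a simplification of the real feedback behavior, and works for the stable input combinations):

```python
def nor(a, b):
    """Two-input NOR gate."""
    return 1 - (a | b)

def sr_latch(s, r, q):
    """Cross-coupled NOR S-R latch: Q = NOR(R, Q'), Q' = NOR(S, Q).
    A couple of passes let the feedback settle."""
    qbar = nor(s, q)
    for _ in range(2):
        q = nor(r, qbar)
        qbar = nor(s, q)
    return q

def d_latch(d, q):
    """D latch: the inverter between the inputs rules out (S, R) = (1, 1)."""
    return sr_latch(d, 1 - d, q)

q = 0
q = sr_latch(1, 0, q)   # set:   q becomes 1
q = sr_latch(0, 0, q)   # hold:  q stays 1
q = sr_latch(0, 1, q)   # reset: q becomes 0
print(q)  # -> 0
```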
Flip Flops:
D flip flop:
Fig: D flip-flop with data input D, enable input E, and outputs Q and Q'.

Characteristic table:
D | Q(t+1)
0 | 0
1 | 1
If you compare the D flip-flop and the D latch, the only difference you find in the circuit is that latches do not have clocks and flip-flops have them.
So you can note down the differences between latches and flip-flops as:
A latch is a level-triggered device whereas a flip-flop is edge-triggered.
The output of a latch changes independently of a clock signal, whereas the output of a flip-flop changes at specific times determined by a clocking signal.
Latches do not require clock pulses, while flip-flops are clocked devices.
Characteristics:
- In an edge-triggered (positive) flip-flop, the state transition occurs at the rising edge or falling edge of the clock pulse; it responds to the input only at that time.
- Latches respond to the input during the whole period the enable signal is active.
Counters: A counter is a device which stores (and sometimes displays) the number of times a particular event or process has occurred, often in relationship to a clock signal.
4-bit binary counter:
Fig: four JK flip-flops (outputs A0 A1 A2 A3) sharing a common clock, with a counter-enable input and an output carry.

RING COUNTER:
In a ring counter the output of the 1st flip-flop is moved to the input of the 2nd flip-flop.
JOHNSON COUNTER:
In a Johnson counter the output of the last flip-flop is inverted and given to the first flip-flop.
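The two feedback rules can be illustrated with a Python sketch (the function names are our own):

```python
def ring_step(state):
    """Ring counter: each flip-flop's output feeds the next;
    the last output wraps around to the first input."""
    return [state[-1]] + state[:-1]

def johnson_step(state):
    """Johnson counter: same shift, but the last output is
    inverted before being fed back to the first flip-flop."""
    return [1 - state[-1]] + state[:-1]

ring = [1, 0, 0, 0]
for _ in range(4):
    ring = ring_step(ring)
print(ring)          # -> [1, 0, 0, 0], back to the start after 4 clocks

johnson = [0, 0, 0, 0]
states = set()
for _ in range(8):
    johnson = johnson_step(johnson)
    states.add(tuple(johnson))
print(len(states))   # -> 8: an n-bit Johnson counter cycles through 2n states
```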
Registers: A register refers to a group of flip-flops operating as a coherent unit to hold data. This is different from a counter, which is a group of flip-flops operating to generate new data by tabulating it.
Shift register: A register that is capable of shifting data one bit at a time is called a shift register. The logical configuration of a serial shift register consists of a chain of flip-flops connected in cascade, with the output of one flip-flop connected to the input of its neighbor. The operation of the shift register is synchronous; thus each flip-flop is connected to a common clock. Using D flip-flops forms the simplest type of shift register.
Bi-directional shift register with parallel load
Fig: four D flip-flops (outputs A0 A1 A2 A3) on a common clock, each fed by a 4 x 1 MUX; select lines S1 S0 choose between holding, shifting from the serial inputs, and parallel loading from I0 I1 I2 I3.

Summary:
Sequential circuits: the output logic value depends on the input values and the state of the blocks. These circuits have memory. Various sequential circuits are:
o Latches: an electronic circuit which has two stable states and thereby can store one bit of information.
o Flip-flops: also have 2 stable states, but are clocked devices.
o Counter: a device which stores the number of times a particular event or process has occurred.
o Registers: a group of flip-flops operating as a coherent unit to hold data.
Important Questions derived from this:
Q1. What is the difference between a latch and a flip-flop?
Q2. Explain the Johnson counter.
Q3. Draw a shift register with parallel load.
Lecture 4:
Stored Program control concept
Flynn's classification of computers: SISD, SIMD, MISD, MIMD
After the discussion of the basic principles of hardware and the combinational and sequential circuits we have in our computer system, let us see how these components interact to make the computer system which we use. We will start with the basic architectures of the computer system, and the most basic question is how programs are stored in our computer system, or how the different programs and data are arranged in our system.
Stored Program control concept
The simplest way to organize a computer is to have one processor register and an instruction code with 2 parts:
Opcode (what operation is to be completed)
Address (address of the operands on which the operation is to be computed)
A computer that by design includes an instruction set architecture can store in memory a set of instructions (a program) that details the computation and the data on which the computation is to be done.
Memory: 4096 x 16
The opcode tells us the operation to be performed. The address tells us the memory location where to find the operand. For a memory unit of 4096 words we need 12 bits to specify the address.
Instruction Format: bits 15 down to 12 hold the opcode; bits 11 down to 0 hold the address. An operand is stored as a 16-bit binary word (bits 15-0).

Fig 1: Stored Program Organization - a 4096 x 16 memory holding instructions (program) and operands (data), connected to a processor register (accumulator, or AC).
When we store an instruction code in memory, 4 bits are specified for 16 operations (as 12 bits are for the operand address).
For an operation, control fetches the instruction from memory, decodes the operation (one out of 16), finds the operands and then performs the operation.
Computers with one processor register generally name it the accumulator (or AC). The operation is performed with the operand and the content of AC. In case no operand is specified, we compute the operation on the accumulator, e.g. clear AC, complement AC, etc.
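The opcode/address split described above can be illustrated in Python (the example instruction word is hypothetical):

```python
def decode(instruction):
    """Split a 16-bit instruction word into a 4-bit opcode
    (bits 15-12) and a 12-bit address (bits 11-0)."""
    opcode = (instruction >> 12) & 0xF
    address = instruction & 0xFFF
    return opcode, address

# Hypothetical example word: opcode 0b0010, address 0x3A5
word = (0b0010 << 12) | 0x3A5
print(decode(word))  # -> (2, 933)
```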
PARALLEL COMPUTERS
The organization we studied is a very basic one, but sometimes we have very large computations for which one processor with a general architecture will not be of much help. Thus we take the help of many processors, or divide the processor functions into many functional units, or perform the same computation on many data values. To give solutions to all these we have various types of computers.
Architectural Classification
Flynn's classification is based on the multiplicity of instruction streams and data streams:
Instruction stream: sequence of instructions read from memory.
Data stream: operations performed on the data in the processor.
Fig 2: Classification according to instruction and data streams
There are a variety of ways parallel processing can be classified. M. J. Flynn considered the organization of a computer system by the number of instructions and data items manipulated simultaneously. The normal operation of a computer is to fetch instructions from memory and execute them in the processor.
                          Number of Data Streams
                          Single     Multiple
Number of     Single      SISD       SIMD
Instruction   Multiple    MISD       MIMD
Streams
The sequence of instructions read from memory constitutes an instruction stream.
The operations performed on the data in the processor constitute a data stream.
Parallel processing can be implemented with either the instruction stream, the data stream, or both.
SISD COMPUTER SYSTEMS
SISD (single instruction, single data stream) is the simplest computer available. It contains no parallelism. It has a single instruction stream and a single data stream. The instructions associated with SISD are executed sequentially, and the system may or may not have external parallel processing capabilities.
Fig 3: SISD Architecture
Characteristics:
- Standard von Neumann machine
- Instructions and data are stored in memory
- One operation at a time
Limitations:
- Von Neumann bottleneck: the maximum speed of the system is limited by the memory bandwidth (bits/sec or bytes/sec)
- Limitation on memory bandwidth
- Memory is shared by CPU and I/O
Examples: superscalar processors, super-pipelined processors, VLIW
MISD COMPUTER SYSTEMS
MISD (multiple instruction, single data stream) is of no practical use, as there is little chance that a lot of instructions get executed on a single data item.
Fig 3 (SISD) organization: Control Unit -> (instruction stream) -> Processor Unit <-> (data stream) <-> Memory.
Fig 4: MISD Architecture
Characteristics:
- There is no computer at present that can be classified as MISD.
SIMD COMPUTER SYSTEMS
SIMD (single instruction, multiple data stream) is a computer where a single instruction operates on different sets of data. It executes with the help of many processing units controlled by a single control unit. The shared memory must contain various modules so that it can communicate with all the processors at the same time.
Main memory is used for the storage of programs. The master control unit decodes each instruction and determines the instruction to be executed.
Fig: MISD organization - n control units (CU1-CUn) each issue their own instruction stream to a processor (P1-Pn), all operating on a single data stream from memory modules M1-Mn.

Fig: SIMD organization - a single control unit broadcasts the instruction stream to processor units P1-Pn, which access memory modules M1-Mn through an alignment network over a data bus.
Fig 5: SIMD Architecture
Characteristics:
- Only one copy of the program exists
- A single controller executes one instruction at a time
Examples: array processors, systolic arrays, associative processors
MIMD COMPUTER SYSTEMS
MIMD (multiple instruction, multiple data stream) refers to a computer system where we have different processing elements working on different data. Under this we classify various multiprocessors and multicomputers.
Characteristics:
- Multiple processing units
- Execution of multiple instructions on multiple data
Fig 6: MIMD Architecture
Types of MIMD computer systems:
- Shared-memory multiprocessors (UMA, NUMA)
- Message-passing multicomputers
SHARED MEMORY MULTIPROCESSORS
Example systems:
- Bus- and cache-based systems: Sequent Balance, Encore Multimax
- Multistage interconnection-network-based systems: Ultracomputer, Butterfly, RP3, HEP
Fig: MIMD shared-memory organization - processors P1-Pn with memories M1-Mn connected through an interconnection network to a shared memory.
- Crossbar-switch-based systems: C.mmp, Alliant FX/8
Limitations: memory access latency; hot-spot problem
SHARED MEMORY MULTIPROCESSORS (UMA)
Fig 7: Uniform Memory Access (UMA)
Characteristics:
All processors have equally direct access to one large memory address space. The access time to reach that memory is the same for all processors; hence the name UMA.
SHARED MEMORY MULTIPROCESSORS (NUMA)
Fig: in UMA, processors P1-Pn share memories M1-Mn through an interconnection network; in NUMA, each processor also has its own local memory M.
Fig 8: NUMA (Non-Uniform Memory Access)
Characteristics:
All processors have direct access to one large memory address space, and each also has its own local memory. The access time to reach different memories is therefore different for each processor; hence the name NUMA.
MESSAGE-PASSING MULTICOMPUTER
Fig 9: Message-passing multicomputer architecture
Characteristics:
- Interconnected computers
- Each processor has its own memory and communicates via message passing
Example systems:
- Tree structure: Teradata, DADO
- Mesh-connected: Rediflow, Series 2010, J-Machine
- Hypercube: Cosmic Cube, iPSC, NCUBE, FPS T Series, Mark III
Limitations:
- Communication overhead
- Hard to program
Summary:
Stored program control concept: in this type of organization, instructions and data are stored in memory, in separate areas.
Flynn's classification of computers: it divides the processing work into data streams and instruction streams, resulting in:
Fig: processors P1-Pn, each with a private memory M, joined by point-to-point connections through a message-passing network.
o SISD (single instruction, single data)
o SIMD (single instruction, multiple data)
o MISD (multiple instruction, single data)
o MIMD (multiple instruction, multiple data)
Important Questions:
Q1. Explain the stored program control concept.
Q2. Explain Flynn's classification of computers.
Q3. Describe the concepts of data stream and instruction stream.
Lecture 5:
MULTILEVEL VIEWPOINT OF A MACHINE
ISA, MACRO ARCHITECTURE AND MICRO ARCHITECTURE
CPU, CACHES, MAIN MEMORY AND SECONDARY MEMORY UNITS, INPUT/OUTPUT MAPPING
After the discussion of the stored program control concept and the various types of parallel computers, let us study the different components of the computer structure.
MULTILEVEL VIEWPOINT OF A MACHINE
Our computer is built in various layers. These layers are basically divided into:
Software layer, Hardware layer, Instruction Set Architecture
Fig 1: Multilevel viewpoint of a machine
Computer system architecture is decided on the basis of the type of applications or usage of the computer.
The computer architect decides the different layers and the function of each layer for a specific computer. These layers, or the functions of each, can vary from one organization to another.
Our layered architecture is basically divided into 3 parts:
Macro-architecture: as a unit of deployment, we talk about client applications and COM servers. Computer architecture is the conceptual design and fundamental operational structure of a computer system. It is a blueprint and functional description of requirements (especially speeds and interconnections) and design implementations for the various parts of a computer.
This is basically the software layer of the computer. It comprises:
User application layer: the user layer basically gives the user an interface to the computer for which the computer is designed. At this layer the user specifies what processing has to be done. The requirements given by the user have to be implemented by the computer architect with the help of the other layers.
High level language
Fig 1 shows the stack: the user application layer, high-level language (via a compiler), assembly (via an assembler), and OS (MS-DOS / Windows / Unix / Linux) form the software layer, or macro-architecture; below them sits the Instruction Set Architecture (ISA); below that the hardware layer, or micro-architecture: processor-memory-I/O system, data path and control, gate-level design, circuit-level design, and the silicon layout layer.
A high-level programming language is a programming language with strong abstraction from the details of the computer. In comparison to low-level programming languages, it may use natural language elements, be easier to use, or be more portable across platforms. Such languages hide the details of CPU operations such as memory access models and management of scope. E.g. C, Fortran, Pascal; these are not computer dependent.
Assembly language
Assembly language refers to the lowest-level human-readable method for programming a particular computer. Assembly languages are platform specific, and therefore a different assembly language is necessary for programming every different type of computer.
Machine language
Machine languages consist entirely of numbers and are almost impossible for humans to read and write.
Operating system
Operating systems interface with the hardware to provide the necessary services for application software, e.g. MS-DOS, Linux, UNIX.
Functions of an operating system:
- Process management
- Memory management
- File management
- Device management
- Error detection
- Security
Types of operating system:
- Multiprogramming operating system
- Multiprocessing operating system
- Time-sharing operating system
- Real-time operating system
- Distributed operating system
- Network operating system
Compiler: software that translates a program written in a high-level programming language (C/C++, COBOL, etc.) into machine language. A compiler usually generates assembly language first and then translates the assembly language into machine language. A utility known as a "linker" then combines all required machine language modules into an executable program that can run in the computer.
Assembler: software that translates assembly language into machine language. Contrast with a compiler, which is used to translate a high-level language, such as COBOL or C, into assembly language first and then into machine language.
Instruction set architecture: this is an abstraction of the interface between the hardware and the low-level software. It deals with the functional behavior of a computer system as viewed by a programmer, whereas computer organization deals with structural relationships that are not visible to the programmer. The instruction set architecture is the attribute of a computing system as seen by the assembly language programmer or compiler.
The ISA is determined by:
- Data storage
- Memory addressing modes
- Operations in the instruction set
- Instruction formats
- Encoding of the instruction set
- The compiler's view
Micro-architecture: inside a unit of deployment we talk about the running process, COM apartment, thread concurrency and synchronization, and memory sharing.
Micro-architecture, also known as computer organization, is a lower-level, more concrete description of the system that involves how the constituent parts of the system are interconnected and how they interoperate in order to implement the ISA. The size of a computer's cache, for instance, is an organizational issue that generally has nothing to do with the ISA.
Processor-memory-I/O system: these are the basic hardware devices required for the processing of any system application.
Data path and control: different computers have different numbers and types of registers and other logic circuits. The data path and control decide the flow of information within the various parts of the computer system.
Gate-level design: circuits such as registers, counters etc. are implemented in the form of the various gates available.
Circuit-level design: to combine the gates into a logical circuit or component we have the basic circuit-level design, which ultimately gives birth to all the hardware components of a computer system.
Silicon layout layer
Other than the architecture of the computer, we have some very basic units which are important for our computer.
Memory units:
Encoding of the instruction set; the compiler's view.
Other than the structured organization of the computer, other important elements are:
o Memory
o CPU
o I/O
Important Questions:
Q1. Explain the multilevel viewpoint of a machine.
Q2. Describe micro-architecture.
Q3. Describe macro-architecture.
Q4. Explain the ISA and why we call it a link between the hardware and software components.
Q5. What is an operating system?
Lecture 6:
CPU performance measures: MIPS, MFLOPS
After the discussion of all the elements of computer structure in the previous topics, in this lecture we describe how the performance of a computer is measured, with the help of performance metrics.
Performance of a machine is determined by:
- Instruction count
- Clock cycle time
- Clock cycles per instruction (CPI)
Processor design (datapath and control) will determine:
- Clock cycle time
- Clock cycles per instruction
Single-cycle processor: one clock cycle per instruction.
- Advantages: simple design, low CPI
- Disadvantages: long cycle time, which is limited by the slowest instruction
We have different methods to calculate the performance of a CPU, or to compare two CPUs, but the result depends strongly on what type of instructions we give to these CPUs.
The two measures we generally use are MIPS and MFLOPS.
MIPS: For a specific program running on a specific computer, MIPS is a measure of how many millions of instructions are executed per second:

MIPS = Instruction count / (Execution time x 10^6)
     = Instruction count / (CPU clocks x Cycle time x 10^6)
     = (Instruction count x Clock rate) / (Instruction count x CPI x 10^6)
     = Clock rate / (CPI x 10^6)

A faster execution time usually means a higher MIPS rating. MIPS thus ties together the three factors above: instruction count, CPI and cycle time.
MIPS is a good technique but it also has some pitfalls. Problems with the MIPS rating:
- It takes no account of the instruction set used.
- Program-dependent: a single machine does not have a single MIPS rating, since the MIPS rating may depend on the program used.
- Easy to abuse: the program used to get the MIPS rating is often omitted.
- It cannot be used to compare computers with different instruction sets.
- A higher MIPS rating in some cases may not mean higher performance or better execution time, e.g. due to compiler design variations.

Example: for a machine with the following instruction classes, and for a given program, two compilers produced the instruction counts below. The machine is assumed to run at a clock rate of 100 MHz.

Instruction class | CPI
A                 | 1
B                 | 2
C                 | 3

Instruction counts (in millions) for each instruction class:
Code from:   A   B   C
Compiler 1   5   1   1
Compiler 2   10  1   1

MIPS = Clock rate / (CPI x 10^6) = 100 MHz / (CPI x 10^6)
CPI = CPU execution cycles / Instruction count
CPU time = Instruction count x CPI / Clock rate

For compiler 1:
CPI1 = (5 x 1 + 1 x 2 + 1 x 3) / (5 + 1 + 1) = 10 / 7 = 1.43
MIPS1 = (100 x 10^6) / (1.43 x 10^6) = 70.0
CPU time1 = ((5 + 1 + 1) x 10^6 x 1.43) / (100 x 10^6) = 0.10 seconds

For compiler 2:
CPI2 = (10 x 1 + 1 x 2 + 1 x 3) / (10 + 1 + 1) = 15 / 12 = 1.25
MIPS2 = (100 x 10^6) / (1.25 x 10^6) = 80.0
CPU time2 = ((10 + 1 + 1) x 10^6 x 1.25) / (100 x 10^6) = 0.15 seconds
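The calculations above can be verified with a short Python sketch:

```python
def cpi(counts, cpis):
    """Average clock cycles per instruction, weighted by the
    instruction count of each class."""
    cycles = sum(n * c for n, c in zip(counts, cpis))
    return cycles / sum(counts)

clock_hz = 100e6          # 100 MHz, as in the example above
class_cpi = [1, 2, 3]     # CPI of classes A, B, C

for name, counts in (("compiler 1", [5, 1, 1]), ("compiler 2", [10, 1, 1])):
    avg = cpi(counts, class_cpi)                  # average CPI
    mips = clock_hz / (avg * 1e6)                 # clock rate / (CPI x 10^6)
    cpu_time = sum(counts) * 1e6 * avg / clock_hz # IC x CPI / clock rate
    print(name, round(avg, 2), round(mips, 1), round(cpu_time, 2))
```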
MFLOPS: For a specific program running on a specific computer, MFLOPS is a measure of millions of floating-point operations (megaflops) executed per second:

MFLOPS = Number of floating-point operations / (Execution time x 10^6)

MFLOPS is a better comparison measure between different machines than MIPS, but it also has some pitfalls.

Problems with MFLOPS:
- A floating-point operation is an addition, subtraction, multiplication, or division operation applied to numbers represented in single- or double-precision floating-point representation.
- Program-dependent: different programs have different percentages of floating-point operations; e.g. compilers perform no floating-point operations and yield a MFLOPS rating of zero.
- It depends on the type of floating-point operations present in the program.

Summary:
Performance of a machine is determined by: instruction count, clock cycle time, clock cycles per instruction.
MIPS = Instruction count / (Execution time x 10^6)
MFLOPS = Number of floating-point operations / (Execution time x 10^6)
Important Questions:
Q1. What is MIPS?
Q2. What is MFLOPS?
Q3. What is the difference between MIPS and MFLOPS?
Q4. What are the CPU performance measures?
Lecture 7:
Cache Memory, Main Memory, Secondary Memory
We have basically 3 types of memory attached to our processor:
Cache memory
Main memory
Secondary memory
Primary storage, presently known as memory, is the only storage directly accessible to the CPU. The CPU continuously reads instructions stored there and executes them as required. Any data actively operated on is also stored there in a uniform manner.
There are two more sub-layers of primary storage, besides the main large-capacity RAM:
Processor registers are located inside the processor. Each register typically holds a word of data (often 32 or 64 bits). CPU instructions instruct the arithmetic and logic unit to perform various calculations or other operations on this data (or with the help of it). Registers are technically among the fastest of all forms of computer data storage.
Processor cache is an intermediate stage between the ultra-fast registers and the much slower main memory. It is introduced solely to increase the performance of the computer. The most actively used information in main memory is duplicated in the cache memory, which is faster but of much smaller capacity; the cache, in turn, is much slower but much larger than the processor registers. A multi-level hierarchical cache setup is also commonly used: the primary cache is the smallest and fastest and is located inside the processor; the secondary cache is somewhat larger and slower.
These are the types of memory accessed when we work with the processor. But if we have to store some data permanently we need the help of secondary, or auxiliary, memory.
Secondary memory (or secondary storage) is the slowest and cheapest form of memory. It cannot be processed directly by the CPU; it must first be copied into primary storage (also known as RAM).
Secondary memory devices include magnetic disks like hard drives and floppy disks; optical disks such as CDs and CD-ROMs; and magnetic tapes, which were the first form of secondary memory.
Primary memory Secondary memory
  Primary memory                          Secondary memory
  1. Fast                                 1. Slow
  2. Expensive                            2. Cheap
  3. Low capacity                         3. Large capacity
  4. Connects directly to the processor   4. Not connected directly to the processor
Hard Disks:
Hard disks, like cassette tapes, use magnetic recording techniques: the magnetic medium can be easily erased and rewritten, and it will "remember" the magnetic flux patterns stored on the medium for many years.
A hard drive consists of platters, a control circuit board and interface parts.
A hard disk is a sealed unit containing a number of platters in a stack. Hard disks may be mounted in a horizontal or a vertical position. In this description, the hard drive is mounted horizontally.
Electromagnetic read/write heads are positioned above and below each platter. As the platters spin, the drive heads move in toward the center and out toward the edge. In this way, the drive heads can reach the entire surface of each platter.
On a hard disk, data is stored in thin, concentric bands. A drive head, while in one position, can read or write a circular ring, or band, called a track. There can be more than a thousand tracks on a 3.5-inch hard disk. Sections within each track are called sectors. A sector is the smallest physical storage unit on a disk, and is almost always 512 bytes (0.5 KB) in size.
The stack of platters rotates at a constant speed. The drive head, while positioned close to the center of the disk, reads from a surface that is passing by more slowly than the surface at the outer edges of the disk. To compensate for this physical difference, tracks near the outside of the disk are less densely populated with data than the tracks near the center of the disk. The result of the different data density is that the same amount of data can be read over the same period of time from any drive head position.
The disk space is filled with data according to a standard plan. One side of one platter contains space reserved for hardware track-positioning information and is not available to the operating system. Thus, a disk assembly containing two platters has three sides available for data. Track-positioning data is written to the disk during assembly at the factory. The system disk controller reads this data to place the drive heads in the correct sector position.
Magnetic Tapes:
An electric current in a coil of wire produces a magnetic field similar to that of a bar magnet, and that field is much stronger if the coil has a ferromagnetic (iron-like) core.
Tape heads are made from rings of ferromagnetic material with a gap where the tape contacts it, so that the magnetic field can fringe out to magnetize the emulsion on the tape. A coil of wire around the ring carries the current to produce a magnetic field proportional to the signal to be recorded. If an already magnetized tape is passed beneath the head, it can induce a voltage in the coil. Thus the same head can be used for both recording and playback.
Lecture 8:
Instruction set based classification of computers
  Three-address instructions
  Two-address instructions
  One-address instructions
  Zero-address instructions
  RISC instructions
  CISC instructions
  RISC vs. CISC
In the last chapter we discussed the various architectures and the layers of computer architecture. In this chapter we explain the middle layer of the multilevel viewpoint of a machine, i.e. the Instruction Set Architecture.
Instruction Set Architecture (ISA) is an abstraction of the interface between the hardware and the low-level software.
It comprises:
  Instruction formats
  Memory addressing modes
  Operations in the instruction set
  Encoding of the instruction set
  Data storage
  Compiler's view
Instruction Format
The instruction format is the representation of the instruction. It contains the various instruction fields:
 The opcode field specifies the operation to be performed
 The address field(s) designate memory address(es) or processor register(s)
 The mode field(s) determine how the address field is to be interpreted to get the effective address of the operand

The number of address fields in the instruction format depends on the internal organization of the CPU. The three most common CPU organizations are:

- Single accumulator organization:
  ADD X           /* AC ← AC + M[X] */
- General register organization:
  ADD R1, R2, R3  /* R1 ← R2 + R3 */
  ADD R1, R2      /* R1 ← R1 + R2 */
  MOV R1, R2      /* R1 ← R2 */
  ADD R1, X       /* R1 ← R1 + M[X] */
- Stack organization:
  PUSH X          /* TOS ← M[X] */
One goal for CISC machines was to have a machine language instruction to match each high-level language statement type.
Criticisms of CISC:
- Complex instruction formats, lengths and addressing modes
- Complicated instruction cycle control, due to the complex decoding hardware and decoding process
- Multiple-memory-cycle instructions: operations on memory data require multiple memory accesses per instruction
- Microprogrammed control is a necessity: microprogram control storage takes a substantial portion of the CPU chip area, and the semantic gap between machine instruction and microinstruction is large
- A general-purpose instruction set includes all the features required by individually different applications; when any one application is running, the features required by the other applications are an extra burden on it
RISC
In the late 70s and early 80s, there was a reaction to the shortcomings of the CISC style of processors. Reduced Instruction Set Computers (RISC) were proposed as an alternative. The underlying idea behind RISC processors is to simplify the instruction set and reduce instruction execution time.
Note: RISC instructions cannot access memory operands directly; only load and store instructions reference memory.
Evaluate X = (A + B) * (C + D):
MOV R1, A       /* R1 ← M[A] */
MOV R2, B       /* R2 ← M[B] */
ADD R1, R1, R2  /* R1 ← R1 + R2 */
MOV R2, C       /* R2 ← M[C] */
MOV R3, D       /* R3 ← M[D] */
ADD R2, R2, R3  /* R2 ← R2 + R3 */
MUL R1, R1, R2  /* R1 ← R1 * R2 */
MOV X, R1       /* M[X] ← R1 */
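The load/store sequence above can be traced step by step. This is a minimal simulation of it, with memory modeled as a dictionary and sample operand values chosen for illustration:

```python
# Tiny simulation of the RISC load/store sequence for X = (A + B) * (C + D).
# The operand values in memory are illustrative.
mem = {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0}
reg = {}

reg["R1"] = mem["A"]               # MOV R1, A
reg["R2"] = mem["B"]               # MOV R2, B
reg["R1"] = reg["R1"] + reg["R2"]  # ADD R1, R1, R2
reg["R2"] = mem["C"]               # MOV R2, C
reg["R3"] = mem["D"]               # MOV R3, D
reg["R2"] = reg["R2"] + reg["R3"]  # ADD R2, R2, R3
reg["R1"] = reg["R1"] * reg["R2"]  # MUL R1, R1, R2
mem["X"] = reg["R1"]               # MOV X, R1

print(mem["X"])  # (2 + 3) * (4 + 5) = 45
```

Note that only the MOV lines touch `mem`; the ADD and MUL lines read and write registers only, which is exactly the load/store discipline the text describes.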
RISC processors often feature:
 Few instructions
 Few addressing modes
 Only load and store instructions access memory
 All other operations are done using on-processor registers
 Fixed-length instructions
 Single-cycle execution of instructions
 The control unit is hardwired, not microprogrammed

Since all instructions (except load and store) use only registers for operands, only a few addressing modes are needed.
By having all instructions the same length, reading them in is easy and fast.
The fetch and decode stages are simple, looking much more like Mano's Basic Computer than a CISC machine.
The instruction and address formats are designed to be easy to decode. Unlike the variable-length CISC instructions, the opcode and register fields of RISC instructions can be decoded simultaneously.
The control logic of a RISC processor is designed to be simple and fast: it is simple because of the small number of instructions and the simple addressing modes, and it is hardwired, rather than microprogrammed, because hardwired control is faster.
ADVANTAGES OF RISC

VLSI Realization
- Control area is considerably reduced → RISC chips allow a large number of registers on the chip
- Enhancement of performance and HLL support
- Higher regularization factor and lower VLSI design cost

Computing Speed
- Simpler, smaller control unit → faster
- Simpler instruction set, addressing modes and instruction format → faster decoding
- Register operations are faster than memory operations
- Register windows enhance the overall speed of execution
- Identical instruction length and one-cycle instruction execution → suitable for pipelining → faster

Design Costs and Reliability
- Shorter time to design → reduces the overall design cost and the risk that the end product will be obsolete by the time the design is completed
- Simpler, smaller control unit → higher reliability
- Simple instruction format (of fixed length)
→ ease of virtual memory management

High-Level Language Support
- A single choice of instruction → shorter, simpler compiler
- A large number of CPU registers → more efficient code
- Register windows → direct support of HLL
- Reduced burden on the compiler writer
RISC VS CISC
The CISC Approach
With CISC, the entire task of multiplying two numbers can be completed with one instruction:

MULT 2:3, 5:2
One of the primary advantages of this system is that the compiler has to do very little work to translate a high-level language statement into assembly. Because the length of the code is relatively short, very little RAM is required to store instructions. The emphasis is put on building complex instructions directly into the hardware.
The RISC Approach
In order to perform the exact series of steps described in the CISC approach, a programmer would need to code four lines of assembly:

LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A

At first, this may seem like a much less efficient way of completing the operation. Because there are more lines of code, more RAM is needed to store the assembly-level instructions. The compiler must also perform more work to convert a high-level language statement into code of this form.
RISC vs CISC

  CISC                                       RISC
  Emphasis on hardware                       Emphasis on software
  Transistors used for storing               Spends more transistors
  complex instructions                       on memory registers
  Includes multi-clock                       Single-clock, reduced
  complex instructions                       instructions only
  Memory-to-memory: "LOAD" and "STORE"       Register-to-register: "LOAD" and "STORE"
  incorporated in instructions               are independent instructions
  Small code sizes                           Large code sizes
  High cycles per second                     Low cycles per second
Summary:
 The instruction format is composed of the opcode field, address field, and mode field.
 The different types of address instructions used are three-address, two-address, one-address and zero-address.
 RISC and CISC: introduction, advantages, and criticisms.
 RISC vs. CISC.
Important Questions:
Q1. Explain the different addressing formats in detail with examples.
Q2. Explain RISC and CISC with their advantages and criticisms.
Q3. Numerical
Lecture 9:
Addressing modes
  Implied mode
  Immediate mode
  Register mode
  Register indirect mode
  Autoincrement or autodecrement mode
  Direct addressing mode
  Indirect addressing mode
  Relative addressing mode
In the last lecture we studied the instruction formats; now we study how instructions use the different types of addressing modes.
Addressing Modes
* An addressing mode specifies a rule for interpreting or modifying the address field of the instruction (before the operand is actually referenced)
* A variety of addressing modes exist:
  - to give programming flexibility to the user
  - to use the bits in the address field of the instruction efficiently
In simple words, an addressing mode is the way operands (data) are fetched from memory.
TYPES OF ADDRESSING MODES

Implied Mode
: The address of the operand is specified implicitly in the definition of the instruction
- No need to specify an address in the instruction
- EA = AC, or EA = Stack[SP]
- Examples from the BC: CLA, CME, INP
Immediate Mode
: Instead of the address of the operand, the operand itself is specified in the instruction
- No need to specify an address in the instruction
- However, the operand itself needs to be specified
- (−) Sometimes requires more bits than an address
- (+) Fast to acquire an operand
- Useful for initializing registers to a constant value
Register Mode
: The address specified in the instruction is a register address
- The designated operand needs to be in a register
- (+) Shorter address than a memory address, saving address bits in the instruction
- (+) Faster to acquire an operand than with memory addressing
- EA = IR(R)   (IR(R): register field of IR)
Register Indirect Mode
: The instruction specifies a register which contains the memory address of the operand
- (+) Saves instruction bits, since a register address is shorter than a memory address
- (−) Slower to acquire an operand than either register addressing or memory addressing
- EA = [IR(R)]   ([x]: content of x)

Autoincrement or Autodecrement Mode
- Similar to register indirect mode, except that when the address in the register is used to access memory, the value in the register is automatically incremented or decremented by 1

Direct Address Mode
: The instruction specifies a memory address which can be used directly to access memory
- (+) Faster than the other memory addressing modes
- (−) Too many bits are needed to specify the address for a large physical memory space
- EA = IR(addr)   (IR(addr): address field of IR)
- E.g., the address field in a branch-type instruction
Indirect Addressing Mode
: The address field of the instruction specifies the address of a memory location that contains the address of the operand
- (−) Slow to acquire an operand because of the additional memory access
- EA = M[IR(address)]
Relative Addressing Modes
: The address field of the instruction specifies part of the address (an abbreviated address) which is used along with a designated register to calculate the address of the operand
--> Effective address = address part of the instruction + content of a special register
- (+) A large physical memory can be accessed with a small number of address bits
- EA = f(IR(address), R); R is sometimes implied
--> Typically EA = IR(address) + R
- Three different relative addressing modes, depending on R:
  * PC-relative addressing mode (R = PC)
  * Indexed addressing mode (R = IX, where IX is an index register)
  * Base-register addressing mode (R = BAR, a base address register)
* Indexed addressing mode vs. base-register addressing mode:
  - IR(address) (address field of the instruction): base address vs. displacement
  - R (index/base register): displacement vs. base address
  - Difference: the way they are used (NOT the way they are computed)
  * Indexed addressing mode: processing many operands in an array using the same instruction
  * Base-register addressing mode: facilitates the relocation of programs in memory in multiprogramming systems
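The effective-address rules for several of the modes above can be condensed into a single dispatch function. This is a sketch with illustrative field and register values, not a model of any particular machine:

```python
# Effective-address calculation for a few of the addressing modes above.
# The addr_field/reg/pc values in the examples are illustrative.
def effective_address(mode, addr_field=0, reg=0, pc=0, memory=None):
    if mode == "direct":             # EA = IR(addr)
        return addr_field
    if mode == "register_indirect":  # EA = [IR(R)]
        return reg
    if mode == "indirect":           # EA = M[IR(address)]
        return memory[addr_field]
    if mode == "pc_relative":        # EA = IR(address) + PC
        return addr_field + pc
    if mode == "indexed":            # EA = IR(address) + IX
        return addr_field + reg
    raise ValueError("unknown mode: " + mode)

mem = {50: 700}
print(effective_address("direct", addr_field=50))                # 50
print(effective_address("indirect", addr_field=50, memory=mem))  # 700
print(effective_address("pc_relative", addr_field=24, pc=200))   # 224
```

Note that indirect mode is the only one of these that costs an extra memory access, which is exactly the (−) point made in the list above.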
Addressing Modes: Examples
Summary:
 Addressing modes specify a rule for interpreting or modifying the address field of the instruction.
 The different types of addressing modes are: implied mode, immediate mode, register mode, register indirect mode, autoincrement or autodecrement mode, direct mode, indirect mode, and relative addressing mode.
Important Questions:Q1. Explain the addressing modes with suitable examples.
Lecture 10:
Instruction set
 Data transfer instructions
  o Typical data transfer instructions
  o Data transfer instructions with different addressing modes
 Data manipulation instructions
  o Arithmetic instructions
  o Logical and bit manipulation instructions
  o Shift instructions
 Program control instructions
  o Conditional branch instructions
  o Subroutine call and return
DATA TRANSFER INSTRUCTIONS
These are instructions used only for the transfer of data: from register to register, from register to memory, and between other memory components. There is no manipulation of the data values.
These instructions make no use of the more complex addressing modes; data is transferred directly between the various registers and memory components.
For example, load and store are used to transfer data to and from the accumulator.
Typical Data Transfer Instructions

  Name       Mnemonic
  Load       LD
  Store      ST
  Move       MOV
  Exchange   XCH
  Input      IN
  Output     OUT
  Push       PUSH
  Pop        POP

Table 3.1
Arithmetic Instructions: These are instructions used for arithmetic calculations such as addition, subtraction, increment, etc.
Logical and Bit Manipulation Instructions
These are instructions whose operations are computed on strings of bits. The bits are treated individually, so an operation can be done on a single bit or on a group of bits, ignoring the value as a whole; even the insertion of new bits is possible.
For example:
CLR R1 will make all the bits 0.
COM R1 will invert all the bits.
AND, OR and XOR produce their result on the individual bit pairs of the two operands.
E.g., the AND of 0011 and 1100 is 0000.
AND is also known as the mask instruction: if we have to mask some bits of an operand, we AND those positions with 0s, giving 1s (high) in the other positions.
E.g., suppose we have to mask a register holding 11000110 on its 1st, 3rd and 7th bits. Then we AND it with the value 01011101.
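The masking example above can be checked directly. Counting bit positions from the left as the notes do, the mask 01011101 has 0s in positions 1, 3 and 7, so ANDing clears exactly those bits of 11000110:

```python
# The AND-mask example above, on 8-bit values: the 0s in the mask clear
# bits 1, 3 and 7 (counting from the left), and the 1s pass bits through.
value = 0b11000110
mask  = 0b01011101   # 0s in positions 1, 3 and 7

result = value & mask
print(format(result, "08b"))  # 01000100
```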
CLRC, SETC and COMC work on only one bit of the operand, the carry.
Similarly, EI and DI work on only the one-bit interrupt flip-flop, to enable or disable it.
  Name                     Mnemonic
  Increment                INC
  Decrement                DEC
  Add                      ADD
  Subtract                 SUB
  Multiply                 MUL
  Divide                   DIV
  Add with carry           ADDC
  Subtract with borrow     SUBB
  Negate (2's complement)  NEG

Table 3.3
  Name               Mnemonic
  Clear              CLR
  Complement         COM
  AND                AND
  OR                 OR
  Exclusive-OR       XOR
  Clear carry        CLRC
  Set carry          SETC
  Complement carry   COMC
  Enable interrupt   EI
  Disable interrupt  DI
Shift Instructions: These are instructions which modify the whole value of the operand by shifting its bits to the left or the right.
Say R1 has the value 11001100.
o SHR inserts 0 at the leftmost position.
  Result: 01100110
o SHL inserts 0 at the rightmost position.
  Result: 10011000
o SHRA: the sign bit remains the same; every other bit shifts right accordingly.
  Result: 11100110
o SHLA is the same as SHL, inserting 0 at the end.
  Result: 10011000
o In ROR, all the bits are shifted right and the rightmost one moves to the leftmost position.
  Result: 01100110
o In ROL, all the bits are shifted left and the leftmost one moves to the rightmost position.
  Result: 10011001
o In RORC, suppose we have a carry bit of 0 with register R1. All the bits of the register are shifted right, the value of the carry moves to the leftmost position, and the old rightmost bit moves into the carry.
Table 3.4
  Result: 01100110 with carry 0
o Similarly, in ROLC, all the bits of the register are shifted left, the value of the carry moves to the rightmost position, and the old leftmost bit moves into the carry.
  Result: 10011000 with carry 1
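The shift and rotate instructions above are easy to model on 8-bit values; the sketch below reproduces each result given for R1 = 11001100 (with the carry assumed to start at 0 for RORC and ROLC, as in the RORC example):

```python
# 8-bit models of the shift/rotate instructions above, checked against the
# results the notes give for R1 = 11001100.
W = 8

def shr(x):  return x >> 1                          # 0 into the leftmost bit
def shl(x):  return (x << 1) & 0xFF                 # 0 into the rightmost bit
def shra(x): return (x >> 1) | (x & 0x80)           # keep the sign bit
def ror(x):  return ((x >> 1) | (x << (W - 1))) & 0xFF
def rol(x):  return ((x << 1) | (x >> (W - 1))) & 0xFF
def rorc(x, c):  # rotate right through carry: returns (value, new carry)
    return ((x >> 1) | (c << (W - 1))) & 0xFF, x & 1
def rolc(x, c):  # rotate left through carry: returns (value, new carry)
    return ((x << 1) | c) & 0xFF, (x >> (W - 1)) & 1

r1 = 0b11001100
print(format(shr(r1), "08b"))   # 01100110
print(format(shl(r1), "08b"))   # 10011000
print(format(shra(r1), "08b"))  # 11100110
print(format(ror(r1), "08b"))   # 01100110
print(format(rol(r1), "08b"))   # 10011001
print(rorc(r1, 0))              # (102, 0): 01100110 with carry 0
print(rolc(r1, 0))              # (152, 1): 10011000 with carry 1
```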
PROGRAM CONTROL INSTRUCTIONS:
Before starting with program control instructions, let's study the concept of the PC, i.e. the program counter. The program counter is the register which holds the address of the next instruction to be executed. When we fetch the instruction pointed to by the PC from memory, its value changes to give the address of the next instruction to be fetched. For sequential instructions it simply increments itself; for branching or modular programs it gives the address of the first instruction of the called program. After the execution of the called program, the program counter points back to the instruction following the one from which the subprogram was called. For go-to kinds of instructions, the program counter simply changes its value without keeping any reference to the previous instruction.
  Name                        Mnemonic
  Logical shift right         SHR
  Logical shift left          SHL
  Arithmetic shift right      SHRA
  Arithmetic shift left       SHLA
  Rotate right                ROR
  Rotate left                 ROL
  Rotate right through carry  RORC
  Rotate left through carry   ROLC
[Figure: updating the PC. In-line sequencing adds +1 (the next instruction is fetched from the next adjacent location in memory); otherwise the PC is loaded with an address from another source (current instruction, stack, etc.) for branch, conditional branch, subroutine, etc.]
Table 3.5
Program Control Instructions: These instructions are used to transfer control to other instructions; that is, they are used when the next instruction is to be executed from some other location instead of in sequential order.
The conditions can be:
 Calling a subprogram
 Returning to the main program
 Jumping to some other instruction or location
 Skipping instructions, as with break and exit, or when a tested condition is false, and so on
*CMP and TST instructions do not retain the results of their operations (subtraction and AND, respectively); they only set or clear certain flags.
Conditional Branch Instructions: These are instructions in which we test some condition and, depending on the result, either branch or continue sequentially.
  Name                      Mnemonic
  Branch                    BR
  Jump                      JMP
  Skip                      SKP
  Call                      CALL
  Return                    RTN
  Compare (by subtraction)  CMP
  Test (by AND)             TST
Table 3.6
Subroutine Call and Return:
Subroutine call goes by several names: Call Subroutine, Jump to Subroutine, Branch to Subroutine, Branch and Save Return Address.
Two important operations are implied:
* Branch to the beginning of the subroutine — the same as a branch or conditional branch
* Save the return address, to get back to the location in the calling program upon exit from the subroutine
Locations for storing the return address:
 A fixed location in the subroutine (memory)
 A fixed location in memory
  Mnemonic  Branch condition            Tested condition
  BZ        Branch if zero              Z = 1
  BNZ       Branch if not zero          Z = 0
  BC        Branch if carry             C = 1
  BNC       Branch if no carry          C = 0
  BP        Branch if plus              S = 0
  BM        Branch if minus             S = 1
  BV        Branch if overflow          V = 1
  BNV       Branch if no overflow       V = 0

  Unsigned compare conditions (A − B):
  BHI       Branch if higher            A > B
  BHE       Branch if higher or equal   A ≥ B
  BLO       Branch if lower             A < B
  BLOE      Branch if lower or equal    A ≤ B
  BE        Branch if equal             A = B
  BNE       Branch if not equal         A ≠ B

  Signed compare conditions (A − B):
  BGT       Branch if greater than      A > B
  BGE       Branch if greater or equal  A ≥ B
  BLT       Branch if less than         A < B
  BLE       Branch if less or equal     A ≤ B
  BE        Branch if equal             A = B
  BNE       Branch if not equal         A ≠ B
Table 3.7
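The Z, C, S and V bits tested by these branches come from performing A − B, as CMP does. The sketch below computes them for 8-bit operands under one common convention (C as the borrow out of the subtraction); real processors differ in how they encode carry on subtract, so treat the details as illustrative:

```python
# Computing the tested condition bits (Z, C, S, V) for A - B on 8-bit
# operands. C is taken as the borrow; conventions vary between processors.
def flags_after_sub(a, b, width=8):
    mask = (1 << width) - 1
    diff = (a - b) & mask
    z = int(diff == 0)                 # Z: result is zero
    c = int(a < b)                     # C: borrow needed (unsigned compare)
    s = (diff >> (width - 1)) & 1      # S: sign bit of the result
    sa = (a >> (width - 1)) & 1
    sb = (b >> (width - 1)) & 1
    # V: operands had different signs and the result's sign differs from a's
    v = int(sa != sb and s != sa)
    return z, c, s, v

print(flags_after_sub(5, 5))   # (1, 0, 0, 0): Z set, so BE/BZ is taken
print(flags_after_sub(3, 9))   # (0, 1, 1, 0): borrow set, so BLO is taken
```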
 In a processor register
 In a memory stack — the most efficient way
Summary:
 Data transfer instructions are of two types: typical data transfer instructions, and data transfer instructions with different addressing modes.
 Data manipulation instructions are of three types: arithmetic instructions, logical and bit manipulation instructions, and shift instructions.
 Program control instructions can be divided into conditional branch instructions and subroutine call and return instructions.
Important Questions:
Q1. Explain the data transfer instructions.
Q2. Explain the data manipulation instructions.
Q3. Explain the program control instructions with examples.
CALL:  SP ← SP − 1
       M[SP] ← PC
       PC ← EA

RTN:   PC ← M[SP]
       SP ← SP + 1
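The CALL and RTN micro-operations above can be simulated with a dictionary standing in for memory. The starting SP value and the addresses below are illustrative:

```python
# Sketch of the CALL/RTN micro-operations, with a dict as stack memory.
# The initial SP value and the addresses used are illustrative.
stack_mem = {}
sp = 100                     # stack pointer

def call(pc, ea):
    """CALL: push the return address, then jump to the subroutine at ea."""
    global sp
    sp -= 1                  # SP <- SP - 1
    stack_mem[sp] = pc       # M[SP] <- PC  (save return address)
    return ea                # PC <- EA

def rtn():
    """RTN: pop the saved return address back into the PC."""
    global sp
    pc = stack_mem[sp]       # PC <- M[SP]
    sp += 1                  # SP <- SP + 1
    return pc

pc = call(pc=21, ea=500)     # jump to a subroutine at address 500
print(pc)                    # 500
print(rtn())                 # 21: back to the caller
```

Because each CALL pushes and each RTN pops, nested subroutine calls unwind in the correct (LIFO) order automatically — which is why the notes call the memory stack the most efficient place for return addresses.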
Lecture 11:
Program interrupts
MASM
PROGRAM INTERRUPT:
Types of Interrupts:
1. External interrupts: initiated from outside the CPU and memory.
   - I/O device → data transfer request or data transfer complete
   - Timing device → timeout
   - Power failure
   - Operator
2. Internal interrupts (traps): caused by the currently running program.
   - Register or stack overflow
   - Divide by zero
   - Opcode violation
   - Protection violation
3. Software interrupts: both external and internal interrupts are initiated by the computer hardware, whereas software interrupts are initiated by an executing instruction.
   - Supervisor call → switches from user mode to supervisor mode, allowing execution of a certain class of operations which are not allowed in user mode.
MASM:
If you have used a modern word processor such as Microsoft Word, you may have noticed the macros feature, where you can record a series of frequently used actions or commands into a macro. For example, suppose you always need to insert a 2-by-4 table with the titles "Date" and "Time". You can start the macro recorder, create the table as you wish, and then save the macro. The next time you need to create the same kind of table, you just execute the macro. The same applies to a macro assembler: it enables you to record frequently performed actions or a frequently used block of code so that you do not have to re-type it each time.
The Microsoft Macro Assembler (abbreviated MASM) is an x86 high-level assembler for DOS and Microsoft Windows. It is among the most popular x86 assemblers. It supports a wide variety of macro facilities and structured programming idioms, including high-level constructs for looping and procedures. Later versions added the capability of producing
programs for Windows. MASM is one of the few Microsoft development tools that target 16-bit, 32-bit and 64-bit platforms. Earlier versions were MS-DOS applications. Versions 5.1 and 6.0 were OS/2 applications, and later versions were Win32 console applications. Versions 6.1 and 6.11 included Phar Lap's TNT DOS extender so that MASM could run in MS-DOS.
The name MASM originally referred to the Macro Assembler, but over the years it has become synonymous with Microsoft Assembler.
An assembly language translator converts each macro into several machine language instructions.
MASM isn't the fastest assembler around (it's not particularly slow, except in a couple of degenerate cases, but there are faster assemblers available).
Though very powerful, there are a couple of assemblers that, arguably, are more powerful (e.g., TASM and HLA).
MASM is only usable for creating DOS and Windows applications; you cannot effectively use it to create software for other operating systems.
Benefits of MASM
There are some benefits to using MASM today:
 Steve Hutchesson's ("Hutch") MASM32 package provides the support for MASM that Microsoft no longer provides.
 You can download MASM (and MASM32) free from Microsoft and other sites.
 Most Windows assembly language examples on the Internet today use MASM syntax.
 You may download MASM directly from Webster as part of the MASM32 package.
Summary:
Program interrupts can be external, internal or software interrupts.
MASM is the Microsoft Macro Assembler, used for implementing macros.
Important Questions:
Q1. What are program interrupts? Explain the types of program interrupts.
Q2. Explain MASM in detail.
Lecture 12:
CPU architecture types
  o Accumulator
  o Register
  o Stack
  o Memory/Register
Detailed data path of a register-based CPU
In Unit 3 we discussed the instruction set architecture (ISA), which deals with the various types of address instructions, addressing modes and the different types of instructions in various computer architectures.
In this chapter we will discuss the various types of computer organization. In general, most processors or computers are organized in one of three ways:
 Single register (accumulator) organization
  The Basic Computer is a good example
  The accumulator is the only general-purpose register
 Stack organization
  All operations are done using the hardware stack
  For example, an OR instruction will pop the two top elements from the stack, do a logical OR on them, and push the result on the stack
 General register organization
  Used by most modern computer processors
  Any of the registers can be used as the source or destination for computer operations
Accumulator type of organization:
In accumulator-based organizations, one operand is in memory and the other is in the accumulator.
The instructions we can run with accumulator are :
AC ← AC ∧ DR              AND with DR
AC ← AC + DR              Add with DR
AC ← DR                   Transfer from DR
AC(0-7) ← INPR            Transfer from INPR
AC ← AC'                  Complement
AC ← shr AC, AC(15) ← E   Shift right
AC ← shl AC, AC(0) ← E    Shift left
AC ← 0                    Clear
AC ← AC + 1               Increment
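These register-transfer operations can be modeled on 16-bit values. The sketch below assumes the shifted-out bit goes into E (the extension flip-flop), which the listed micro-operations leave implicit:

```python
# A minimal 16-bit model of the accumulator micro-operations listed above.
# The assumption that the shifted-out bit lands in E is ours, not stated
# in the listing.
MASK = 0xFFFF

def ac_and(ac, dr):  return ac & dr                 # AC <- AC AND DR
def ac_add(ac, dr):  return (ac + dr) & MASK        # AC <- AC + DR (mod 2^16)
def ac_com(ac):      return ~ac & MASK              # AC <- AC'
def ac_shr(ac, e):   # AC <- shr AC, AC(15) <- E; returns (new AC, new E)
    return ((ac >> 1) | (e << 15)) & MASK, ac & 1
def ac_shl(ac, e):   # AC <- shl AC, AC(0) <- E; returns (new AC, new E)
    return ((ac << 1) | e) & MASK, (ac >> 15) & 1

ac, e = 0x00FF, 1
print(hex(ac_add(ac, 0x0001)))   # 0x100
print(hex(ac_com(0x0000)))       # 0xffff
ac2, e2 = ac_shr(ac, e)          # E shifts into bit 15
print(hex(ac2), e2)              # 0x807f 1
```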
Circuit required:
Stack Organization:
A stack is:
- a very useful feature for nested subroutines and nested interrupt services
- also efficient for arithmetic expression evaluation
- storage which is accessed in LIFO order
- addressed through a pointer, the SP
- such that only PUSH and POP operations are applicable
Stack organization is of two types:
[Figure: accumulator logic. A 16-bit adder and logic circuit takes inputs from DR (16 bits), INPR (8 bits) and the 16-bit AC itself; control gates generate the LD, INR and CLR signals for the AC register, which is clocked and drives a 16-bit output to the bus.]
REGISTER STACK ORGANIZATION
[Figure: a 64-word register stack (addresses 0 to 63) with PUSH and POP operations. A 6-bit stack pointer SP addresses the top of the stack, DR holds the word being pushed or popped, and the FULL and EMPTY flags record the stack state; items A, B, C occupy the lowest locations.]
/* Initially, SP = 0, EMPTY = 1, FULL = 0 */

PUSH:                               POP:
SP ← SP + 1                         DR ← M[SP]
M[SP] ← DR                          SP ← SP − 1
If (SP = 0) then (FULL ← 1)         If (SP = 0) then (EMPTY ← 1)
EMPTY ← 0                           FULL ← 0
MEMORY STACK ORGANIZATION
Memory with Program, Data, and Stack Segments
A portion of memory is used as a stack with a processor register as a stack pointer
- PUSH: SP ← SP - 1
        M[SP] ← DR
- POP:  DR ← M[SP]
        SP ← SP + 1
Note: Most computers do not provide hardware to check for stack overflow (full stack) or underflow (empty stack); these checks must be done in software.
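A memory stack that grows toward lower addresses, with the software overflow/underflow checks the note mentions, might look like this. The bounds are hypothetical (chosen to echo the segment figure, where the stack sits above the data segment); they are not prescribed by the source.

```python
STACK_TOP = 4001    # initial SP, one above the first stack word (assumed)
STACK_LIMIT = 3000  # lowest address the stack may reach (assumed)

class MemoryStack:
    """Memory stack growing downward, with software limit checks."""
    def __init__(self, memory):
        self.mem = memory
        self.sp = STACK_TOP

    def push(self, dr):
        if self.sp - 1 < STACK_LIMIT:      # software overflow check
            raise OverflowError("stack overflow")
        self.sp -= 1                       # SP <- SP - 1
        self.mem[self.sp] = dr             # M[SP] <- DR

    def pop(self):
        if self.sp >= STACK_TOP:           # software underflow check
            raise IndexError("stack underflow")
        dr = self.mem[self.sp]             # DR <- M[SP]
        self.sp += 1                       # SP <- SP + 1
        return dr
```

The hardware itself only performs the two register transfers per operation; the two `if` tests are exactly the checks that, per the note, must be added in software.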
Register type of organization: In this organization we take the help of a set of general-purpose registers, say R1 to R7, for the transfer and manipulation of data.

Detailed data path of a typical register-based CPU:
Fig: Memory with program, data, and stack segments (the memory stack organization described above). The program (instructions) starts at address 1000 and is addressed through PC; the data (operands) start at address 3000 and are addressed through AR; the stack occupies addresses 3997-4001 and is addressed through SP, growing toward lower addresses.
To avoid accessing memory directly (which is very time consuming and thus costly), we prefer the register organization, as it proves to be a more efficient and time-saving arrangement.
In this design we use 7 registers. Two multiplexers and a decoder decide which registers are to be used as operand sources and which register is to be used as the destination for the result. MUX 1 selects the first operand register according to the value of SELS1 (selector for source 1); similarly, SELS2 drives MUX 2 to select the second operand register.
These two operands reach the ALU through the S1 bus and the S2 bus. OPR denotes the type of operation to be performed, and the computation is carried out by the ALU. The result is then stored back into one of the 7 registers; the decoder, driven by SELD, decides which register receives the result.
Fig: General register organization. Registers R1-R7 feed two multiplexers; SELS1 and SELS2 select the source operands onto the S1 bus and S2 bus, which drive the ALU (operation selected by OPR). A 3 x 8 decoder driven by SELD asserts one of 7 load lines so that the clocked result returns to the destination register; the ALU output is also available as the output/result.
Lecture 13:
Address Sequencing / Microinstruction Sequencing
Implementation of control unit
Address Sequencing/Microinstruction Sequencing: Microinstructions are stored in control memory in groups, with each group specifying a routine. The hardware that controls the address sequencing of the control memory must be capable of sequencing the microinstructions within a routine and of branching from one routine to another.
Steps: An initial address is loaded into the CAR when power is turned on; this is usually the address of the first microinstruction, which activates the instruction fetch routine. This routine may be sequenced by incrementing. At the end of the fetch routine the instruction is in the IR of the computer. Next, the control memory computes the effective address of the operand. The next step is the execution of the instruction fetched from memory. The transformation from the instruction code bits to an address in control memory where the routine is located is referred to as a mapping process.
Fig: Address sequencing for control memory. Multiplexers select the next value of the control address register (CAR) from four sources: the incrementer (CAR + 1), a branch address taken from the microinstruction, the subroutine register (SBR) used for returns, and the mapping logic driven by the instruction code. The branch logic examines a selected status bit to produce the MUX select, and the control memory (ROM) output supplies the microoperations.
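The next-address selection can be sketched as a toy model of the sequencer multiplexer. The four selection codes below correspond to the four capabilities discussed in this lecture; their names are my own shorthand, not fixed terminology.

```python
def next_car(car, sbr, map_addr, branch_addr, sel, status):
    """Choose the next control-address-register (CAR) value.

    sel: 'INC'  increment CAR,
         'JMP'  conditional branch on the selected status bit,
         'MAP'  mapping from the instruction opcode to a routine,
         'RET'  return from a microprogram subroutine via SBR.
    """
    if sel == "INC":
        return car + 1                              # sequence within a routine
    if sel == "JMP":
        return branch_addr if status else car + 1   # branch logic decision
    if sel == "MAP":
        return map_addr                             # opcode -> routine address
    if sel == "RET":
        return sbr                                  # subroutine return
    raise ValueError("unknown select: " + sel)
```

An unconditional branch (such as the return to the fetch routine after execution) is just the `'JMP'` case with the status input forced to 1.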
At the completion of the execution of the instruction, control must return to the fetch routine by executing an unconditional branch microinstruction to the first address of the fetch routine.

Sequencing capabilities required in a control storage:
- Incrementing of the control address register
- Unconditional and conditional branches
- A mapping process from the bits of the machine instruction to an address for control memory
- A facility for subroutine call and return
Design of control unit: After obtaining the microoperations we have to execute them, but before that we need to decode them.
Fig: Decoding of microoperation fields.
Because each field represents 8 microoperations with 3 bits, and we have 3 such fields, we decode the microoperation field bits with three 3 x 8 decoders. After decoding, each microoperation signal is given to its particular circuit: the data-manipulation microoperations such as AND, ADD, SUB and so on are given to the ALU.
In the figure, the F1, F2, and F3 microoperation fields each feed a 3 x 8 decoder. Decoder outputs such as AND and ADD select the function of the arithmetic logic and shift unit, which takes its operands from AC and DR and loads its result into AC. The DRTAR and PCTAR outputs drive a 2 x 1 multiplexer that selects either DR(0-10) or PC as the value loaded into AR on the clock.
The corresponding results are moved to AC; the ALU is provided data from AC and DR. For data-transfer microoperations such as PCTAR or DRTAR we need simply to transfer a value. Because we have two options for the data transferred into AR, we take the help of a 2 x 1 MUX, with its select line attached to the DRTAR signal: if DRTAR is high, the MUX chooses DR as the source for AR; otherwise PC's value is moved to AR. The actual data movement is enabled through AR's load input, which is asserted when either of the two signals is high.
The clock signal is provided for the synchronization of the microoperations.
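The three-field decoding step can be sketched in a few lines. The field encodings below are illustrative assignments (the source figure names only a handful of operations such as AND, ADD, DRTAC, DRTAR, and PCTAR; their placement in particular fields here is an assumption).

```python
# Illustrative field encodings: each 3-bit field selects one of 8
# microoperations, with code 0 reserved for "no operation".
F1 = {0: None, 1: "ADD", 2: "CLRAC", 3: "INCAC", 4: "DRTAC",
      5: "DRTAR", 6: "PCTAR", 7: "WRITE"}
F2 = {0: None, 1: "SUB", 2: "OR", 3: "AND", 4: "READ",
      5: "ACTDR", 6: "INCDR", 7: "PCTDR"}
F3 = {0: None, 1: "XOR", 2: "COM", 3: "SHL", 4: "SHR",
      5: "INCPC", 6: "ARTPC", 7: None}

def decode(microword):
    """Split a 9-bit F1|F2|F3 microword into its active microoperations,
    mimicking the three 3 x 8 decoders in the figure."""
    f1 = (microword >> 6) & 0b111
    f2 = (microword >> 3) & 0b111
    f3 = microword & 0b111
    return [op for op in (F1[f1], F2[f2], F3[f3]) if op is not None]
```

Because the three fields are decoded independently, one microword can fire up to three microoperations in the same clock cycle, one per field.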
Lecture 14:
Fetch and decode cycle
Control Unit
Fetch and Decode
T0: AR ← PC                          (S2S1S0 = 010, T0 = 1)
T1: IR ← M[AR], PC ← PC + 1          (S2S1S0 = 111, T1 = 1)
T2: D0, ..., D7 ← Decode IR(12-14), AR ← IR(0-11), I ← IR(15)
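These three register transfers can be traced in a toy simulator, using the Basic Computer word layout (16-bit instruction: a 12-bit address in bits 0-11, a 3-bit opcode in bits 12-14, and the indirect bit I in bit 15). The function name and memory representation are mine.

```python
def fetch_decode(mem, pc):
    """Toy fetch/decode for a 16-bit Basic-Computer-style word."""
    ar = pc                        # T0: AR <- PC
    ir = mem[ar]                   # T1: IR <- M[AR]
    pc = pc + 1                    #     PC <- PC + 1
    opcode = (ir >> 12) & 0b111    # T2: decode IR(12-14)
    ar = ir & 0x0FFF               #     AR <- IR(0-11)
    i = (ir >> 15) & 1             #     I <- IR(15)
    return pc, ar, opcode, i

# Hypothetical memory word at address 100: I = 1, opcode = 010, address = 42
mem = {100: 0b1_010_000000101010}
fetch_decode(mem, 100)  # -> (101, 42, 2, 1)
```

Note that AR is used twice with two different meanings: first to address the instruction being fetched, then (after T2) to hold the operand address extracted from IR.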
Fig: Circuit for the fetch phase. Timing signals T0 and T1 gate the bus select lines S2, S1, S0 of the common bus. At T0, PC (bus select 2) is placed on the bus and loaded into AR through its LD input. At T1, the memory unit is read (bus select 7) onto the bus and loaded into IR through its LD input, while PC is incremented through its INR input. All register transfers are synchronized by the clock.
Control Unit

The control unit (CU) of a processor translates machine instructions into the control signals for the microoperations that implement them. Control units are implemented in one of two ways:

Hardwired control: the CU is made up of sequential and combinational circuits that generate the control signals.

Microprogrammed control: a control memory on the processor contains microprograms that activate the necessary control signals.

We will consider a hardwired implementation of the control unit for the Basic Computer.
Lecture 15:
Memory hierarchy and its organization
Need of memory hierarchy
Locality of reference principle
In the last units we studied the various instructions, data, and registers associated with our computer organization. Let us now come to the microarchitecture of the computer, in which an important part is memory. Let us study what memory is and what types of memory are available.
The memory unit is a very essential component in a computer, used for storing programs and data. We use main memory for running programs, plus additional capacity for storage. We have various levels of memory units arranged in a memory hierarchy.
MEMORY HIERARCHY
The purpose of the memory hierarchy is to obtain the highest possible access speed while minimizing the total cost of the memory system.
The various components are:
Main memory: the memory unit that communicates directly with the CPU. The programs and data currently needed by the processor reside in main memory.
Auxiliary memory: made up of devices that provide backup storage. Examples: magnetic tapes, magnetic disks, etc.
Cache memory: the memory which lies between main memory and the CPU.
In the figure, magnetic tapes and magnetic disks connect through the I/O processor to main memory; the CPU communicates with main memory both directly and through the cache memory.
Fig: Memory Hierarchy
In this hierarchy we have magnetic tapes at the lowest level, which means they are very slow and very cheap. Moving up to higher levels, we reach main memory, where we get increased speed but at an increased cost per bit.
Thus we can conclude that as we go toward the upper levels:
- speed increases
- cost per bit increases
- access time decreases
- size (capacity) decreases
Many operating systems are designed to enable the CPU to process a number of independent programs concurrently. This concept is called multiprogramming. It is made possible by the existence of two or more programs residing in different parts of the memory hierarchy at the same time, for example one program using the CPU while another performs an I/O transfer.
The locality of reference, also known as the locality principle, is the phenomenon that the collection of data locations referenced in a short period of time by a running computer often consists of relatively well-predictable clusters.
Analysis of a large number of typical programs has shown that the references to memory at any given interval of time tend to be confined within a few localized areas in memory. This phenomenon is known as locality of reference.
Fig: The memory hierarchy, from fastest to slowest: registers, cache, main memory, magnetic disk, magnetic tape.
Important special cases of locality are temporal, spatial, equidistant, and branch locality.
Temporal locality: if at one point in time a particular memory location is referenced, then it is likely that the same location will be referenced again in the near future. There is a temporal proximity between adjacent references to the same memory location. In this case it is common to store a copy of the referenced data in a special memory storage which can be accessed faster. Temporal locality is a very special case of spatial locality, namely when the prospective location is identical to the present location.
Spatial locality: if a particular memory location is referenced at a particular time, then it is likely that nearby memory locations will be referenced in the near future. There is a spatial proximity between the memory locations referenced at almost the same time. In this case it is common to try to guess how big a neighbourhood around the current reference is worth preparing for faster access.
Equidistant locality: it is halfway between the spatial locality and the branchlocality. Consider a loop accessing locations in an equidistant pattern, i.e. the pathin the spatial-temporal coordinate space is a dotted line. In this case, a simplelinear function can predict which location will be accessed in the near future.
Branch locality: there are only a few possible alternatives for the prospective part of the path in the spatial-temporal coordinate space. This is the case when an instruction loop has a simple structure, or when the possible outcome of a small system of conditional branch instructions is restricted to a small set of possibilities. Branch locality is typically not a spatial locality, since the few possibilities can be located far away from each other.

Sequential locality: in a typical program, the execution of instructions follows a sequential order unless branch instructions create out-of-order execution. This also involves spatial locality, since sequential instructions are stored near each other.
To benefit from the very frequently occurring temporal and spatial kinds of locality, most information storage systems are hierarchical. Equidistant locality is usually supported by the diverse nontrivial increment instructions of processors. For branch locality, contemporary processors have sophisticated branch predictors, and on the basis of this prediction the memory manager of the processor tries to collect and preprocess the data of the plausible alternatives.
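Spatial locality is easy to observe in code: traversing a two-dimensional array row by row touches adjacent memory, while column-by-column traversal jumps across it. The following is only a sketch of the access patterns (N and the matrix contents are arbitrary choices):

```python
N = 500  # size of a hypothetical N x N matrix

matrix = [[i * N + j for j in range(N)] for i in range(N)]

def row_major_sum():
    total = 0
    for i in range(N):
        for j in range(N):
            total += matrix[i][j]   # consecutive elements of the same row
    return total

def column_major_sum():
    total = 0
    for j in range(N):
        for i in range(N):
            total += matrix[i][j]   # jumps to a different row on each access
    return total
```

Both functions compute the same sum, but in languages with flat array layouts the row-major order typically runs noticeably faster, because successive accesses fall within the same cache line. In Python the list-of-lists layout blunts the effect, so the point here is the access pattern rather than any particular timing.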
Reasons for locality
There are several reasons for locality. These reasons are either goals to achieve or circumstances to accept, depending on the aspect. The reasons below are not disjoint; in fact, the list goes from the most general case to special cases.
Predictability: locality is merely one type of predictable behavior in computer systems. Luckily, many practical problems are decidable, and hence the corresponding program can behave predictably if it is well written.

Structure of the program: locality often occurs because of the way in which computer programs are created for handling decidable problems. Generally, related data is stored in nearby locations in storage. One common pattern in computing involves processing several items one at a time; if a lot of processing is done on each item, that single item will be accessed more than once, leading to temporal locality of reference. Furthermore, moving to the next item implies that the next item will be read, hence spatial locality of reference, since memory locations are typically read in batches.
Linear data structures: locality often occurs because code contains loops that tend to reference arrays or other data structures by indices. Sequential locality, a special case of spatial locality, occurs when relevant data elements are arranged and accessed linearly. For example, the simple traversal of the elements of a one-dimensional array, from the base address to the highest element, exploits the sequential locality of the array in memory. The more general equidistant locality occurs when the linear traversal spans a longer area of adjacent data structures of identical structure and size, and only the mutually corresponding elements of the structures are accessed rather than the whole structures. This is the case when a matrix is represented as a sequence of rows and the requirement is to access a single column of the matrix.
Use of locality in general
If, most of the time, a substantial portion of the references aggregate into clusters, and if the shape of this system of clusters can be well predicted, then it can be used for speed optimization. There are several ways to benefit from locality. The common optimization techniques are:

to increase the locality of references. This is usually achieved on the software side.

to exploit the locality of references. This is usually achieved on the hardware side. Temporal and spatial locality can be capitalized on by hierarchical storage hardware. Equidistant locality can be used by the appropriately specialized instructions of the processors; this possibility is not only the responsibility of the hardware but of the software as well, whose structure must be suitable for compiling a binary program that calls the specialized instructions in question. Branch locality is a more elaborate possibility, hence more development effort is needed, but there is a much larger reserve for future exploration in this kind of locality than in all the remaining ones.
Lecture 16:
Main Memory
o RAM chip organization
o ROM chip organization
Expansion of main memory
o Memory connections to CPU
o Memory address map
Till now we have discussed the memory hierarchy and the comparisons among its levels. Let us now take each level in detail.
Main Memory: Main memory is a large (with respect to cache memory) and fast (with respect to magnetic tapes, disks, etc.) memory used to store the programs and data during computer operation. The I/O processor manages data transfers between auxiliary memory and main memory.
Main memory is available in 2 types: RAM and ROM. The principal technology used for main memory is based on semiconductor integrated circuits.
RAM: This is the part of main memory where we can both read and write data.
Typical RAM chip:
CS1 and CS2 are used to enable or disable a particular RAM chip.
The corresponding truth table is:
Fig: Typical RAM chip. A 128 x 8 RAM with control inputs CS1 (chip select 1), CS2 (chip select 2), RD (read), WR (write), a 7-bit address input AD7, and a bidirectional 8-bit data bus.
The RAM is enabled when CS1 = 1 and CS2 = 0; for any other combination of the chip selects we have the inhibit, or high-impedance, state. When the chip is enabled, the RD pin indicates that the RAM is being used for a read operation, and similarly the WR pin shows that a write operation is being performed on the RAM. If both RD and WR are 0, no operation takes place and the data bus remains in the high-impedance state. If both WR and RD are high, we choose the read operation, since performing both at once would leave the data inconsistent.
Since we have a 128 x 8 RAM, we have 128 words, each of length 8 bits. Thus we need an 8-bit data bus to transfer the data, and it is bidirectional. To access 128 = 2^7 words we need 7 address bits.
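The address-width calculation generalizes: a memory of W words needs ceil(log2(W)) address lines. A quick sketch (the helper name is mine):

```python
import math

def address_bits(words):
    """Number of address lines needed to select one of `words` locations."""
    return max(1, math.ceil(math.log2(words)))

address_bits(128)    # 7 address lines, as for the 128 x 8 RAM chip above
address_bits(1024)   # 10 lines for a 1K-word memory
```

The same rule is used later when building a memory address map: each chip consumes as many low-order address bits as its word count requires, and the remaining high-order bits are decoded into chip selects.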
Integrated c