CAO Notes by Girdhar Gopal Gautam

Uploaded by girdhar-gopal-gautam, 04-Apr-2018



Gopal Sharma
MVN Institute of Technology & Management
Branch: ECE (5th Sem)
Session: 2012

Computer Architecture and Organization

Mrs. Rama Pathak
Submitted by: Girdhar Gopal Gautam ([email protected])


Lecture 1:

Digital Logic, Boolean Algebra, Logic Gates, Truth Tables

Here we deal with the basic digital circuits of our computer: what hardware components we are using, how these hardware components are related to and interact with each other, and how this hardware is accessed or seen by the user.

This gives rise to the classification of our computer study into:

Computer Design: This is concerned with the hardware design of the computer. Here the designer decides on the specifications of the computer system.

Computer Organization: This is concerned with the way the hardware components operate and the way they are connected to form the computer system.

Computer Architecture: This is concerned with the structure and behavior of the computer as seen by the user. It includes the information formats, the instruction set and the addressing modes for accessing memory.

    In our course we will be dealing with computer architecture and organization.

Before starting with computer architecture and organization, let us discuss the components which make up the hardware, or the organization, of the computer: the digital circuits handled by a digital computer.

Digital Computers: imply that the computer deals with digital information.
Digital information: is represented by binary digits (0 and 1).
Gates: blocks of hardware that produce 1 or 0 when input logic requirements are satisfied.


Functions of gates can be described by: Truth Table, Boolean Function, Karnaugh Map.

Fig 1.1: A gate, with a binary digital input signal and a binary digital output signal.


    Boolean algebra

Algebra with binary (Boolean) variables and logic operations. Boolean algebra is useful in the analysis and synthesis of digital logic circuits:

- Input and output signals can be represented by Boolean variables, and
- the function of the digital logic circuit can be represented by logic operations, i.e., Boolean function(s).
- From a Boolean function, a logic diagram can be constructed using AND, OR, and NOT (inverter).

Note: We can have many circuits for the same Boolean expression.

Truth Table
The most elementary specification of the function of a digital logic circuit is the truth table.


A table that describes the output values for all combinations of the input values; the input combinations are called MINTERMS.

n input variables give 2^n minterms.
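As an illustration, the 2^n minterms of a Boolean function can be enumerated in Python. This is a sketch: the helper name truth_table and the sample function F are illustrative, not from the notes.

```python
from itertools import product

def truth_table(f, n):
    """Return the truth table of an n-input Boolean function f.
    Each input combination is one minterm; n variables give 2**n rows."""
    rows = []
    for bits in product((0, 1), repeat=n):   # all 2**n minterms, in order
        rows.append((bits, f(*bits)))
    return rows

# Sample Boolean function: F(x, y, z) = x AND (y OR NOT z)
f = lambda x, y, z: x & (y | (1 - z))

for bits, out in truth_table(f, 3):
    print(*bits, '|', out)
```

The loop produces 2^3 = 8 rows, one per minterm, exactly as the definition above states.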

Summary:
o Computer Design: what hardware components we need.
o Computer Organization: how these hardware components interact.
o Computer Architecture: how these are connected with the user.
o Logic Gates: blocks of hardware giving a result of 0 or 1. There are 8 basic logic gates, out of which 3 (AND, OR and NOT) are fundamental.
o Boolean Algebra: the representation of input and output signals in the form of expressions.
o Truth Table: a table that describes the output values for all the combinations of the input values.

    Lecture 2:

Combinational Logic Blocks: Multiplexers, Adders, Encoders, Decoders

Combinational circuits are circuits without memory, where the outputs are obtained from the inputs only. An n-input m-output combinational circuit has the form shown below.

A multiplexer is a combinational circuit which selects one of many inputs depending on the selection inputs. The number of selection inputs depends on the number of inputs in the manner 2^x = y: if y is the number of inputs, then x is the number of selection lines. Thus if we have 4 input lines we use 2 selection lines, as 2^2 = 4, and so on. This is called a 4:1 (or 4x1) multiplexer.

This is shown in the diagram:

Fig: A combinational circuit block with n inputs and m outputs.
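The 2^x = y selection rule above can be sketched behaviourally in Python for the 4:1 case (the function name mux4 is illustrative):

```python
def mux4(inputs, s1, s0):
    """4-to-1 multiplexer: two select lines (s1, s0) address one of
    the four inputs I0..I3, since 2**2 = 4."""
    return inputs[(s1 << 1) | s0]   # select lines form a 2-bit index

data = [10, 20, 30, 40]   # I0..I3
print(mux4(data, 0, 0))   # S1 S0 = 0 0 selects I0 -> 10
print(mux4(data, 1, 0))   # S1 S0 = 1 0 selects I2 -> 30
```

A real multiplexer does this with AND-OR gating; the list index here plays the role of the select decoding.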


Adders: Half Adder, Full Adder

Half Adder: adds 2 bits and gives out carry and sum as result.

4-to-1 Multiplexer (inputs I0-I3, select lines S1 S0, output Y):

Select S1 S0 | Output Y
0 0 | I0
0 1 | I1
1 0 | I2
1 1 | I3


    Full Adder: Adds 2 bits with carry in and gives carry out and sum as result.

(Half adder, truth table and logic, from the previous page:)

x y | c s
0 0 | 0 0
0 1 | 0 1
1 0 | 0 1
1 1 | 1 0

c = xy
s = x'y + xy' = x XOR y

(Full adder, truth table and logic:)

x y Cin | Cout S
0 0 0 | 0 0
0 0 1 | 0 1
0 1 0 | 0 1
0 1 1 | 1 0
1 0 0 | 0 1
1 0 1 | 1 0
1 1 0 | 1 0
1 1 1 | 1 1

Cout = xy + xCin + yCin = xy + (x XOR y)Cin
S = x'y'Cin + x'yCin' + xy'Cin' + xyCin = x XOR y XOR Cin
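The two adders can be sketched in Python, with the full adder built from two half adders as in the logic above (function names are illustrative):

```python
def half_adder(x, y):
    """Adds 2 bits: sum s = x XOR y, carry c = x AND y."""
    return x ^ y, x & y

def full_adder(x, y, cin):
    """Adds 2 bits plus carry-in, built from two half adders."""
    s1, c1 = half_adder(x, y)       # s1 = x^y, c1 = xy
    s, c2 = half_adder(s1, cin)     # s = x^y^cin, c2 = (x^y)cin
    return s, c1 | c2               # cout = xy + (x^y)cin

# Check every row of the truth tables against ordinary addition
for x in (0, 1):
    for y in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(x, y, cin)
            assert (cout << 1) | s == x + y + cin
```

The exhaustive loop is exactly the 8-row full adder truth table.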


Decoder: a decoder takes n inputs and gives 2^n outputs. That is, we get 8 outputs for 3 inputs, and this is called a 3x8 decoder. We also have 2x4 decoders, 4x16 decoders and so on.

    We are implementing a decoder with the help of NAND gates.

    Using NAND gates, it becomes more economical.
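A NAND-built decoder has active-low outputs: the selected line goes to 0 while all others stay at 1. A behavioural Python sketch (it models the function, not the gate-by-gate wiring; the name is illustrative):

```python
def decoder3to8_nand(a2, a1, a0):
    """Behavioural model of a 3-to-8 decoder built from NAND gates.
    NAND outputs are active-low: the addressed line is 0, the rest 1."""
    sel = (a2 << 2) | (a1 << 1) | a0   # which of the 2**3 lines is addressed
    return [0 if i == sel else 1 for i in range(8)]

print(decoder3to8_nand(1, 0, 1))  # line 5 goes low
```

With AND gates instead of NAND, the same selection logic would give active-high outputs.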


Summary:
Combinational circuits: circuits where the outputs are obtained from the inputs only. Various combinational circuits are:
o Multiplexers: the number of selection inputs depends on the number of inputs in the manner 2^x = y.
o Half Adder: adds 2 bits and gives the result as carry and sum.
o Full Adder: adds 2 bits with carry in and gives the result as carry out and sum.
o Encoder: takes 2^n inputs and gives n outputs.
o Decoder: takes n inputs and gives 2^n outputs.

Important Questions derived from this:
Q1. What is the difference between a multiplexer and a decoder?
Q2. Draw a 4x1 decoder with the help of AND gates.


    Lecture 3:

Sequential Logic Blocks: Latches, Flip Flops, Registers, Counters

Sequential logic blocks: logic blocks whose output logic value depends on the input values and the state of the block.

Here we have the concept of memory, which was not applicable for combinational circuits.

    The various sequential blocks or circuits are:

Latches: A latch is a kind of bistable multivibrator, an electronic circuit which has two stable states and thereby can store one bit of information. Today the word is mainly used for simple transparent storage elements, while slightly more advanced non-transparent (or clocked) devices are described as flip-flops. Informally, as this distinction is quite new, the two words are sometimes used interchangeably.

    S-R latch:

To overcome the restricted combination, one can add gates to the inputs that would convert (S,R) = (1,1) into one of the non-restricted combinations. That can be:
Q = 1, i.e. (1,0), referred to as an S-latch
Q = 0, i.e. (0,1), referred to as an R-latch


Keep state, i.e. (0,0), referred to as an E-latch

D-LATCH
Forbidden input values are forced not to occur by using an inverter between the inputs.

    Flip Flops:

    D flip flop:

D flip flop (inputs D (data) and E (enable); outputs Q and Q'):

Characteristic table:
D | Q(t+1)
0 | 0
1 | 1


If you compare the D flip flop and the D latch, the only difference you find in the circuit is that latches do not have clocks and flip flops have them.

So you can note down the differences between latches and flip flops as:
- A latch is a level triggered device, whereas a flip flop is an edge triggered device.
- The output of a latch changes independent of a clock signal, whereas the output of a flip flop changes at specific times determined by a clocking signal.
- Latches do not require clock pulses; flip flops are clocked devices.
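The level-triggered vs. edge-triggered distinction can be sketched behaviourally in Python (class names are illustrative):

```python
class DLatch:
    """Level-triggered: output follows D for as long as enable E is high."""
    def __init__(self):
        self.q = 0
    def update(self, d, e):
        if e:                 # transparent while E = 1
            self.q = d
        return self.q

class DFlipFlop:
    """Edge-triggered: output samples D only on a rising clock edge."""
    def __init__(self):
        self.q = 0
        self._prev_clk = 0
    def update(self, d, clk):
        if clk and not self._prev_clk:   # rising edge detected
            self.q = d
        self._prev_clk = clk
        return self.q

latch, ff = DLatch(), DFlipFlop()
latch.update(1, 1); latch.update(0, 1)   # q tracks every change of D while E=1
ff.update(1, 1); ff.update(0, 1)         # q changed only on the first rising edge
```

With the clock held high, the latch keeps following D but the flip flop ignores it until the next rising edge.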

Characteristics:
- Edge-triggered flip flops (positive): state transition occurs at the rising edge or falling edge of the clock pulse; they respond to the input only at this time.
- Latches: respond to the input during the whole period in which they are enabled.


    Counters: A counter is a device which stores (and sometimes displays) the number of times a particular event or process has occurred, often in relationship to a clock signal.

    4 bit binary counter:

    RING COUNTER:

In a ring counter the output of the 1st flip flop is moved to the input of the 2nd flip flop, and the output of the last flip flop is fed back to the first.

Fig: 4-bit binary counter built from JK flip flops (inputs: Clock, Counter Enable; outputs A0 A1 A2 A3 and Output Carry).


JOHNSON COUNTER:

In a Johnson counter the output of the last flip flop is inverted and given to the first flip flop.
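The two feedback arrangements can be sketched as one-clock step functions in Python (names are illustrative):

```python
def ring_counter_step(state):
    """Ring counter: each flip flop feeds the next; the last output
    wraps around, uninverted, to the first flip flop."""
    return [state[-1]] + state[:-1]

def johnson_counter_step(state):
    """Johnson counter: the last output is INVERTED before it feeds
    the first flip flop."""
    return [1 - state[-1]] + state[:-1]

state = [1, 0, 0, 0]           # ring counter circulating a single 1
for _ in range(4):
    state = ring_counter_step(state)
# state is back to [1, 0, 0, 0]: an n-stage ring counter has period n

j = [0, 0, 0, 0]               # an n-stage Johnson counter has period 2n
for _ in range(4):
    j = johnson_counter_step(j)
# j is now [1, 1, 1, 1]; four more clocks return it to all zeros
```

The inversion in the feedback path is what doubles the Johnson counter's period relative to the ring counter.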

Registers: a register is a group of flip-flops operating as a coherent unit to hold data. This is different from a counter, which is a group of flip-flops operating to generate new data by tabulating it.


Shift register: A register that is capable of shifting data one bit at a time is called a shift register. The logical configuration of a serial shift register consists of a chain of flip-flops connected in cascade, with the output of one flip-flop connected to the input of its neighbor. The operation of the shift register is synchronous; thus each flip-flop is connected to a common clock. Using D flip-flops forms the simplest type of shift register.
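The cascade of D flip-flops on a common clock can be sketched in Python (the class name is illustrative):

```python
class ShiftRegister:
    """Serial shift register: a chain of D flip-flops sharing one clock.
    On each clock, every stage takes its neighbour's previous output."""
    def __init__(self, nbits):
        self.bits = [0] * nbits
    def clock(self, serial_in):
        out = self.bits[-1]                       # bit shifted out serially
        self.bits = [serial_in] + self.bits[:-1]  # shift the chain by one
        return out

sr = ShiftRegister(4)
for b in (1, 0, 1, 1):
    sr.clock(b)
# after 4 clocks the register holds the serial stream: [1, 1, 0, 1]
```

Shifting the whole list at once mirrors the fact that all flip-flops are clocked simultaneously.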

    Bi- directional shift register with parallel load

    Summary:

Fig: Bidirectional shift register with parallel load: four D flip-flops (A0-A3), each fed by a 4x1 MUX; inputs: Clock, select lines S1 S0, serial inputs, and parallel inputs I0-I3.


Sequential circuits: the output logic value depends on the input values and the state of the blocks. These circuits have memory. Various sequential circuits are:
o Latches: an electronic circuit which has two stable states and thereby can store one bit of information.
o Flip flops: also have 2 stable states, but are clocked devices.
o Counters: a device which stores the number of times a particular event or process has occurred.
o Registers: a group of flip-flops operating as a coherent unit to hold data.

Important Questions derived from this:
Q1. What is the difference between a latch and a flip flop?
Q2. Explain the Johnson counter.
Q3. Draw a shift register with parallel load.


    Lecture 4:

Stored Program Control Concept

Flynn's classification of computers: SISD, SIMD, MISD, MIMD

After discussing the basic principles of hardware and the combinational and sequential circuits in our computer system, let us see how these components interact to make the computer system we use. We will start with the basic architectures of the computer system, and the most basic question is how programs are stored in our computer system, or how the different programs and data are arranged in our system.

    Stored Program control concept

The simplest way to organize a computer is to have one processor register and an instruction code with 2 parts:

Opcode (what operation is to be completed)
Address (address of the operands on which the operation is to be computed)

A computer that by design includes an instruction set architecture can store in memory a set of instructions (a program) that details the computation and the data on which the computation is to be done.

The opcode tells us the operation to be performed. The address tells us the memory location where to find the operand. For a memory unit of 4096 words we need 12 bits to specify the address (2^12 = 4096).

Instruction format (16 bits):
bits 15-12: Opcode
bits 11-0: Address

A memory word holding data is simply a 16-bit binary operand (bits 15-0).

Fig 1: Stored Program Organization: a 4096 x 16 memory holding instructions (program) and operands (data), plus a processor register (accumulator, or AC).


When we store an instruction code in memory, 4 bits are available to specify one of 16 operations (as 12 bits are used for the operand address).

For an operation, the control fetches the instruction from memory, decodes the operation (one out of 16), finds the operands and then performs the operation.

Computers with one processor register generally name it the accumulator (or AC). The operation is performed with the operand and the content of the AC. In case no operand is specified, the operation is computed on the accumulator itself, e.g. clear AC, complement AC, etc.
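Splitting the 16-bit instruction word into its 4-bit opcode and 12-bit address is two shifts and masks; a small Python sketch (the function name is illustrative):

```python
def decode(instruction):
    """Split a 16-bit instruction word into its two fields:
    bits 15-12 = opcode (16 operations), bits 11-0 = address (4096 words)."""
    opcode = (instruction >> 12) & 0xF    # top 4 bits
    address = instruction & 0xFFF         # bottom 12 bits
    return opcode, address

word = 0b0010_000000000101   # opcode 2, address 5
op, addr = decode(word)
print(op, addr)              # 2 5
```

The control unit performs exactly this field extraction before dispatching one of the 16 operations.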

PARALLEL COMPUTERS

The organization we studied was a very basic one, but sometimes we have very large computations for which one processor with a general architecture will not be of much help. Thus we take the help of many processors, or divide the processor functions into many functional units, or perform the same computation on many data values. To cover all these solutions we have various types of computers.

Architectural Classification

Flynn's classification is based on the multiplicity of instruction streams and data streams:
Instruction stream: the sequence of instructions read from memory.
Data stream: the operations performed on the data in the processor.

Fig 2: Classification according to instruction and data streams

There are a variety of ways parallel processing can be classified. M. J. Flynn considered the organization of a computer system by the number of instructions and data items manipulated simultaneously. The normal operation of a computer is to fetch instructions from memory and execute them in the processor.

Number of instruction streams vs. number of data streams:

                      Single data   Multiple data
Single instruction    SISD          SIMD
Multiple instruction  MISD          MIMD


The sequence of instructions read from memory constitutes an instruction stream.

The operations performed on the data in the processor constitute a data stream.

Parallel processing can be implemented with either instruction stream, data stream, or both.

    SISD COMPUTER SYSTEMS

SISD (single instruction, single data stream) is the simplest computer available. It contains no parallelism: it has a single instruction stream and a single data stream. The instructions associated with SISD are executed sequentially, and the system may or may not have external parallel processing capabilities.

Fig 3: SISD Architecture

Characteristics
- Standard von Neumann machine
- Instructions and data are stored in memory
- One operation at a time

Limitations
Von Neumann bottleneck: the maximum speed of the system is limited by the memory bandwidth (bits/sec or bytes/sec).
- Limitation on memory bandwidth
- Memory is shared by CPU and I/O

Examples: superscalar processors, superpipelined processors, VLIW

    MISD COMPUTER SYSTEMS

MISD (multiple instruction, single data stream) is of no practical use, as there is little chance that many instructions need to execute on a single piece of data.

(Fig 3 detail: control unit, processor unit and memory, connected by the instruction stream and the data stream.)


Fig 4: MISD Architecture

Characteristics
- There is no computer at present that can be classified as MISD.

    SIMD COMPUTER SYSTEMS

SIMD (single instruction, multiple data stream) is a computer where a single instruction is applied to different sets of data. It is executed with the help of many processing units controlled by a single control unit. The shared memory must contain multiple modules so that it can communicate with all the processors at the same time.

Main memory is used for storage of programs. The master control unit decodes the instruction and determines the instruction to be executed.

(Fig 4, MISD: control units CU1..CUn, each driving a processor P1..Pn, operating on a single data stream from memory.
Fig 5, SIMD: a single control unit issues the instruction stream to processor units P1..Pn, which reach memory modules M1..Mn through an alignment network and data bus.)


Fig 5: SIMD Architecture

Characteristics
- Only one copy of the program exists
- A single controller executes one instruction at a time

Examples: array processors, systolic arrays, associative processors

    MIMD COMPUTER SYSTEMS

MIMD (multiple instruction, multiple data stream) refers to a computer system where we have different processing elements working on different data. Under this head we classify the various multiprocessors and multicomputers.

Characteristics
- Multiple processing units
- Execution of multiple instructions on multiple data

    Fig 6: MIMD Architecture

Types of MIMD computer systems
- Shared memory multiprocessors (UMA, NUMA)
- Message-passing multicomputers

SHARED MEMORY MULTIPROCESSORS

Example systems
- Bus and cache-based systems: Sequent Balance, Encore Multimax
- Multistage interconnection-network-based systems: Ultracomputer, Butterfly, RP3, HEP

(Figure: processors P1..Pn with memories M1..Mn connected through an interconnection network to a shared memory.)


- Crossbar switch-based systems: C.mmp, Alliant FX/8

Limitations: memory access latency; hot spot problem.

    SHARED MEMORY MULTIPROCESSORS (UMA)

Fig 7: Uniform Memory Access (UMA)

Characteristics
All processors have equally direct access to one large memory address space. The access time to reach that memory is the same for all processors; hence the name UMA.

    SHARED MEMORY MULTIPROCESSORS (NUMA)

(Figure: processors P1..Pn, each with a local memory M, connected through an interconnection network to shared memory modules M1..Mn.)


Fig 8: NUMA (Non-Uniform Memory Access)

Characteristics
All processors have direct access to one large memory address space and also have their own local memory. The access time to reach the different memories is different for each processor; hence the name NUMA.

    MESSAGE-PASSING MULTICOMPUTER

Fig 9: Message-passing multicomputer architecture

Characteristics
- Interconnected computers
- Each processor has its own memory, and communicates via message passing

Example systems
- Tree structure: Teradata, DADO
- Mesh-connected: Rediflow, Series 2010, J-Machine
- Hypercube: Cosmic Cube, iPSC, NCUBE, FPS T Series, Mark III

Limitations
- Communication overhead
- Hard to program

Summary:
Stored program control concept: instructions (the program) and data are stored in memory, with the program and the operands occupying separate regions.
Flynn's classification of computers: it divides the processing work into data streams and instruction streams, resulting in:



o SISD (single instruction, single data)
o SIMD (single instruction, multiple data)
o MISD (multiple instruction, single data)
o MIMD (multiple instruction, multiple data)

Important Questions:
Q1. Explain the stored program control concept.
Q2. Explain Flynn's classification of computers.
Q3. Describe the concept of data stream and instruction stream.

Lecture 5:

Multilevel viewpoint of a machine
Macro architecture, ISA, micro architecture
CPU, caches, main memory and secondary memory units, input/output mapping

After the discussion of the stored program control concept and the various types of parallel computers, let us study the different components of the computer structure.

MULTILEVEL VIEWPOINT OF A MACHINE

Our computer is built in various layers, basically divided into:
Software layer
Hardware layer
Instruction Set Architecture


    Fig 1: Multilevel viewpoint of a machine

Computer system architecture is decided on the basis of the type of applications or usage of the computer.

The computer architect decides the different layers and the function of each layer for a specific computer. These layers, or the functions of each, can vary from one organization to another.

Our layered architecture is basically divided into 3 parts:

Macro-architecture: as a unit of deployment, we will talk about client applications and COM servers. Computer architecture is the conceptual design and fundamental operational structure of a computer system. It is a blueprint and functional description of requirements (especially speeds and interconnections) and design implementations for the various parts of a computer.

This is basically the software layer of the computer. It comprises:

User application layer
The user layer basically gives the user an interface to the computer for which the computer is designed. At this layer the user specifies what processing has to be done. The requirements given by the user have to be implemented by the computer architect with the help of the other layers.

(Fig 1 layers, top to bottom:
SOFTWARE LAYER / MACRO ARCHITECTURE: user application layer; OS (MS-DOS, Windows, Unix/Linux); high level language; compiler; assembler
INSTRUCTION SET ARCHITECTURE (ISA)
HARDWARE LAYER / MICRO ARCHITECTURE: processor, memory, I/O system; data path and control; gate level design; circuit level design; silicon layout layer)


A high-level programming language is a programming language with strong abstraction from the details of the computer. In comparison to low-level programming languages, it may use natural language elements, be easier to use, or be more portable across platforms. Such languages hide the details of CPU operations such as memory access models and management of scope. E.g. C/Fortran/Pascal; these are not computer dependent.

Assembly language
Assembly language refers to the lowest-level human-readable method for programming a particular computer. Assembly languages are platform specific, and therefore a different assembly language is necessary for programming every different type of computer.

Machine language
Machine languages consist entirely of numbers and are almost impossible for humans to read and write.

Operating system
Operating systems interface with hardware to provide the necessary services for application software. E.g. MS-DOS, Linux, Unix.

Functions of an operating system: process management, memory management, file management, device management, error detection, security.

Types of operating systems: multiprogramming, multiprocessing, time sharing, real time, distributed, and network operating systems.

Compiler
Software that translates a program written in a high-level programming language (C/C++, COBOL, etc.) into machine language. A compiler usually generates assembly language first and then translates the assembly language into machine language. A utility known as a "linker" then combines all required machine language modules into an executable program that can run in the computer.


Assembler: the software that translates assembly language into machine language. Contrast with a compiler, which is used to translate a high-level language, such as COBOL or C, into assembly language first and then into machine language.

Instruction set architecture: this is an abstraction of the interface between the hardware and the low-level software. It deals with the functional behaviour of a computer system as viewed by a programmer. (Computer organization, by contrast, deals with structural relationships that are not visible to a programmer.) The instruction set architecture is the attribute of a computing system as seen by the assembly language programmer or compiler.

The ISA is determined by:
Data storage.
Memory addressing modes.
Operations in the instruction set.
Instruction formats.
Encoding the instruction set.
The compiler's view.

Micro-architecture: inside a unit of deployment, we talk about running processes, COM apartments, thread concurrency and synchronization, and memory sharing.

Micro architecture, also known as computer organization, is a lower level, more concrete description of the system that involves how the constituent parts of the system are interconnected and how they interoperate in order to implement the ISA. The size of a computer's cache, for instance, is an organizational issue that generally has nothing to do with the ISA.

Processor, memory, I/O system: these are the basic hardware devices required for the processing of any system application.

Data path and control: different computers have different numbers and types of registers and other logic circuits. The data path and control decide the flow of information between the various parts and circuits of the computer system.

Gate level design: circuits such as registers, counters etc. are implemented in the form of the various gates available.

Circuit level design: to combine the gates into a logical circuit or component we have the basic circuit level design, which ultimately gives birth to all the hardware components of a computer system.

Silicon layout layer

Other than the architecture of the computer, we have some very basic units which are important for our computer.

Memory units:


Encoding the instruction set. The compiler's view.

Other than the structured organization of the computer, the other important elements are:
o Memory
o CPU
o I/O

Important Questions:
Q1. Explain the multilevel viewpoint of a machine.
Q2. Describe micro architecture.
Q3. Describe macro architecture.
Q4. Explain the ISA and why we call it a link between the hardware and software components.
Q5. What is an operating system?


Lecture 6:

CPU performance measures: MIPS, MFLOPS

After the discussion of all the elements of computer structure in the previous topics, we describe the performance of a computer in this lecture with the help of performance metrics.

Performance of a machine is determined by:
- Instruction count
- Clock cycle time
- Clock cycles per instruction

Processor design (datapath and control) will determine:
- Clock cycle time
- Clock cycles per instruction

Single cycle processor: one clock cycle per instruction.
Advantages: simple design, low CPI.
Disadvantages: long cycle time, which is limited by the slowest instruction.

We have different methods to calculate the performance of a CPU, or to compare two CPUs, but the results depend heavily on what type of instructions we give to these CPUs.

The two measures we generally use are MIPS and MFLOPS.

MIPS: for a specific program running on a specific computer, MIPS is a measure of how many millions of instructions are executed per second:

MIPS = Instruction count / (Execution time x 10^6)
     = Instruction count / (CPU clocks x Cycle time x 10^6)
     = (Instruction count x Clock rate) / (Instruction count x CPI x 10^6)
     = Clock rate / (CPI x 10^6)

Faster execution time usually means a higher MIPS rating.

(Figure: execution time is the product of instruction count, CPI and cycle time.)


MIPS is a useful measure but it has some pitfalls.

Problems with the MIPS rating:
- It takes no account of the instruction set used.
- Program-dependent: a single machine does not have a single MIPS rating, since the MIPS rating may depend on the program used.
- Easy to abuse: the program used to get the MIPS rating is often omitted.
- It cannot be used to compare computers with different instruction sets.
- A higher MIPS rating in some cases may not mean higher performance or better execution time, e.g. due to compiler design variations.

For a machine with the instruction classes below:

For a given program, two compilers produced the instruction counts shown below. The machine is assumed to run at a clock rate of 100 MHz.

MIPS = Clock rate / (CPI x 10^6)
CPI = CPU execution cycles / Instruction count
CPU time = Instruction count x CPI / Clock rate

For compiler 1:
CPI1 = (5 x 1 + 1 x 2 + 1 x 3) / (5 + 1 + 1) = 10 / 7 = 1.43
MIPS1 = (100 x 10^6) / (1.43 x 10^6) = 70.0
CPU time1 = ((5 + 1 + 1) x 10^6 x 1.43) / (100 x 10^6) = 0.10 seconds

For compiler 2:
CPI2 = (10 x 1 + 1 x 2 + 1 x 3) / (10 + 1 + 1) = 15 / 12 = 1.25
MIPS2 = (100 x 10^6) / (1.25 x 10^6) = 80.0
CPU time2 = ((10 + 1 + 1) x 10^6 x 1.25) / (100 x 10^6) = 0.15 seconds

Note that compiler 2 gets the higher MIPS rating (80 vs. 70) yet takes longer to run (0.15 s vs. 0.10 s): exactly the pitfall described above.

Instruction class | CPI
A | 1
B | 2
C | 3

Instruction counts (in millions) for each instruction class:
Code from  | A  | B | C
Compiler 1 | 5  | 1 | 1
Compiler 2 | 10 | 1 | 1
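The whole example can be checked numerically; a small Python sketch (the helper name perf is illustrative):

```python
def perf(counts, cpis, clock_hz):
    """CPI, MIPS rating and CPU time for a program, given per-class
    instruction counts (in millions) and per-class CPIs."""
    total = sum(counts)                            # millions of instructions
    cycles = sum(c * p for c, p in zip(counts, cpis))
    cpi = cycles / total                           # average cycles/instruction
    mips = clock_hz / (cpi * 1e6)                  # MIPS = clock / (CPI * 10^6)
    cpu_time = total * 1e6 * cpi / clock_hz        # seconds
    return cpi, mips, cpu_time

cpis = [1, 2, 3]                                   # classes A, B, C
cpi1, mips1, t1 = perf([5, 1, 1], cpis, 100e6)     # compiler 1
cpi2, mips2, t2 = perf([10, 1, 1], cpis, 100e6)    # compiler 2
print(round(mips1, 1), round(t1, 2))               # 70.0 0.1
print(round(mips2, 1), round(t2, 2))               # 80.0 0.15
```

Running it reproduces the figures above, including the fact that the higher-MIPS compiler produces the slower program.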


MFLOPS: for a specific program running on a specific computer, MFLOPS is a measure of millions of floating-point operations (megaflops) executed per second:

MFLOPS = Number of floating-point operations / (Execution time x 10^6)

MFLOPS is a better comparison measure between different machines than MIPS, but it too has some pitfalls.

Problems with MFLOPS:
- A floating-point operation is an addition, subtraction, multiplication, or division operation applied to numbers represented by a single or double precision floating-point representation.
- Program-dependent: different programs have different percentages of floating-point operations; e.g. compilers contain almost no floating-point operations and yield a MFLOPS rating near zero.

- Dependent on the type of floating-point operations present in the program.

Summary:
Performance of a machine is determined by: instruction count, clock cycle time, clock cycles per instruction.

MIPS = Instruction count / (Execution time x 10^6)

MFLOPS = Number of floating-point operations / (Execution time x 10^6)

Important Questions:
Q1. What is MIPS?
Q2. What is MFLOPS?
Q3. What is the difference between MIPS and MFLOPS?
Q4. What are the CPU performance measures?


Lecture 7:

Cache Memory, Main Memory, Secondary Memory

We have basically three types of memory attached to our processor:

Cache memory
Main memory
Secondary memory

    Primary storage , presently known as memory , is the only one directly accessible to theCPU. The CPU continuously reads instructions stored there and executes them as required.Any data actively operated on is also stored there in uniform manner.

    there are two more sub-layers of the primary storage, besides main large-capacity RAM:

    Processor registers are located inside the processor. Each register typically holds aword of data (often 32 or 64 bits). CPU instructions instruct the arithmetic and logic unit to perform various calculations or other operations on this data (or with thehelp of it). Registers are technically among the fastest of all forms of computer datastorage.

Processor cache is an intermediate stage between ultra-fast registers and much slower main memory. It is introduced solely to increase the performance of the computer. The most actively used information in main memory is duplicated in the cache memory, which is faster but of much smaller capacity; on the other hand, the cache is much slower but much larger than the processor registers. A multi-level hierarchical cache setup is also commonly used: the primary cache is the smallest and fastest and is located inside the processor, while the secondary cache is somewhat larger and slower.

These are the types of memory accessed when we work with the processor. But if we have to store some data permanently, we need the help of secondary or auxiliary memory.

Secondary memory (or secondary storage) is the slowest and cheapest form of memory. It cannot be processed directly by the CPU. It must first be copied into primary storage (also known as RAM).

Secondary memory devices include magnetic disks like hard drives and floppy disks; optical disks such as CDs and CD-ROMs; and magnetic tapes, which were the first form of secondary memory.

Primary memory                             Secondary memory
1. Fast                                    1. Slow
2. Expensive                               2. Cheap
3. Low capacity                            3. Large capacity
4. Connects directly to the processor      4. Not connected directly to the processor

    Hard Disks:

Hard disks, like cassette tapes, use magnetic recording techniques: the magnetic medium can be easily erased and rewritten, and it will "remember" the magnetic flux patterns stored onto the medium for many years.

A hard drive consists of platters, a control circuit board and interface parts.

A hard disk is a sealed unit containing a number of platters in a stack. Hard disks may be mounted in a horizontal or a vertical position. In this description, the hard drive is mounted horizontally.

Electromagnetic read/write heads are positioned above and below each platter. As the platters spin, the drive heads move in toward the center surface and out toward the edge. In this way, the drive heads can reach the entire surface of each platter.

On a hard disk, data is stored in thin, concentric bands. A drive head, while in one position, can read or write a circular ring, or band, called a track. There can be more than a thousand tracks on a 3.5-inch hard disk. Sections within each track are called sectors. A sector is the smallest physical storage unit on a disk, and is almost always 512 bytes (0.5 kB) in size.
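The geometry above gives a simple capacity calculation. All the numbers below are assumptions chosen for illustration (real drives vary, and one surface is typically reserved for track-positioning data, as described later):

```python
# Hypothetical disk geometry (all numbers are assumptions for illustration):
platters = 2
surfaces = platters * 2          # both sides of each platter
tracks_per_surface = 1000
sectors_per_track = 63
bytes_per_sector = 512           # the standard sector size mentioned above

capacity_bytes = surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector
print(capacity_bytes)                 # 129024000
print(capacity_bytes / 10**6, "MB")   # 129.024 MB
```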

The stack of platters rotates at a constant speed. The drive head, while positioned close to the center of the disk, reads from a surface that is passing by more slowly than the surface at the outer edges of the disk. To compensate for this physical difference, tracks near the outside of the disk are less densely populated with data than the tracks near the center of the disk. The result of the different data densities is that the same amount of data can be read over the same period of time from any drive head position.

The disk space is filled with data according to a standard plan. One side of one platter contains space reserved for hardware track-positioning information and is not available to the operating system. Thus, a disk assembly containing two platters has three sides available for data. Track-positioning data is written to the disk during assembly at the factory. The system disk controller reads this data to place the drive heads in the correct sector position.

    Magnetic Tapes:

An electric current in a coil of wire produces a magnetic field similar to that of a bar magnet, and that field is much stronger if the coil has a ferromagnetic (iron-like) core.

Tape heads are made from rings of ferromagnetic material with a gap where the tape contacts the ring, so the magnetic field can fringe out to magnetize the emulsion on the tape. A coil of wire around the ring carries the current to produce a magnetic field proportional to the signal to be recorded. If an already magnetized tape is passed beneath the head, it can induce a voltage in the coil. Thus the same head can be used for recording and playback.


    Lecture 8:

Instruction Set based classification of computers
Three address instructions
Two address instructions
One address instructions
Zero address instructions
RISC instructions
CISC instructions
RISC Vs CISC

In the last chapter we discussed the various architectures and the layers of computer architecture. In this chapter we explain the middle layer of the multilevel viewpoint of a machine, i.e. the Instruction Set Architecture.

Instruction Set Architecture (ISA) is an abstraction of the interface between the hardware and the low-level software.

It comprises:
- Instruction Formats
- Memory Addressing Modes
- Operations in the Instruction Set
- Encoding the Instruction Set
- Data Storage
- Compiler's View

Instruction Format
The instruction format is the representation of the instruction. It contains the various instruction fields:

- The opcode field specifies the operation to be performed
- The address field(s) designate memory address(es) or processor register(s)
- The mode field(s) determine how the address field is to be interpreted to get the effective address or the operand

The number of address fields in the instruction format depends on the internal organization of the CPU. The three most common CPU organizations are:

- Single accumulator organization:
    ADD X           /* AC ← AC + M[X] */
- General register organization:
    ADD R1, R2, R3  /* R1 ← R2 + R3 */
    ADD R1, R2      /* R1 ← R1 + R2 */
    MOV R1, R2      /* R1 ← R2 */
    ADD R1, X       /* R1 ← R1 + M[X] */
- Stack organization:
    PUSH X          /* TOS ← M[X] */
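The stack (zero-address) organization can be sketched as a tiny simulator: PUSH copies a memory operand to the top of the stack (TOS), arithmetic instructions pop their two operands and push the result, and POP stores the TOS back to memory. The memory contents below are assumptions for illustration.

```python
# A toy sketch of the stack (zero-address) CPU organization.
# Memory values are assumptions chosen for illustration.
memory = {'A': 2, 'B': 3, 'C': 4, 'D': 5, 'X': 0}
stack = []

def PUSH(addr):
    stack.append(memory[addr])        # TOS <- M[addr]

def POP(addr):
    memory[addr] = stack.pop()        # M[addr] <- TOS

def ADD():
    stack.append(stack.pop() + stack.pop())

def MUL():
    stack.append(stack.pop() * stack.pop())

# Evaluate X = (A + B) * (C + D) with zero-address arithmetic instructions:
PUSH('A'); PUSH('B'); ADD()
PUSH('C'); PUSH('D'); ADD()
MUL()
POP('X')

print(memory['X'])  # 45, i.e. (2 + 3) * (4 + 5)
```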


One goal for CISC machines was to have a machine language instruction to match each high-level language statement type.

Criticisms of CISC:
- Complex instruction format, length and addressing modes → complicated instruction-cycle control due to the complex decoding hardware and decoding process
- Multiple-memory-cycle instructions: operations on memory data require multiple memory accesses per instruction
- Microprogrammed control is a necessity: microprogram control storage takes a substantial portion of the CPU chip area, and the semantic gap between machine instructions and microinstructions is large
- A general-purpose instruction set includes all the features required by individually different applications; when any one application is running, all the features required by the other applications are an extra burden on that application

    RISC

In the late 70s - early 80s, there was a reaction to the shortcomings of the CISC style of processors. Reduced Instruction Set Computers (RISC) were proposed as an alternative. The underlying idea behind RISC processors is to simplify the instruction set and reduce instruction execution time.

Note: In RISC-type instruction sets, we can't access memory operands directly.

Evaluate X = (A + B) * (C + D):
MOV R1, A       /* R1 ← M[A] */
MOV R2, B       /* R2 ← M[B] */
ADD R1, R1, R2  /* R1 ← R1 + R2 */
MOV R2, C       /* R2 ← M[C] */
MOV R3, D       /* R3 ← M[D] */
ADD R2, R2, R3  /* R2 ← R2 + R3 */
MUL R1, R1, R2  /* R1 ← R1 * R2 */
MOV X, R1       /* M[X] ← R1 */
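The RISC sequence above can be traced line by line in Python: only the loads and stores touch memory, and the arithmetic works register-to-register. The memory values are assumptions chosen for illustration.

```python
# Tracing the load/store (RISC) evaluation of X = (A + B) * (C + D).
# Memory values are assumptions for illustration.
M = {'A': 2, 'B': 3, 'C': 4, 'D': 5}
R = {}

R['R1'] = M['A']             # MOV R1, A
R['R2'] = M['B']             # MOV R2, B
R['R1'] = R['R1'] + R['R2']  # ADD R1, R1, R2
R['R2'] = M['C']             # MOV R2, C
R['R3'] = M['D']             # MOV R3, D
R['R2'] = R['R2'] + R['R3']  # ADD R2, R2, R3
R['R1'] = R['R1'] * R['R2']  # MUL R1, R1, R2
M['X'] = R['R1']             # MOV X, R1

print(M['X'])  # 45, i.e. (2 + 3) * (4 + 5)
```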

RISC processors often feature:
- Few instructions
- Few addressing modes
- Only load and store instructions access memory


- All other operations are done using on-processor registers
- Fixed-length instructions
- Single-cycle execution of instructions
- The control unit is hardwired, not microprogrammed

Since all instructions (except the load and store instructions) use only registers for operands, only a few addressing modes are needed.

By having all instructions the same length, reading them in is easy and fast. The fetch and decode stages are simple, looking much more like Mano's BC than a CISC machine.

The instruction and address formats are designed to be easy to decode: unlike the variable-length CISC instructions, the opcode and register fields of RISC instructions can be decoded simultaneously.

The control logic of a RISC processor is designed to be simple and fast: the control logic is simple because of the small number of instructions and the simple addressing modes, and it is hardwired, rather than microprogrammed, because hardwired control is faster.

ADVANTAGES OF RISC

VLSI Realization
- Control area is considerably reduced; RISC chips allow a large number of registers on the chip
- Enhancement of performance and HLL support
- Higher regularization factor and lower VLSI design cost

Computing Speed
- Simpler, smaller control unit → faster
- Simpler instruction set, addressing modes and instruction format → faster decoding
- Register operations are faster than memory operations
- Register windows enhance the overall speed of execution
- Identical instruction length and one-cycle instruction execution are suitable for pipelining → faster

Design Costs and Reliability
- Shorter time to design → reduction in the overall design cost, and less risk that the end product will be obsolete by the time the design is completed
- Simpler, smaller control unit → higher reliability
- Simple instruction format (of fixed length) → ease of virtual memory management

High Level Language Support
- A single choice of instruction → shorter, simpler compiler
- A large number of CPU registers → more efficient code
- Register windows → direct support of HLL
- Reduced burden on the compiler writer

    RISC VS CISC

The CISC Approach
On a CISC machine, the entire task of multiplying two numbers (here, the operands stored at memory locations 2:3 and 5:2) can be completed with one instruction:

    MULT 2:3, 5:2

One of the primary advantages of this system is that the compiler has to do very little work to translate a high-level language statement into assembly. Because the length of the code is relatively short, very little RAM is required to store instructions. The emphasis is put on building complex instructions directly into the hardware.

The RISC Approach
In order to perform the exact series of steps described in the CISC approach, a programmer would need to code four lines of assembly:

LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A

At first, this may seem like a much less efficient way of completing the operation. Because there are more lines of code, more RAM is needed to store the assembly-level instructions. The compiler must also perform more work to convert a high-level language statement into code of this form.

    RISC vs CISC

CISC                                           RISC
Emphasis on hardware                           Emphasis on software
Transistors used for storing                   Spends more transistors
complex instructions                           on memory registers
Includes multi-clock complex instructions      Single-clock, reduced instructions only
Memory-to-memory: "LOAD" and "STORE"           Register to register: "LOAD" and "STORE"
incorporated in instructions                   are independent instructions
Small code sizes                               Large code sizes
High cycles per second                         Low cycles per second

Summary:
The instruction format is composed of the opcode field, address field, and mode field.
The different types of address instructions used are three-address, two-address, one-address and zero-address.
RISC and CISC: introduction with their advantages and criticisms.
RISC Vs CISC.

Important Questions:
Q1. Explain the different addressing formats in detail with example.
Q2. Explain RISC and CISC with their advantages and criticisms.
Q3. Numerical


    Lecture 9:

Addressing modes
Implied Mode
Immediate Mode
Register Mode
Register Indirect Mode
Autoincrement or Autodecrement Mode
Direct Addressing Mode
Indirect Addressing Mode
Relative Addressing Mode

In the last lecture we studied the instruction formats; now we study how instructions use the different types of addressing modes.

    Addressing Modes

An addressing mode specifies a rule for interpreting or modifying the address field of the instruction (before the operand is actually referenced). A variety of addressing modes exist:
- to give programming flexibility to the user
- to use the bits in the address field of the instruction efficiently

In simple words, we can say an addressing mode is the way operands (or data) are fetched from memory.

TYPES OF ADDRESSING MODES

Implied Mode: The address of the operand is specified implicitly in the definition of the instruction.
- No need to specify an address in the instruction
- EA = AC, or EA = Stack[SP]
- Examples from BC: CLA, CME, INP

Immediate Mode: Instead of specifying the address of the operand, the operand itself is specified.
- No need to specify an address in the instruction; however, the operand itself needs to be specified
- (-) Sometimes requires more bits than an address
- (+) Fast to acquire an operand
- Useful for initializing registers to a constant value

Register Mode: The address specified in the instruction is a register address.
- The designated operand needs to be in a register
- (+) Shorter address than a memory address, saving address-field bits in the instruction
- (+) Faster to acquire an operand than with memory addressing
- EA = IR(R) (IR(R): register field of IR)

Register Indirect Mode: The instruction specifies a register which contains the memory address of the operand.
- (+) Saves instruction bits, since a register address is shorter than a memory address
- (-) Slower to acquire an operand than either register addressing or memory addressing
- EA = [IR(R)] ([x]: content of x)

Autoincrement or Autodecrement Mode: Similar to the register indirect mode, except that when the address in the register is used to access memory, the value in the register is incremented or decremented by 1 automatically.

Direct Address Mode: The instruction specifies the memory address, which can be used directly to access memory.
- (+) Faster than the other memory addressing modes
- (-) Too many bits are needed to specify the address for a large physical memory space
- EA = IR(addr) (IR(addr): address field of IR)
- E.g., the address field in a branch-type instruction

Indirect Addressing Mode: The address field of an instruction specifies the address of a memory location that contains the address of the operand.
- (-) Slow to acquire an operand because of an additional memory access
- EA = M[IR(address)]

Relative Addressing Modes: The address field of an instruction specifies part of the address (an abbreviated address) which can be used along with a designated register to calculate the address of the operand.
--> Effective address = address part of the instruction + content of a special register
- (+) A large physical memory can be accessed with a small number of address bits
- EA = f(IR(address), R), where R is sometimes implied
--> Typically EA = IR(address) + R
- 3 different relative addressing modes, depending on R:
* (PC) Relative Addressing Mode (R = PC)
* Indexed Addressing Mode (R = IX, where IX: index register)
* Base Register Addressing Mode (R = BAR (base address register))
* Indexed addressing mode vs. base register addressing mode:
- IR(address) (address field of the instruction): base address vs. displacement
- R (index/base register): displacement vs. base address
- Difference: the way they are used (NOT the way they are computed)
* Indexed addressing mode: processing many operands in an array using the same instruction


* Base register addressing mode: facilitates the relocation of programs in memory in multiprogramming systems

    Addressing Modes: Examples
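A small sketch of how the effective address (EA) is formed in several of the modes above. All register, field and memory values are assumptions chosen purely for illustration:

```python
# Effective-address calculation for a few addressing modes.
# All values below are assumptions for illustration.
address_field = 0x20            # IR(address)
R             = 0x100           # contents of the register named in the instruction
PC            = 0x055           # program counter
memory        = {0x20: 0x300}   # M[0x20] holds a pointer

ea_direct       = address_field          # Direct:            EA = IR(address)
ea_indirect     = memory[address_field]  # Indirect:          EA = M[IR(address)]
ea_reg_indirect = R                      # Register indirect: EA = [IR(R)]
ea_pc_relative  = address_field + PC     # PC-relative:       EA = IR(address) + PC

# Autoincrement: use the register's value as the EA, then bump the register.
ea_autoincr = R
R = R + 1

print(hex(ea_direct), hex(ea_indirect), hex(ea_reg_indirect), hex(ea_pc_relative))
# 0x20 0x300 0x100 0x75
print(hex(ea_autoincr), hex(R))  # 0x100 0x101
```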

Summary:
An addressing mode specifies a rule for interpreting or modifying the address field of the instruction.
The different types of addressing modes are: implied mode, immediate mode, register mode, register indirect mode, autoincrement or autodecrement mode, direct mode, indirect mode, and relative addressing mode.

Important Questions:
Q1. Explain the addressing modes with suitable examples.


    Lecture 10:

Instruction set
Data Transfer Instructions
  o Typical Data Transfer Instructions
  o Data Transfer Instructions with Different Addressing Modes
Data Manipulation Instructions
  o Arithmetic instructions
  o Logical and bit manipulation instructions
  o Shift instructions
Program Control Instructions
  o Conditional Branch Instructions
  o Subroutine Call & Return

    DATA TRANSFER INSTRUCTIONS

These are the type of instructions used only for the transfer of data from register to register, from registers to memory operands, and between other memory components. No manipulation is done on the data values.

In these instructions there is no use of the various addressing modes; we have a direct transfer between the various registers and memory components.

Examples are Load and Store, used for the transfer of data to and from the accumulator.

Typical Data Transfer Instructions

Name        Mnemonic
Load        LD
Store       ST
Move        MOV
Exchange    XCH
Input       IN
Output      OUT
Push        PUSH
Pop         POP

Table 3.1


Arithmetic Instructions: These are the type of instructions used for arithmetic calculations like addition, subtraction, increment, etc.

    Logical and Bit Manipulation Instructions

These are the type of instructions in which operations are computed on a string of bits. The bits are treated individually, so an operation can be done on an individual bit or a group of bits ignoring the whole value, and even the insertion of new bits is possible.

For example:
CLR R1 will make all the bits 0.
COM R1 will invert all the bits.
AND, OR and XOR produce their result bitwise, on the corresponding individual bits of the two operands. E.g.: AND of 0011 and 1100 results in 0000.
The AND instruction is also known as the mask instruction: if we have to mask (clear) some bits of an operand, we AND it with a value that has 0s in those positions and 1s (high) everywhere else. E.g.: suppose we have to mask the value 11000110 on its 1st, 3rd and 7th bits; then we AND it with the value 01011101.
CLRC, SETC and COMC work on only one bit of the operand, i.e. the carry.
Similarly, EI and DI work on only the one-bit interrupt flip-flop, to enable or disable interrupts.
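The masking example above is easy to verify with a bitwise AND (bit positions are counted from the left, as in the text):

```python
# ANDing 11000110 with 01011101 clears (masks) the 1st, 3rd and 7th bits,
# counted from the left, and leaves the other bits unchanged.
value = 0b11000110
mask  = 0b01011101

result = value & mask
print(format(result, '08b'))  # 01000100
```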

Name                      Mnemonic
Increment                 INC
Decrement                 DEC
Add                       ADD
Subtract                  SUB
Multiply                  MUL
Divide                    DIV
Add with Carry            ADDC
Subtract with Borrow      SUBB
Negate (2's Complement)   NEG

Table 3.3


Name                 Mnemonic
Clear                CLR
Complement           COM
AND                  AND
OR                   OR
Exclusive-OR         XOR
Clear carry          CLRC
Set carry            SETC
Complement carry     COMC
Enable interrupt     EI
Disable interrupt    DI

Table 3.4

Shift Instructions: These are the type of instructions which modify the whole value of the operand by shifting its bits to the left or right.

Say R1 has the value 11001100:
o SHR shifts right, inserting 0 at the leftmost position. Result: 01100110
o SHL shifts left, inserting 0 at the rightmost position. Result: 10011000
o SHRA shifts right, but the sign bit remains the same while every other bit shifts right accordingly. Result: 11100110
o SHLA is the same as SHL, inserting 0 at the end. Result: 10011000
o In ROR, all the bits are shifted towards the right and the rightmost one moves to the leftmost position. Result: 01100110
o In ROL, all the bits are shifted towards the left and the leftmost one moves to the rightmost position. Result: 10011001
o In RORC, suppose we have a carry bit of 0 with register R1. All the bits of the register are shifted right, the value of the carry moves to the leftmost position, and the rightmost bit moves into the carry. Result: 01100110 with carry 0
o Similarly, in ROLC, all the bits of the register are shifted left, the value of the carry moves to the rightmost position, and the leftmost bit moves into the carry. Result: 10011000 with carry 1
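All eight results above can be checked with a small 8-bit simulation; the helper functions are a sketch of the generic semantics described here, not any particular ISA's definition:

```python
# 8-bit shift and rotate operations on R1 = 11001100.
def shr(x):  return x >> 1                               # logical shift right
def shl(x):  return (x << 1) & 0xFF                      # logical shift left
def shra(x): return (x >> 1) | (x & 0x80)                # arithmetic shift right (keep sign bit)
def ror(x):  return (x >> 1) | ((x & 1) << 7)            # rotate right
def rol(x):  return ((x << 1) & 0xFF) | (x >> 7)         # rotate left
def rorc(x, c): return ((x >> 1) | (c << 7), x & 1)      # rotate right through carry
def rolc(x, c): return (((x << 1) & 0xFF) | c, x >> 7)   # rotate left through carry

R1 = 0b11001100
print(format(shr(R1), '08b'))   # 01100110
print(format(shl(R1), '08b'))   # 10011000
print(format(shra(R1), '08b'))  # 11100110
print(format(ror(R1), '08b'))   # 01100110
print(format(rol(R1), '08b'))   # 10011001
x, c = rorc(R1, 0)
print(format(x, '08b'), c)      # 01100110 0
x, c = rolc(R1, 0)
print(format(x, '08b'), c)      # 10011000 1
```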

PROGRAM CONTROL INSTRUCTIONS:

Before starting with program control instructions, let us study the concept of the PC, i.e. the program counter. The program counter is the register which holds the address of the next instruction to be executed. When we fetch the instruction pointed to by the PC from memory, it changes its value to give the address of the next instruction to be fetched. For sequential instructions it simply increments itself; for branching or modular programs it gives the address of the first instruction of the called program. After execution of the called program, the program counter points back to the instruction following the one from which the subprogram was called. For go-to kinds of instructions, the program counter simply changes its value without keeping any reference to the previous instruction.

Name                       Mnemonic
Logical shift right        SHR
Logical shift left         SHL
Arithmetic shift right     SHRA
Arithmetic shift left      SHLA
Rotate right               ROR
Rotate left                ROL
Rotate right thru carry    RORC
Rotate left thru carry     ROLC

Table 3.5

The PC is updated in one of two ways: in-line sequencing (PC ← PC + 1, so the next instruction is fetched from the next adjacent location in memory), or with an address from another source (the current instruction, the stack, etc.), as used by branch, conditional branch and subroutine instructions.


Program Control Instructions: These instructions are used for the transfer of control to other instructions; that is, they are used when the next instruction must be executed from some other location instead of in sequential order.

The conditions can be:
- Calling a subprogram
- Returning to the main program
- Jumping to some other instruction or location
- Skipping instructions, as for break and exit, or when the condition being checked is false, and so on

* CMP and TST instructions do not retain the results of their operations (subtraction and AND, respectively); they only set or clear certain flags.

Conditional Branch Instructions: These are instructions in which some condition is tested and, depending on the result, execution either branches or continues sequentially.

Name                Mnemonic
Branch              BR
Jump                JMP
Skip                SKP
Call                CALL
Return              RTN
Compare (by −)      CMP
Test (by AND)       TST

Table 3.6


    Subroutine Call and Return:

Subroutine Call, also known as: Call Subroutine, Jump to Subroutine, Branch to Subroutine, or Branch and save return address.

Two most important operations are implied:
* Branch to the beginning of the subroutine (same as a branch or conditional branch)
* Save the return address, so that the location in the calling program can be recovered upon exit from the subroutine.

Locations for storing the return address:
- Fixed location in the subroutine (memory)
- Fixed location in memory

Mnemonic   Branch condition             Tested condition
BZ         Branch if zero               Z = 1
BNZ        Branch if not zero           Z = 0
BC         Branch if carry              C = 1
BNC        Branch if no carry           C = 0
BP         Branch if plus               S = 0
BM         Branch if minus              S = 1
BV         Branch if overflow           V = 1
BNV        Branch if no overflow        V = 0

Unsigned compare conditions (A - B):
BHI        Branch if higher             A > B
BHE        Branch if higher or equal    A ≥ B
BLO        Branch if lower              A < B
BLOE       Branch if lower or equal     A ≤ B
BE         Branch if equal              A = B
BNE        Branch if not equal          A ≠ B

Signed compare conditions (A - B):
BGT        Branch if greater than       A > B
BGE        Branch if greater or equal   A ≥ B
BLT        Branch if less than          A < B
BLE        Branch if less or equal      A ≤ B
BE         Branch if equal              A = B
BNE        Branch if not equal          A ≠ B

Table 3.7
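The flags that these branches test are produced by a compare, which subtracts B from A and discards the numeric result. The sketch below assumes an 8-bit two's-complement machine with a borrow-style carry (real ISAs differ in their carry convention for subtraction):

```python
# Flag computation for CMP (A - B) on an assumed 8-bit machine.
def compare(a, b):
    diff = (a - b) & 0xFF
    Z = int(diff == 0)     # zero flag
    S = diff >> 7          # sign flag (MSB of the 8-bit result)
    C = int(a < b)         # carry flag: borrow out of A - B (unsigned)
    # overflow: operands have different signs and the result's sign differs from A's
    V = int(((a ^ b) & (a ^ diff) & 0x80) != 0)
    return Z, C, S, V

Z, C, S, V = compare(5, 9)
print(Z, C, S, V)                                   # 0 1 1 0
print("BLO taken" if C == 1 else "BLO not taken")   # BLO taken (5 < 9, unsigned)
```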


- In a processor register
- In a memory stack: the most efficient way

Summary:
Data Transfer Instructions are of two types, namely Typical Data Transfer Instructions and Data Transfer Instructions with Different Addressing Modes.
Data Manipulation Instructions are of three types: arithmetic instructions, logical and bit manipulation instructions, and shift instructions.
Program Control Instructions can be divided into Conditional Branch Instructions and Subroutine Call & Return instructions.

Important Questions:
Q1. Explain the data transfer instructions.
Q2. Explain the data manipulation instructions.
Q3. Explain the program control instructions with example.


The CALL and RTN micro-operations with a memory stack:

CALL:
SP ← SP - 1
M[SP] ← PC
PC ← EA

RTN:
PC ← M[SP]
SP ← SP + 1
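These micro-operations can be traced directly in Python. The addresses and the initial stack pointer are assumptions chosen for illustration:

```python
# Simulating CALL and RTN with a memory stack.
memory = {}
SP = 0x100       # stack pointer (assumed initial value)
PC = 0x020       # program counter: address of the instruction after CALL

# CALL subroutine at EA = 0x200
EA = 0x200
SP = SP - 1      # SP <- SP - 1
memory[SP] = PC  # M[SP] <- PC   (save return address on the stack)
PC = EA          # PC <- EA      (jump to the subroutine)

# ... subroutine body executes ...

# RTN
PC = memory[SP]  # PC <- M[SP]   (restore return address)
SP = SP + 1      # SP <- SP + 1  (pop the stack)

print(hex(PC), hex(SP))  # 0x20 0x100
```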


    Lecture 11:

Program Interrupts
MASM

    PROGRAM INTERRUPT:

Types of Interrupts:

1. External Interrupts: initiated from outside the CPU and memory.
- I/O device -> data transfer request or data transfer complete
- Timing device -> timeout
- Power failure
- Operator

2. Internal Interrupts (traps): caused by the currently running program.
- Register or stack overflow
- Divide by zero
- Op-code violation
- Protection violation

3. Software Interrupts: both external and internal interrupts are initiated by the computer hardware, whereas software interrupts are initiated by an executing instruction.
- Supervisor call -> switches from user mode to supervisor mode, allowing the program to execute a certain class of operations which are not allowed in user mode.

    MASM:

If you have used a modern word processor such as Microsoft Word, you may have noticed the macro feature, where you can record a series of frequently used actions or commands into a macro. For example, suppose you always need to insert a 2-by-4 table with the headings "Date" and "Time". You can start the macro recorder and create the table as you wish; after that, you can save the macro. The next time you need to create the same kind of table, you just need to execute the macro. The same applies to a macro assembler: it enables you to record frequently performed actions or a frequently used block of code so that you do not have to re-type it each time.

The Microsoft Macro Assembler (abbreviated MASM) is an x86 high-level assembler for DOS and Microsoft Windows. Currently it is the most popular x86 assembler. It supports a wide variety of macro facilities and structured programming idioms, including high-level functions for looping and procedures. Later versions added the capability of producing programs for Windows. MASM is one of the few Microsoft development tools that target 16-bit, 32-bit and 64-bit platforms. Earlier versions were MS-DOS applications. Versions 5.1 and 6.0 were OS/2 applications and later versions were Win32 console applications. Versions 6.1 and 6.11 included Phar Lap's TNT DOS extender so that MASM could run in MS-DOS.

    The name MASM originally referred to as MACRO ASSEMBLER but over theyears it has become synonymous with Microsoft Assembler.An Assembly language translator converts macros into several machine languageinstructions.MASM isn't the fastest assembler around (it's not particularly slow, except in acouple of degenerate cases, but there are faster assemblers available).

Though very powerful, there are a couple of assemblers that, arguably, are more powerful (e.g., TASM and HLA). MASM is only usable for creating DOS and Windows applications; you cannot effectively use it to create software for other operating systems.

Benefits of MASM
There are some benefits to using MASM today:

Steve Hutchessen's ("Hutch") MASM32 package provides the support for MASM that Microsoft no longer provides.

You can download MASM (and MASM32) free from Microsoft and other sites.
Most Windows assembly language examples on the Internet today use MASM syntax.
You may download MASM directly from Webster as part of the MASM32 package.

    Summary:

Program interrupts can be external, internal, or software interrupts.
MASM is the Microsoft Macro Assembler, used for implementing macros.

Important Questions:
Q1. What are program interrupts? Explain the types of program interrupts.
Q2. Explain MASM in detail.


    Lecture 10:

CPU Architecture types
o Accumulator
o Register
o Stack
o Memory / Register

    Detailed data path of a register based CPU

In Unit 3 we discussed the instruction set architecture (ISA), which deals with the various types of address instructions, addressing modes, and the different types of instructions in various computer architectures.

In this chapter we will discuss the various types of computer organizations. In general, most processors or computers are organized in one of three ways:

Single register (accumulator) organization
- The Basic Computer is a good example: the accumulator is the only general purpose register.

Stack organization
- All operations are done using the hardware stack. For example, an OR instruction will pop the two top elements from the stack, do a logical OR on them, and push the result on the stack.

General register organization
- Used by most modern computer processors; any of the registers can be used as the source or destination for computer operations.

Accumulator type of organization:
In the accumulator type of organization, one operand is in memory and the other is in the accumulator.

The instructions we can run with the accumulator are:

AC ← AC ∧ DR              AND with DR
AC ← AC + DR              Add with DR
AC ← DR                   Transfer from DR
AC(0-7) ← INPR            Transfer from INPR
AC ← AC'                  Complement
AC ← shr AC, AC(15) ← E   Shift right
AC ← shl AC, AC(0) ← E    Shift left
AC ← 0                    Clear
AC ← AC + 1               Increment
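The accumulator microoperations listed above can be sketched as a small simulation. This is a minimal sketch, not the actual hardware: the 16-bit register width follows the basic-computer convention used in these notes, and the function names are illustrative.

```python
# Minimal sketch of the accumulator (AC) microoperations for a 16-bit machine.
MASK = 0xFFFF  # 16-bit registers

def ac_and(ac, dr):      return ac & dr                 # AC <- AC AND DR
def ac_add(ac, dr):      return (ac + dr) & MASK        # AC <- AC + DR (carry discarded here)
def ac_load(dr):         return dr                      # AC <- DR
def ac_complement(ac):   return (~ac) & MASK            # AC <- AC'
def ac_shr(ac, e):       return (e << 15) | (ac >> 1)   # shift right, E enters AC(15)
def ac_shl(ac, e):       return ((ac << 1) & MASK) | e  # shift left, E enters AC(0)
def ac_clear():          return 0                       # AC <- 0
def ac_increment(ac):    return (ac + 1) & MASK         # AC <- AC + 1

ac = ac_load(0x00F0)
ac = ac_add(ac, 0x0001)   # 0x00F1
ac = ac_complement(ac)    # 0xFF0E
assert ac == 0xFF0E
```

Each helper returns the new AC value, mirroring the register-transfer statements one-for-one.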


    Circuit required:

Stack Organization:
Stack
- Very useful feature for nested subroutines and nested interrupt services
- Also efficient for arithmetic expression evaluation
- Storage which can be accessed in LIFO order
- Pointer: SP (stack pointer)
- Only PUSH and POP operations are applicable

The stack type of organization is of two types: register stack and memory stack.

[Figure: Accumulator circuit — a 16-bit adder and logic circuit takes inputs from DR and INPR and feeds the 16-bit AC (accumulator) register; control gates generate the LD, INR, and CLR signals; the AC output goes to the bus; the clock synchronizes the transfers.]


    REGISTER STACK ORGANIZATION

[Figure: Register stack with push and pop operations — a 64-word stack (addresses 0 to 63) holding items A, B, C at the bottom; a 6-bit stack pointer SP addresses the top of the stack; FULL and EMPTY flag bits; data enters and leaves through DR.]

    /* Initially, SP = 0, EMPTY = 1, FULL = 0 */

PUSH:                            POP:
SP ← SP + 1                      DR ← M[SP]
M[SP] ← DR                       SP ← SP - 1
If (SP = 0) then (FULL ← 1)      If (SP = 0) then (EMPTY ← 1)
EMPTY ← 0                        FULL ← 0
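The register-stack PUSH and POP microoperations, together with the FULL and EMPTY flags, can be sketched in software like this. This is a minimal model assuming a 64-word stack whose 6-bit pointer wraps modulo 64; the class and method names are illustrative.

```python
# Sketch of a 64-word register stack with SP and the FULL/EMPTY flags,
# following the PUSH/POP microoperations above (SP wraps modulo the size,
# just as a 6-bit counter wraps at 64).
class RegisterStack:
    def __init__(self, size=64):
        self.mem = [0] * size
        self.size = size
        self.sp = 0          # stack pointer (initially 0)
        self.empty = 1       # EMPTY = 1, FULL = 0 initially
        self.full = 0

    def push(self, dr):
        assert not self.full, "stack overflow"
        self.sp = (self.sp + 1) % self.size   # SP <- SP + 1
        self.mem[self.sp] = dr                # M[SP] <- DR
        if self.sp == 0:                      # pointer wrapped: stack is full
            self.full = 1
        self.empty = 0

    def pop(self):
        assert not self.empty, "stack underflow"
        dr = self.mem[self.sp]                # DR <- M[SP]
        self.sp = (self.sp - 1) % self.size   # SP <- SP - 1
        if self.sp == 0:                      # pointer back at 0: stack is empty
            self.empty = 1
        self.full = 0
        return dr

s = RegisterStack()
s.push(10); s.push(20)
assert s.pop() == 20 and s.pop() == 10 and s.empty == 1
```

Note how the FULL test fires only when SP wraps back to 0 after a push, exactly as in the microoperation table.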


    MEMORY STACK ORGANIZATION

    Memory with Program, Data, and Stack Segments

    A portion of memory is used as a stack with a processor register as a stack pointer

- PUSH: SP ← SP - 1
        M[SP] ← DR
- POP:  DR ← M[SP]
        SP ← SP + 1

Note: Most computers do not provide hardware to check stack overflow (full stack) or underflow (empty stack); this must be done in software.
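A memory stack that grows toward lower addresses, with the overflow and underflow checks done in software, can be sketched as follows. The segment bounds 3000 and 4001 are illustrative (taken from the figure's layout), and the function names are hypothetical.

```python
# Sketch of a memory stack that grows toward lower addresses, with
# overflow/underflow checked in software as the note above requires.
STACK_BOTTOM = 4001   # initial SP; the first item is pushed at 4000
STACK_LIMIT  = 3000   # stack may not grow below this address

memory = {}
sp = STACK_BOTTOM

def push(dr):
    global sp
    if sp - 1 < STACK_LIMIT:          # software overflow check
        raise OverflowError("stack overflow")
    sp -= 1                           # SP <- SP - 1
    memory[sp] = dr                   # M[SP] <- DR

def pop():
    global sp
    if sp >= STACK_BOTTOM:            # software underflow check
        raise IndexError("stack underflow")
    dr = memory[sp]                   # DR <- M[SP]
    sp += 1                           # SP <- SP + 1
    return dr

push(7); push(8)
assert pop() == 8 and pop() == 7 and sp == STACK_BOTTOM
```

Because SP is decremented before the store, the stack grows downward through memory, away from the program and data segments.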

Register type of organization:
In this organization we take the help of several registers, say R1 to R7, for the transfer and manipulation of data.

    Detailed data path of a typical register based CPU

[Figure: Memory with program, data, and stack segments — the program (instructions) begins at address 1000 and is addressed by PC; the data (operands) begin at address 3000 and are addressed by AR; the stack occupies addresses 3997-4001 and is addressed by SP, with the stack growing toward lower addresses.]


Because accessing memory directly is very time-consuming (and thus costly), we prefer the register organization, which proves to be more efficient and time-saving.

In this we are using 7 registers. Two multiplexers decide which registers are used as operand sources, and a decoder decides which register is used as the destination for storing the result. MUX 1 selects the first operand register, depending on the value of SELS1 (selector for source 1). Similarly, SELS2 selects the second operand register through MUX 2.

These two inputs reach the ALU through the S1 bus and the S2 bus. OPR denotes the type of operation to be performed, and the computation is carried out by the ALU. The result is then either stored back into one of the 7 registers, with the destination register selected by the decoder via SELD, or sent to the output.
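The data path just described — two source multiplexers, the ALU, and a decoder-selected destination register — can be sketched as a toy model. The operation set and naming here are illustrative, not a hardware description.

```python
# Sketch of the general register organization: two multiplexers select the
# source registers for the S1 and S2 buses, the ALU applies OPR, and the
# decoder-selected load line stores the result in the destination register.
regs = {i: 0 for i in range(1, 8)}          # R1..R7, 16-bit values

ALU_OPS = {
    "ADD": lambda a, b: (a + b) & 0xFFFF,
    "SUB": lambda a, b: (a - b) & 0xFFFF,
    "AND": lambda a, b: a & b,
    "OR":  lambda a, b: a | b,
}

def cycle(sels1, sels2, opr, seld):
    s1bus = regs[sels1]                     # MUX1: first operand onto the S1 bus
    s2bus = regs[sels2]                     # MUX2: second operand onto the S2 bus
    result = ALU_OPS[opr](s1bus, s2bus)     # ALU performs the operation named by OPR
    regs[seld] = result                     # decoder asserts the load line of R[seld]
    return result

regs[1], regs[2] = 5, 3
cycle(sels1=1, sels2=2, opr="ADD", seld=3)  # R3 <- R1 + R2
assert regs[3] == 8
```

One call to `cycle` corresponds to one clock period: both operand selections, the ALU operation, and the destination load all happen in a single register-transfer step.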

[Figure: General register organization — registers R1-R7 feed two multiplexers (MUX1 selected by SELS1, MUX2 selected by SELS2) that drive the S1 bus and S2 bus into the ALU; OPR selects the ALU operation; a 3 x 8 decoder driven by SELD asserts one of 7 load lines to route the output/result back into the destination register; the clock synchronizes the transfers.]


    Lecture 13:

    Address Sequencing / Microinstruction Sequencing

    Implementation of control unit

Address Sequencing / Microinstruction Sequencing:
Microinstructions are stored in control memory in groups, with each group specifying a routine. The hardware that controls the address sequencing of the control memory must be capable of sequencing the microinstructions within a routine and of branching from one routine to another.

Steps: An initial address is loaded into the CAR at power-on; this is usually the address of the first microinstruction, which activates the instruction fetch routine. This routine may be sequenced by incrementing. At the end of the fetch routine the instruction is in the IR of the computer. Next, the control memory computes the effective address of the operand. The next step is the execution of the instruction fetched from memory.
The transformation from the instruction code bits to an address in control memory where the routine is located is referred to as a mapping process.

[Figure: Selection of address for control memory — the instruction code passes through mapping logic; multiplexers choose the next address for the control address register (CAR) from among the mapping logic, an incrementer, the subroutine register (SBR), and a branch address; branch logic examines the status bits (selecting a status bit) to produce the MUX select signals; the CAR addresses the control memory (ROM), which emits the microoperations.]


At the completion of the execution of the instruction, control must return to the fetch routine by executing an unconditional branch microinstruction to the first address of the fetch routine.
Sequencing Capabilities Required in a Control Storage

- Incrementing of the control address register
- Unconditional and conditional branches
- A mapping process from the bits of the machine instruction to an address for control memory
- A facility for subroutine call and return
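The four sequencing capabilities above can be sketched as a next-address function for the CAR. This is a simplified model; the select-signal names and the routine addresses used in the example are illustrative.

```python
# Sketch of the next-address logic for the control address register (CAR):
# a multiplexer chooses among increment, branch, mapping, call, and return.
def next_car(car, sbr, mux_select, branch_addr=0, map_addr=0, status_bit=1):
    if mux_select == "INCREMENT":
        return car + 1, sbr                               # sequence within a routine
    if mux_select == "BRANCH":                            # conditional/unconditional branch
        return (branch_addr if status_bit else car + 1), sbr
    if mux_select == "MAP":                               # map opcode bits to a routine address
        return map_addr, sbr
    if mux_select == "CALL":                              # SBR saves the return address
        return branch_addr, car + 1
    if mux_select == "RETURN":                            # return from subroutine
        return sbr, sbr
    raise ValueError(mux_select)

car, sbr = 0, 0
car, sbr = next_car(car, sbr, "INCREMENT")                 # fetch routine, next step
car, sbr = next_car(car, sbr, "MAP", map_addr=64)          # jump to the instruction's routine
car, sbr = next_car(car, sbr, "CALL", branch_addr=96)      # subroutine call, SBR <- 65
assert (car, sbr) == (96, 65)
```

An unconditional branch is simply a conditional branch whose status input is tied to 1, which is why the model needs only one BRANCH case.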

Design of Control Unit:
After obtaining the microoperations we have to execute them, but first we need to decode them.

Fig: Decoding of microoperation fields.
Because each field represents 8 microoperations with 3 bits, and we have 3 such fields, we decode the microoperation field bits with three 3 x 8 decoders. After decoding, the microoperations must be routed to particular circuits: data-manipulation microoperations such as AND, ADD, SUB, and so on are given to the ALU, and

[Figure: Decoding of microoperation fields — the three 3-bit fields F1, F2, and F3 each feed a 3 x 8 decoder; decoded lines such as ADD and DRTAC drive the arithmetic logic and shift unit, which takes data from AC and DR and loads its result into AC; the transfer lines DRTAR and PCTAR drive a multiplexer (select 0/1) that chooses DR(0-10) or PC as the source loaded into AR; the clock synchronizes the transfers.]


the corresponding results are moved to AC. The ALU is provided data from AC and DR. For data-transfer microoperations such as PCTAR or DRTAR we simply need to transfer the values. Because we have two possible sources for a transfer into AR, we use a 2 x 1 MUX to choose one, with its select line attached to the DRTAR microoperation signal: if DRTAR is high, the MUX chooses DR as the source for AR; otherwise PC's value is moved to AR. The corresponding data movement takes place through the load line — if either signal is high, the value is loaded into AR.

    The clock signal is provided for the synchronization of microoperations.


Lecture 14:

Fetch and decode cycle
Control Unit

    Fetch and Decode

T0: AR ← PC                                    (S2S1S0 = 010, T0 = 1)
T1: IR ← M[AR], PC ← PC + 1                    (S2S1S0 = 111, T1 = 1)
T2: D0, ..., D7 ← Decode IR(12-14), AR ← IR(0-11), I ← IR(15)
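The T0-T2 transfers can be sketched as a short simulation, assuming the basic computer's 16-bit instruction format (I in bit 15, opcode in bits 12-14, address in bits 0-11); the memory contents in the example are illustrative.

```python
# Sketch of the fetch and decode phase T0-T2 for the basic computer.
# 16-bit instruction word = I (bit 15) | opcode (bits 12-14) | address (bits 0-11).
memory = {0x010: 0x8123}             # illustrative program word at address 0x010
pc, ar, ir = 0x010, 0, 0

ar = pc                              # T0: AR <- PC
ir = memory[ar]                      # T1: IR <- M[AR]
pc = pc + 1                          #     PC <- PC + 1
i_bit  = (ir >> 15) & 0x1            # T2: I <- IR(15)
opcode = (ir >> 12) & 0x7            #     decode IR(12-14) into one of D0..D7
ar     = ir & 0x0FFF                 #     AR <- IR(0-11)

assert (i_bit, opcode, ar, pc) == (1, 0, 0x123, 0x011)
```

Note that AR is reused: at T0 it holds the instruction's address, and by the end of T2 it already holds the operand address taken from IR(0-11).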

[Figure: Register transfers for the fetch phase — the common bus (select lines S2, S1, S0) connects the memory unit (bus point 7), AR (bus point 1), PC (bus point 2), and IR (bus point 5); at T0 the bus carries PC and AR's LD input is enabled; at T1 the memory Read signal is asserted, IR's LD input and PC's INR input are enabled; the clock synchronizes the transfers.]


    Control Unit

The control unit (CU) of a processor translates machine instructions into the control signals for the microoperations that implement them.

Control units are implemented in one of two ways:
Hardwired Control
- The CU is made up of sequential and combinational circuits that generate the control signals.
Microprogrammed Control
- A control memory on the processor contains microprograms that activate the necessary control signals.
We will consider a hardwired implementation of the control unit for the Basic Computer.



    Lecture 15:

Memory hierarchy and its organization
Need of memory hierarchy
Locality of reference principle

In the last units we studied the various instructions, data, and registers associated with our computer organization. Let's move on to the microarchitecture of the computer, in which an important part is the memory. Let's study what memory is and what types of memory are available.

The memory unit is an essential component of a computer, used for storing programs and data. We use main memory for running programs, plus additional capacity for storage. We have various levels of memory units organized in a memory hierarchy.

    MEMORY HIERARCHY

The goal of the memory hierarchy is to obtain the highest possible access speed while minimizing the total cost of the memory system.

    The various components are:

Main memory: The memory unit that communicates directly with the CPU. The programs and data currently needed by the processor reside in main memory.

Auxiliary memory: Made up of devices that provide backup storage. Examples: magnetic tapes, magnetic disks, etc.

Cache memory: The memory that lies between main memory and the CPU.

[Figure: Memory hierarchy — magnetic tapes and magnetic disks (auxiliary memory) connect through the I/O processor to main memory; cache memory sits between main memory and the CPU.]


Fig: Memory Hierarchy

In this hierarchy, magnetic tapes are at the lowest level, which means they are very slow and very cheap. Moving to the upper levels, such as main memory, we get increased speed but at an increased cost per bit.

Thus we can conclude that as we go toward the upper levels:
- Price increases
- Speed increases
- Cost per bit increases
- Access time decreases
- Size decreases

Many operating systems are designed to enable the CPU to process a number of independent programs concurrently. This concept is called multiprogramming. It is made possible by two programs residing in different parts of the memory hierarchy at the same time — for example, one using the CPU while another performs an I/O transfer.

The locality of reference, also known as the locality principle, is the phenomenon that the collection of data locations referenced in a short period of time in a running computer often consists of relatively well-predictable clusters.

Analysis of a large number of typical programs has shown that memory references in any given interval of time tend to be confined to a few localized areas of memory. This phenomenon is known as locality of reference.

[Figure: Memory hierarchy pyramid — from top (fastest, smallest) to bottom (slowest, largest): register, cache, main memory, magnetic disk, magnetic tape.]


Important special cases of locality are temporal, spatial, equidistant, and branch locality.

Temporal locality: if at one point in time a particular memory location is referenced, then it is likely that the same location will be referenced again in the near future. There is temporal proximity between adjacent references to the same memory location. In this case it is common to store a copy of the referenced data in special memory storage that can be accessed faster. Temporal locality is a very special case of spatial locality, namely when the prospective location is identical to the present location.

Spatial locality: if a particular memory location is referenced at a particular time, then it is likely that nearby memory locations will be referenced in the near future. There is spatial proximity between memory locations referenced at almost the same time. In this case it is common to try to estimate how big a neighbourhood around the current reference is worth preparing for faster access.

Equidistant locality: halfway between spatial locality and branch locality. Consider a loop accessing locations in an equidistant pattern, i.e. the path in the spatial-temporal coordinate space is a dotted line. In this case, a simple linear function can predict which location will be accessed in the near future.

Branch locality: there are only a few possible alternatives for the prospective part of the path in the spatial-temporal coordinate space. This is the case when an instruction loop has a simple structure, or when the possible outcomes of a small system of conditional branch instructions are restricted to a small set of possibilities. Branch locality is typically not a spatial locality, since the few possibilities can be located far away from each other.

Sequential locality: in a typical program the execution of instructions follows a sequential order unless branch instructions create out-of-order execution. This also involves spatial locality, since sequential instructions are stored near each other.
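Spatial locality can be demonstrated with a toy direct-mapped cache: a sequential scan reuses each fetched block, while a large-stride scan misses on every access. The cache parameters below are illustrative, not those of any real machine.

```python
# Toy demonstration of spatial locality: a direct-mapped cache with 8-word
# blocks gets far more hits on a sequential scan than on a large-stride scan.
BLOCK = 8          # words per cache block
LINES = 16         # number of cache lines

def hit_rate(addresses):
    tags = [None] * LINES
    hits = 0
    for addr in addresses:
        block = addr // BLOCK          # which memory block this word belongs to
        line = block % LINES           # direct-mapped: block determines the line
        if tags[line] == block:
            hits += 1                  # word already cached: spatial locality pays off
        else:
            tags[line] = block         # miss: fetch the whole block
    return hits / len(addresses)

sequential = list(range(1024))                                   # scan an array in order
strided    = [(i * BLOCK * LINES) % 1024 for i in range(1024)]   # jump cache-sized strides

assert hit_rate(sequential) > hit_rate(strided)
print(hit_rate(sequential), hit_rate(strided))   # -> 0.875 0.0
```

In the sequential scan every 8-word block costs one miss followed by seven hits (7/8 hit rate); the strided scan maps every reference to the same cache line with a different block, so nothing is ever reused.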

To benefit from the very frequently occurring temporal and spatial kinds of locality, most information storage systems are hierarchical. Equidistant locality is usually supported by the diverse non-trivial increment instructions of processors. For branch locality, contemporary processors have sophisticated branch predictors, and on the basis of this prediction the processor's memory manager tries to collect and preprocess the data of the plausible alternatives.

    Reasons for locality

There are several reasons for locality. These reasons are either goals to achieve or circumstances to accept, depending on the aspect. The reasons below are not disjoint; in fact, the list goes from the most general case to special cases.


Predictability: locality is merely one type of predictable behavior in computer systems. Luckily, many practical problems are decidable, and hence the corresponding program can behave predictably if it is well written.

Structure of the program: locality often occurs because of the way computer programs are created for handling decidable problems. Generally, related data is stored in nearby locations in storage. One common pattern in computing involves processing several items one at a time. This means that if much processing is done, a single item will be accessed more than once, leading to temporal locality of reference. Furthermore, moving to the next item implies that the next item will be read, hence spatial locality of reference, since memory locations are typically read in batches.

Linear data structures: locality often occurs because code contains loops that tend to reference arrays or other data structures by indices. Sequential locality, a special case of spatial locality, occurs when relevant data elements are arranged and accessed linearly. For example, the simple traversal of the elements of a one-dimensional array, from the base address to the highest element, exploits the sequential locality of the array in memory.[2] The more general equidistant locality occurs when the linear traversal covers a longer area of adjacent data structures of identical structure and size, and only the mutually corresponding elements of the structures are accessed rather than the whole structures. This is the case when a matrix is represented as a sequential array of rows and the requirement is to access a single column of the matrix.

    Use of locality in general

If, most of the time, a substantial portion of the references aggregate into clusters, and if the shape of this system of clusters can be well predicted, then it can be used for speed optimization. There are several ways to benefit from locality. The common optimization techniques are:

- to increase the locality of references. This is usually achieved on the software side.
- to exploit the locality of references. This is usually achieved on the hardware side. Temporal and spatial locality can be capitalized on by hierarchical storage hardware. Equidistant locality can be used by appropriately specialized processor instructions; this possibility is not only the responsibility of the hardware but of the software as well, whose structure must be suitable for compiling a binary program that calls the specialized instructions in question. Branch locality is a more elaborate possibility, hence more development effort is needed, but there is a much larger reserve for future exploration in this kind of locality than in all the remaining ones.


    Lecture 16:

Main Memory
o RAM chip organization
o ROM chip organization
Expansion of main memory
o Memory connections to CPU
o Memory address map

Till now we have discussed the memory interconnections and their comparisons. Let's take each in detail.

Main Memory: Main memory is a large (with respect to cache memory) and fast memory (with respect to magnetic tapes, disks, etc.) used to store the programs and data during computer operation. The I/O processor manages data transfers between auxiliary memory and main memory.

Main memory is available in two types: RAM and ROM. The principal technology used for main memory is based on semiconductor integrated circuits.
RAM: This is the part of main memory where we can both read and write data.

    Typical RAM chip:

CS1 and CS2 are used to enable or disable a particular RAM chip.

We have the corresponding truth table (its behavior is described below):

[Figure: Typical RAM chip — a 128 x 8 RAM with control inputs CS1 (chip select 1), CS2 (chip select 2), RD (read), WR (write), a 7-bit address input AD7, and a bidirectional 8-bit data bus.]


The RAM is enabled when CS1 is 1 and CS2 is 0; otherwise the operation is inhibited and the chip is in a high-impedance state. Even when the chip is selected, if both RD and WR are 0 no operation takes place and the RAM remains in the high-impedance state. The RD pin indicates that the RAM is being used for a read operation. Similarly, the WR pin indicates that a write operation is being performed. If both WR and RD are high, we choose the read operation; otherwise we would have data inconsistency.

Since we have a 128 x 8 RAM, we have 128 words, each 8 bits long.

Thus we need an 8-bit data bus to transfer the data, and it is a bidirectional 8-bit data bus.

To access 128 words we need 7 address bits, since 2^7 = 128.
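The chip-select and read/write behaviour described above can be sketched as a small model. This is a simplified sketch of the function table, not a hardware description; the priority given to RD when both RD and WR are high follows the text above.

```python
# Sketch of the 128 x 8 RAM chip behaviour: the chip responds only when
# CS1 = 1 and CS2 = 0; RD reads a word onto the data bus, WR writes one.
WORDS, WIDTH = 128, 8            # 2**7 = 128 words, so 7 address bits
assert WORDS == 2 ** 7

ram = [0] * WORDS

def ram_chip(cs1, cs2, rd, wr, addr, data_in=0):
    if not (cs1 == 1 and cs2 == 0):
        return None              # chip not selected: inhibit / high-impedance bus
    if rd:                       # read has priority when both RD and WR are set
        return ram[addr & 0x7F]  # 7-bit address mask
    if wr:
        ram[addr & 0x7F] = data_in & 0xFF   # 8-bit word
    return None                  # RD = WR = 0: no operation, high impedance

ram_chip(1, 0, rd=0, wr=1, addr=5, data_in=0xAB)   # write 0xAB to word 5
assert ram_chip(1, 0, rd=1, wr=0, addr=5) == 0xAB  # read it back
assert ram_chip(0, 1, rd=1, wr=0, addr=5) is None  # chip disabled
```

Returning `None` stands in for the high-impedance state: a deselected chip drives nothing onto the shared data bus, which is what lets several chips share one bus.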

    Integrated c