execution control comp

Upload: priyambharathi73981

Post on 03-Apr-2018


  • 7/29/2019 Execution Control Comp


    Execution Control


    Execution control refers to the rules or mechanisms used in a processor for determining the next instruction to execute.

    Several features of execution control are covered here: hardware looping, interrupt handling, stacks, and relative branch support.


    Hardware Looping

    DSP algorithms frequently involve the repetitive execution of a small number of instructions, the so-called inner loops or kernels.

    Problems associated with traditional approaches to repeated instruction execution:

    A natural approach to looping uses a branch instruction to jump back to the start of the loop.

    Because most loops execute a fixed number of times, the processor must usually use a register to maintain the loop index, that is, the count of the number of times the processor has been through the loop.


    DSP processors have evolved to avoid these problems via hardware looping, also known as zero-overhead looping.

    Hardware loops are special hardware control constructs that repeat either a single instruction or a group of instructions some number of times.

    The key difference between hardware loops and software loops is that hardware loops lose no time incrementing or decrementing counters, checking to see if the loop is finished, or branching back to the top of the loop. This can result in considerable savings.


    The software loop takes roughly three times as long to execute, assuming that all instructions execute in one instruction cycle.

    In fact, branch instructions usually take several cycles to execute, so the hardware looping advantage is usually even larger.
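
    The comparison above can be illustrated with a rough cycle-count model. This is only a sketch: it assumes a one-instruction loop body and a software loop that adds one decrement and one branch instruction per pass, and it ignores loop set-up cost. The function names and default cycle counts are illustrative assumptions, not figures for any particular processor.

```python
def software_loop_cycles(iterations, body_cycles=1, decrement_cycles=1, branch_cycles=1):
    """Cycles for a loop that decrements a counter and branches back on every pass."""
    return iterations * (body_cycles + decrement_cycles + branch_cycles)


def hardware_loop_cycles(iterations, body_cycles=1):
    """Cycles for a zero-overhead loop: only the loop body costs time."""
    return iterations * body_cycles


n = 256
# Ratio with a single-cycle branch, then with an assumed three-cycle branch.
print(software_loop_cycles(n) / hardware_loop_cycles(n))
print(software_loop_cycles(n, branch_cycles=3) / hardware_loop_cycles(n))
```

    With these assumptions the ratio is 3.0 for a single-cycle branch and 5.0 for a three-cycle branch, which matches the "roughly three times, usually more" estimate.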


    Types of Hardware Looping

    Single-instruction hardware loops and multi-instruction hardware loops


    Single-Instruction Hardware Loops

    A single-instruction hardware loop repeats a single instruction some number of times.

    Example: Texas Instruments TMS320C2x

    A multi-instruction hardware loop repeats a group of instructions some number of times.

    Example: Analog Devices ADSP-21xx


    Single-Instruction Hardware Loops

    Because a single-instruction hardware loop executes one instruction repeatedly, the instruction needs to be fetched from program memory only once.

    The program bus can then be used for accessing memory for purposes other than fetching an instruction, e.g., for fetching data or coefficient values that are stored in program memory.


    Multi-Instruction Hardware Loops

    A multi-instruction hardware loop must refetch the instructions in the block of code being repeated each time the processor proceeds through the loop.

    As a result, the processor's program bus is not available to access other data.


    Loop Repetition Count

    A feature that differentiates processors' hardware looping capabilities is the minimum and maximum number of times a loop can be repeated.

    Almost all processors support a minimum repetition count of one, and 65,536 is a common upper limit.

    Frequently, the maximum number of repetitions is lower if the repetition count is specified using immediate data, simply because the processor may place restrictions on the size of immediate data words.


    Loop Repetition Count

    A hardware looping pitfall found on some processors (e.g., Motorola DSP5600x and Zoran ZR3800x) is the following: a loop count of zero causes the processor to repeat the loop the maximum number of times.

    While this is not a problem for loops whose size is fixed, it can be a problem if the program dynamically determines the repetition count at run time.
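
    The zero-count pitfall can be sketched in a few lines. The model below assumes a 16-bit loop counter on which a count of zero is treated as the maximum (65,536); the counter width and the function names are assumptions made for illustration, not a description of any specific processor.

```python
COUNTER_BITS = 16                 # assumed width of the loop counter
MAX_REPEATS = 1 << COUNTER_BITS   # 65,536 repetitions


def hardware_repeat_count(count):
    """Model a loop counter on which a count of zero means 'maximum'."""
    count &= MAX_REPEATS - 1
    return MAX_REPEATS if count == 0 else count


def guarded_repeat_count(count):
    """Software guard: skip the loop entirely when the requested count is zero."""
    return 0 if count == 0 else hardware_repeat_count(count)


print(hardware_repeat_count(0), guarded_repeat_count(0))
```

    A run-time count that happens to be zero repeats the loop 65,536 times in the unguarded model, so code that computes the count dynamically would test for zero before entering the loop.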


    Loop Effects on Interrupt Latency

    Single-instruction hardware loops disable interrupts for the duration of their execution.

    System designers making use of both single-instruction hardware loops and interrupts must carefully consider the maximum interrupt lockout time they can accept and code their single-instruction loops accordingly.

    Alternatives: use a multi-instruction hardware loop around the single instruction, or break the single-instruction loop up into a number of smaller loops.
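
    Breaking one long loop into smaller ones can be sketched as follows. The helper below is hypothetical; it assumes that interrupts can be serviced between the smaller loops and that the chunk size has already been chosen to fit the acceptable lockout time.

```python
def split_repeat(total, max_chunk):
    """Split `total` repetitions into loops of at most `max_chunk` repetitions each."""
    if total < 0 or max_chunk < 1:
        raise ValueError("total must be >= 0 and max_chunk must be >= 1")
    full, rest = divmod(total, max_chunk)
    return [max_chunk] * full + ([rest] if rest else [])


# 1000 repetitions with at most 256 repetitions per uninterruptible loop.
print(split_repeat(1000, 256))
```

    The example prints [256, 256, 256, 232]: four loops that together perform the original 1000 repetitions, with a point between each pair where a pending interrupt could be taken.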


    Nesting Depth

    A nested loop is one loop placed within another.

    The most common approaches to hardware loop nesting are:

    Directly nestable

    Partially nestable

    Software nestable

    Nonnestable


    Hardware-Assisted Software Loops

    An alternative to nesting hardware loops is to nest a single hardware loop within a software loop that uses specialized instructions for software looping.

    TMS320C3x and TMS320C4x support a decrement-and-branch-if-not-zero instruction.

    While this instruction costs one instruction cycle on the TMS320C3x and TMS320C4x and three on the DSP16xx, the number of cycles is less than would be required to execute individual decrement, test, and branch instructions.

    This is a reasonable compromise between the hardware cost required to support nested hardware loops and the execution time cost of implementing loops entirely in software.
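
    A rough cost model for this arrangement is sketched below. It assumes a single-cycle loop body, an inner zero-overhead hardware loop, and an outer software loop whose only overhead is the decrement-and-branch instruction; any cost of re-arming the hardware loop on each outer pass is ignored. The function name and the loop sizes are illustrative assumptions.

```python
def nested_loop_cycles(outer, inner, body_cycles=1, outer_overhead_cycles=1):
    """Outer software loop wrapped around an inner zero-overhead hardware loop."""
    return outer * (inner * body_cycles + outer_overhead_cycles)


# 16 outer passes over a 64-repetition inner loop.
print(nested_loop_cycles(16, 64, outer_overhead_cycles=1))  # one-cycle decrement-and-branch
print(nested_loop_cycles(16, 64, outer_overhead_cycles=3))  # three-cycle decrement-and-branch
```

    Under these assumptions the totals are 1040 and 1072 cycles, against 1024 cycles of pure loop body, so the outer-loop overhead stays small as long as the inner loop does most of the work.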


    Interrupts

    An interrupt is an external event that causes the processor to stop executing its current program and branch to a special block of code called an interrupt service routine.

    All DSP processors support interrupts, and most use interrupts as their primary means of communicating with peripherals.


    Interrupt Sources

    On-chip peripherals: generate interrupts when certain conditions are met

    External interrupt lines: asserted by external circuitry to interrupt the processor

    Software interrupts: generated either under software control or due to a software-initiated operation


    Interrupt Vectors

    All processors associate a different memory address with each interrupt. These locations are called interrupt vectors.

    Processors that provide different locations for each interrupt are said to support vectored interrupts.

    Typical interrupt vectors are one or two words long and are located in low memory.

    The interrupt vector usually contains a branch or subroutine call instruction that causes the processor to begin execution of the interrupt service routine located elsewhere in memory.

    On some processors, interrupt vector locations are spaced apart by several words.
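
    The layout of a vector table can be sketched as simple address arithmetic. The base address, the spacing values, and the function name below are hypothetical; real processors fix these in hardware or through a control register.

```python
def vector_address(index, base=0x0000, spacing=2):
    """Address of the interrupt vector for interrupt number `index`."""
    if index < 0:
        raise ValueError("interrupt index must be non-negative")
    return base + index * spacing


# Two-word vectors in low memory, then vectors spaced eight words apart.
print([hex(vector_address(i)) for i in range(4)])
print([hex(vector_address(i, spacing=8)) for i in range(4)])
```

    With two-word spacing each vector has room only for a branch or call; wider spacing leaves several words at each vector location.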


    Interrupt Enables

    All processors provide mechanisms to globally disable interrupts.

    On some processors, this may be via special interrupt enable and interrupt disable instructions, while on others it may involve writing to a special control register.

    Most processors also provide individual interrupt enables for each interrupt source. This allows the processor to selectively decide which interrupts are allowed at any given time.


    Interrupt Priorities and Automatically Nestable Interrupts

    Prioritized interrupts: some interrupts have a higher priority than others.

    If two interrupts occur at the same time, the one with the higher priority is serviced first, and the one with the lower priority must wait for the processor to finish servicing the higher-priority interrupt.

    When a processor allows a higher-priority interrupt to interrupt a lower-priority interrupt that is already executing, we call this automatically nestable interrupts.
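
    The interaction of enables, priorities, and nesting can be sketched as a small selection function. In this sketch interrupts are identified by their priority number and lower numbers mean higher priority; both conventions, and the function name, are assumptions chosen for illustration.

```python
def next_interrupt(pending, enabled, global_enable=True, current_priority=None):
    """Return the priority number of the interrupt to service next, or None.

    `pending` and `enabled` are sets of priority numbers; lower means more urgent.
    `current_priority` is the interrupt already being serviced, if any.
    """
    if not global_enable:
        return None
    candidates = [p for p in pending if p in enabled]
    if not candidates:
        return None
    best = min(candidates)
    # With nestable interrupts, only a strictly higher-priority request preempts.
    if current_priority is not None and best >= current_priority:
        return None
    return best


print(next_interrupt({2, 5}, {2, 5}))                  # both pending: 2 wins
print(next_interrupt({3}, {3}, current_priority=1))    # lower priority must wait
print(next_interrupt({0}, {0}, current_priority=1))    # higher priority nests
```

    The three calls print 2, None, and 0 respectively.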


    Interrupt Latency

    Interrupt latency is the amount of time between an interrupt occurring and the processor doing something in response to it.

    Formal definition: The minimum time from the assertion of an external interrupt line to the execution of the first word of the interrupt vector that can be guaranteed under certain assumptions.


    Interrupt Latency

    The details of and assumptions used in this definition are as follows:

    Most processors sample the status of external interrupt lines every instruction cycle. For an interrupt to be recognized as occurring in a given instruction cycle, the interrupt line must be asserted some amount of time prior to the start of the instruction cycle; this time is referred to as the set-up time. We assume that these set-up time requirements are missed. This lengthens interrupt latency by one instruction cycle.


    Interrupt Latency

    Depending on the processor, synchronization can add from one to three instruction cycles to the processor's interrupt latency.

    We assume the processor is in an interruptible state, which typically means that it is executing the shortest interruptible instruction possible.


    Mechanisms for Reducing Interrupts

    Some processors provide "autobuffering" on their serial ports.

    This feature allows the serial port to save its received data directly to the processor's memory without interrupting the processor.

    After a certain number of samples have been transferred, the serial port interrupts the processor.

    This is a specialized form of DMA.
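
    The effect of autobuffering on the interrupt rate can be illustrated with a toy model. The class below is hypothetical: a callback stands in for the interrupt service routine, and a Python list stands in for the region of memory the serial port writes to.

```python
class AutoBufferedPort:
    """Toy serial port that stores samples in memory and interrupts once per block."""

    def __init__(self, block_size, on_block):
        self.block_size = block_size
        self.on_block = on_block      # stands in for the interrupt service routine
        self.buffer = []

    def receive(self, sample):
        self.buffer.append(sample)    # stored without involving the processor
        if len(self.buffer) == self.block_size:
            block, self.buffer = self.buffer, []
            self.on_block(block)      # one interrupt per block, not per sample


blocks = []
port = AutoBufferedPort(block_size=4, on_block=blocks.append)
for sample in range(10):
    port.receive(sample)
print(len(blocks), port.buffer)
```

    Ten received samples cause two "interrupts" instead of ten, with the last two samples still waiting in the buffer.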


    Stacks

    Processor stack support is closely tied to execution control. For example, subroutine calls typically place their return address on the stack, while interrupts typically use the stack to save both return address and status information.

    DSP processors typically provide one of three kinds of stack support:

    Shadow registers: Shadow registers are dedicated backup registers that hold the contents of key processor registers during interrupt processing.

    Hardware stack: A hardware stack holds selected registers during interrupt processing or subroutine calls.

    Software stack: A software stack is a conventional stack using the processor's main memory to store values during interrupt processing or subroutine calls.


    Relative Branch Support

    All DSP processors support branch or jump instructions as one of their most basic forms of execution control.

    In PC-relative branching, the address to which the processor is to branch is specified as an offset from the current address.

    PC-relative addressing is useful for creating position-independent programs, that is, programs that can be relocated in memory when they are loaded into the processor's memory.

    In addition to supporting position-independent code, PC-relative branching can also save program memory in certain situations.
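
    Both properties can be seen in a short sketch. It assumes the offset is measured from the address of the branch instruction itself (some processors measure from the following instruction instead); the addresses, the 8-bit field width, and the function names are hypothetical.

```python
def relative_branch_target(pc, offset):
    """Target of a PC-relative branch located at address `pc`."""
    return pc + offset


def fits_in_field(offset, bits):
    """True if a signed offset fits in an offset field `bits` wide."""
    return -(1 << (bits - 1)) <= offset < (1 << (bits - 1))


BRANCH_AT, LOOP_START = 0x20, 0x08      # positions inside the program (hypothetical)
OFFSET = LOOP_START - BRANCH_AT         # encoded once, when the program is assembled

for load_address in (0x0100, 0x0500):   # the same code loaded at two different places
    target = relative_branch_target(load_address + BRANCH_AT, OFFSET)
    print(hex(target), target == load_address + LOOP_START)

print(fits_in_field(OFFSET, 8), fits_in_field(0x0508, 8))
```

    The same encoded offset reaches the loop start at either load address, and the short offset fits in an 8-bit field where the absolute address 0x0508 would not, which is where the program memory saving comes from.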


    Pipelining


    Pipelining is a technique for increasing the performance of a processor by breaking a sequence of operations into smaller pieces and executing these pieces in parallel when possible, thereby decreasing the overall time required to complete the set of operations.

    Unfortunately, in the process of improving performance, pipelining frequently complicates programming.


    Pipelining and Performance

    A hypothetical processor uses separate execution units to accomplish the following actions sequentially for a single instruction:

    Fetch an instruction word from memory

    Decode the instruction

    Read a data operand from or write a data operand to memory

    Execute the ALU or MAC portion of the instruction

    Assuming that each of the four stages above takes 20 ns to execute, and that they must be done sequentially, the processor completes one instruction every 80 ns.


    A pipelined implementation of this processor starts a new instruction fetch immediately after the previous instruction has been fetched.

    Similarly, it begins decoding each instruction as soon as the previous instruction is finished decoding.

    In essence, it overlaps the various stages of execution.

    As a result, the execution stages now work in parallel.
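
    The gain from overlapping stages can be worked out for the hypothetical four-stage, 20 ns processor described earlier. The sketch below assumes a perfect overlap with no stalls; the function names are illustrative.

```python
STAGES = 4        # fetch, decode, operand read/write, execute
STAGE_NS = 20     # time per stage in the hypothetical processor


def sequential_time_ns(instructions):
    """Each instruction runs all four stages before the next one starts."""
    return instructions * STAGES * STAGE_NS


def pipelined_time_ns(instructions):
    """The first instruction fills the pipeline; after that one completes per stage time."""
    if instructions == 0:
        return 0
    return (STAGES + instructions - 1) * STAGE_NS


for n in (1, 4, 100):
    print(n, sequential_time_ns(n), pipelined_time_ns(n))
```

    A single instruction still takes 80 ns either way, but 100 instructions take 8000 ns sequentially and 2060 ns pipelined, approaching one instruction every 20 ns.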


    Pipeline Depth

    Although most DSP processors are pipelined, the depth (number of stages) of the pipeline may vary from one processor to another.

    In general, a deeper pipeline allows the processor to execute faster but makes the processor harder to program.

    Most processors use three stages (instruction fetch, decode, and execute) or four stages (instruction fetch, decode, operand fetch, and execute).

    In three-stage pipelines the operand fetch is typically done in the latter part of the decode stage.


    Interlocking

    The execution sequence shown in Figure 9-2 is referred to as a perfect overlap, because the pipeline phases mesh together perfectly and provide 100 percent utilization of the processor's execution stages. In reality, processors may not perform as well as we have shown in our hypothetical example.

    The most common reason for this is resource contention.


    Interlocking

    One solution to this problem is interlocking.

    An interlocking pipeline delays the progression of the latter of the conflicting instructions through the pipeline.


    Branching Effects

    Branching effects occur whenever there is a change in program flow, and not just for branch instructions.

    For example, subroutine call instructions, subroutine return instructions, and return from interrupt instructions are all candidates for the pipeline effects described above.

    Processors offering delayed branches frequently also offer delayed returns.


    Interrupt Effects

    Interrupts typically involve a change in a program's flow of control to branch to the interrupt service routine. The pipeline often increases the processor's interrupt response time, much as it slows down branch execution.

    When an interrupt occurs, almost all processors allow instructions at the decode stage or further in the pipeline to finish executing, because these instructions may be partially executed. What occurs past this point varies from processor to processor.

    We discuss several examples below:


    Pipeline Programming Models

    The examples in the sections above have concentrated on the instruction pipeline and its behavior and interaction with other parts of the processor under various circumstances.

    Two major assembly code formats for pipelined processors:

    time-stationary

    data-stationary


    Pipeline Programming Models

    In the time-stationary programming model, the processor's instructions specify the actions to be performed by the execution units (multiplier, accumulator, and so on) during a single instruction cycle.

    A good example is the AT&T DSP16xx family, where a multiply-accumulate instruction looks like:

    a0=a0+p p=x*y y=*r0++ x=*pt++
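
    The time-stationary idea can be illustrated with a toy model of this multiply-accumulate instruction. The sketch reads the last field as a load of the x register through the pt pointer, which is an assumption, and it treats every field as acting on the register values from before the cycle. It is not a cycle-accurate model of the DSP16xx; the function and variable names are illustrative.

```python
def mac_cycle(state, data_mem, coeff_mem):
    """One cycle of: a0=a0+p  p=x*y  y=*r0++  x=*pt++ (each field uses the old values)."""
    new = dict(state)
    new["a0"] = state["a0"] + state["p"]     # accumulate the previous product
    new["p"] = state["x"] * state["y"]       # multiply the previously loaded operands
    new["y"] = data_mem[state["r0"]]         # load the next data value
    new["x"] = coeff_mem[state["pt"]]        # load the next coefficient
    new["r0"] = state["r0"] + 1
    new["pt"] = state["pt"] + 1
    return new


def dot_product(data, coeffs):
    """Repeat the same instruction; assumes `data` and `coeffs` have equal length."""
    state = {"a0": 0, "p": 0, "x": 0, "y": 0, "r0": 0, "pt": 0}
    # Two extra zero entries let the pipeline drain: the last product needs one
    # more cycle to form and another to be accumulated.
    data_mem = list(data) + [0, 0]
    coeff_mem = list(coeffs) + [0, 0]
    for _ in range(len(data) + 2):
        state = mac_cycle(state, data_mem, coeff_mem)
    return state["a0"]


print(dot_product([1, 2, 3], [4, 5, 6]))
```

    The result is 32, but only after two extra cycles: in any given cycle the accumulator, the multiplier, and the loads are working on three different pairs of operands, and the programmer has to account for that explicitly.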


    Pipeline Programming Models

    Data-stationary programming specifies the operations that are to be performed, but not the exact times during which the actions are to be executed.

    As an example, consider the following AT&T DSP32xx instruction:

    a1 = a1 + (*r5++ = *r4++) * *r3++
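
    What this instruction asks for can be written out as a small functional model. The sketch uses list indices in place of addresses and assumes that r3 and r5 do not point at the same location; it says nothing about the cycles in which the individual steps happen, which is the point of the data-stationary style. The function name is illustrative.

```python
def dsp32_mac(a1, mem, r3, r4, r5):
    """Model of: a1 = a1 + (*r5++ = *r4++) * *r3++

    Returns the new accumulator value and the post-incremented pointers.
    """
    value = mem[r4]              # read the operand through r4
    mem[r5] = value              # copy it to the location r5 points at
    a1 = a1 + value * mem[r3]    # multiply by the operand at r3 and accumulate
    return a1, r3 + 1, r4 + 1, r5 + 1


mem = [2.0, 3.0, 0.0]
a1, r3, r4, r5 = dsp32_mac(1.0, mem, r3=0, r4=1, r5=2)
print(a1, mem, (r3, r4, r5))
```

    The single line of assembly describes everything that happens to one set of operands (7.0 in the accumulator, the value 3.0 copied to mem[2], all three pointers advanced), and the hardware decides how those steps are spread across pipeline stages.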