Pipelining and Co-processor

What is Pipelining

In simple terms, pipelining means starting the execution of the second operation before the first has completed.

Overview

Pipelining is widely used in modern processors.

Pipelining improves system performance in terms of throughput.

Pipelined organization requires sophisticated compilation techniques.

Basic Concept

Faster Execution

Multi Tasking

Making the Execution of Programs Faster

Use faster circuit technology to build the processor and the main memory.

Arrange the hardware so that more than one operation can be performed at the same time.

In the latter way, the number of operations performed per second is increased even though the elapsed time needed to perform any one operation is not changed.

Traditional Pipeline Concept

Laundry Example

A, B, C, and D each have one load of clothes to wash, dry, and fold.

“Washer” takes 30 minutes

“Dryer” takes 40 minutes

“Folder” takes 20 minutes


Sequential laundry takes 6 hours for 4 loads

If they learned pipelining, how long would laundry take?

[Figure: sequential laundry timeline from 6 PM to midnight. Loads A, B, C, and D run one after another, each taking 30 + 40 + 20 minutes.]


Pipelined laundry takes 3.5 hours for 4 loads

[Figure: pipelined laundry timeline from 6 PM to 9:30 PM. Loads A, B, C, and D in task order overlap, giving successive intervals of 30, 40, 40, 40, 40, and 20 minutes.]
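The two laundry totals can be checked with a short calculation. This is a minimal sketch, assuming a load may wait between machines until the next machine is free; the function names are illustrative and not part of the original slides.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load runs through every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """A load enters a stage as soon as both the load and the stage are free."""
    # finish[s] is the time at which stage s finished its most recent load.
    finish = [0] * len(stage_minutes)
    for _ in range(loads):
        ready = 0  # time at which this load left the previous stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, finish[s])
            finish[s] = start + minutes
            ready = finish[s]
    return finish[-1]


stages = [30, 40, 20]  # washer, dryer, folder
print(sequential_minutes(stages, 4) / 60)  # 6.0 hours
print(pipelined_minutes(stages, 4) / 60)   # 3.5 hours
```

The 40-minute dryer is the slowest stage, so it sets the pace once the pipeline is full.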

Use the Idea of Pipelining in a Computer

[Figure: Basic idea of instruction pipelining. (a) Sequential execution: instructions I1, I2, and I3 run as F1 E1 F2 E2 F3 E3 along the time axis. (c) Pipelined execution: over clock cycles 1 to 4, I1 runs F1 E1, I2 runs F2 E2 one cycle later, and I3 runs F3 E3 one cycle after that.]

Fetch + Execution
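The overlap of fetch and execute can be sketched in a few lines. This is an illustrative model only, assuming every stage takes exactly one clock cycle and there is at least one instruction; the function names are made up for this example.

```python
def sequential_cycles(n_instructions, n_stages=2):
    """Every instruction finishes all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=2):
    """The first instruction fills the pipeline, then one completes per cycle."""
    return n_stages + (n_instructions - 1)


def pipeline_chart(n_instructions, stage_names=("F", "E")):
    """Text chart: each instruction starts one cycle after the previous one."""
    rows = []
    for i in range(n_instructions):
        cells = ["  "] * i + [f"{name}{i + 1}" for name in stage_names]
        rows.append(f"I{i + 1}: " + " ".join(cells))
    return "\n".join(rows)


print(sequential_cycles(3))  # 6
print(pipelined_cycles(3))   # 4
print(pipeline_chart(3))
```

For three instructions the pipelined version needs 4 clock cycles instead of 6.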

Role of Cache Memory

Each pipeline stage is expected to complete in one clock cycle.

The clock period should be long enough to let the slowest pipeline stage complete.

Faster stages must wait for the slowest one to complete.

Since main memory is very slow compared to execution, the pipeline would be almost useless if every instruction had to be fetched from main memory.

Fortunately, we have cache.
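The effect of the slowest stage on the clock can be illustrated with made-up numbers. The delays below are hypothetical values chosen only to show the idea; they are not measurements from any real processor.

```python
def clock_period_ns(stage_delays_ns):
    """The clock must be long enough for the slowest stage."""
    return max(stage_delays_ns.values())


def idle_ns_per_cycle(stage_delays_ns):
    """Time each stage spends waiting for the slowest stage in every cycle."""
    period = clock_period_ns(stage_delays_ns)
    return {name: period - delay for name, delay in stage_delays_ns.items()}


# Hypothetical delays: fetch served by a cache versus by main memory.
with_cache = {"fetch": 2, "decode": 2, "execute": 2, "write": 2}
without_cache = {"fetch": 20, "decode": 2, "execute": 2, "write": 2}

print(clock_period_ns(with_cache))       # 2
print(clock_period_ns(without_cache))    # 20
print(idle_ns_per_cycle(without_cache))  # decode, execute, write idle 18 ns each
```

With the slow fetch, the other three stages sit idle for most of every cycle, which is why a cache matters so much to a pipelined processor.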

Pipeline Performance

The potential increase in performance resulting from pipelining is proportional to the number of pipeline stages.

However, this increase would be achieved only if all pipeline stages require the same time to complete, and there is no interruption throughout program execution.

Unfortunately, these conditions rarely hold in practice.
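The ideal case can be written as a short formula. The symbols here are assumptions introduced for illustration: k is the number of pipeline stages, n the number of instructions, and τ the clock period, with every stage taking exactly one clock cycle and no stalls.

```latex
T_{\text{sequential}} = n \, k \, \tau
\qquad
T_{\text{pipelined}} = (k + n - 1) \, \tau

S(n, k) = \frac{T_{\text{sequential}}}{T_{\text{pipelined}}}
        = \frac{n k}{k + n - 1}
\qquad
\lim_{n \to \infty} S(n, k) = k
```

For long instruction streams the speedup approaches k, which is the sense in which the potential gain is proportional to the number of stages.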


Figure 8.4. Pipeline stall caused by a cache miss in F2.

(a) Instruction execution steps in successive clock cycles (clock cycles 1 to 9, instructions I1, I2, I3)

(b) Function performed by each processor stage in successive clock cycles

Clock cycle   1     2     3     4     5     6     7     8     9
F: Fetch      F1    F2    F2    F2    F2    F3
D: Decode           D1    idle  idle  idle  D2    D3
E: Execute                E1    idle  idle  idle  E2    E3
W: Write                        W1    idle  idle  idle  W2    W3

Idle periods – stalls (bubbles)
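The cost of the cache miss in Figure 8.4 can be reproduced with a small model. This is a simplified sketch, assuming instructions pass through the stages in order and a stage is busy for the number of cycles given; the 4-cycle fetch for the second instruction stands in for the cache miss.

```python
def completion_cycle(latencies):
    """latencies[i][s] is the number of cycles instruction i spends in stage s.

    Returns the clock cycle in which the last instruction leaves the last stage.
    """
    n_stages = len(latencies[0])
    stage_free = [0] * n_stages  # cycle at which each stage becomes free
    for instruction in latencies:
        ready = 0  # cycle at which this instruction left the previous stage
        for s, cycles in enumerate(instruction):
            start = max(ready, stage_free[s])
            stage_free[s] = start + cycles
            ready = stage_free[s]
    return stage_free[-1]


# Stages: Fetch, Decode, Execute, Write.
no_miss = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
miss_in_f2 = [[1, 1, 1, 1], [4, 1, 1, 1], [1, 1, 1, 1]]

print(completion_cycle(no_miss))     # 6
print(completion_cycle(miss_in_f2))  # 9
```

The three extra fetch cycles show up as three idle cycles in each of the later stages, so the three instructions finish in cycle 9 instead of cycle 6.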


Figure 8.5. Effect of a Load instruction on pipeline timing.

[Figure: instructions I1 to I5 over clock cycles 1 to 7. I2 is the Load instruction, Load X(R1), R2, and it has an additional memory access step, M2, between E2 and W2. The figure is labelled as a structural hazard.]


Again, pipelining does not result in individual instructions being executed faster; rather, it is the throughput that increases.

Throughput is measured by the rate at which instruction execution is completed.

Pipeline stalls degrade pipeline performance.

We need to identify all hazards that may cause the pipeline to stall and to find ways to minimize their impact.

Pipeline Hazards

There are situations, called hazards, that prevent the next instruction in the instruction stream from executing during its designated cycle.

There are three classes of hazards:

Structural hazard: resource conflicts when the hardware cannot support all possible combinations of instructions simultaneously.

Data hazard: an instruction depends on the results of a previous instruction.

Branch hazard: instructions that change the PC.
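A data hazard of the kind described above can be spotted by comparing register names. The sketch below is illustrative only: the `Instr` record, the `raw_hazards` function, and the three-instruction program are hypothetical, and the destination register is written last, following the `Load X(R1), R2` style used in the figures.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str
    dest: str
    sources: tuple


def raw_hazards(program, window=1):
    """Find instructions that read a register written by one of the
    `window` instructions immediately before them."""
    hazards = []
    for i, later in enumerate(program):
        for earlier in program[max(0, i - window):i]:
            if earlier.dest in later.sources:
                hazards.append((earlier.text, later.text, earlier.dest))
    return hazards


program = [
    Instr("Mul R2, R3, R4", dest="R4", sources=("R2", "R3")),
    Instr("Add R5, R4, R6", dest="R6", sources=("R5", "R4")),
    Instr("Sub R7, R8, R9", dest="R9", sources=("R7", "R8")),
]

for earlier, later, register in raw_hazards(program):
    print(f"{later!r} needs {register} from {earlier!r}")
```

Here only the Add depends on the Mul, through R4; the Sub is independent and could proceed without waiting.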

Pipeline Stall

When a hazard prevents an instruction step from happening, the processor pauses execution of that step until the hazard is resolved.

Pipeline stalls slow the execution of an instruction, but do not prevent it from executing correctly.

CO-PROCESSORS

WHAT IS A CO-PROCESSOR

A computer co-processor is a processor used to supplement the functions of the primary processor.

First seen on mainframe computers.

Accelerates system performance.

HISTORY OF CO-PROCESSOR

Co-processors for floating point arithmetic first appeared in desktop computers in the 1970s.

Coprocessors became common in the 1980s and into the early 1990s.

Early 8-bit and 16-bit processors used software to carry out floating point arithmetic operations.

Math co-processors were a popular purchase for users of computer-aided design (CAD) software and for scientific and engineering calculations.

OPERATIONS PERFORMED BY COPROCESSOR

Floating point arithmetic

Graphics and signal processing

String processing

Encryption

Many coprocessors are unable to fetch code from memory themselves, so they work under the control of the main processor.

Architecture of 8087

INTEL 8087

Numeric processor.

Packaged in a 40-pin ceramic DIP.

Available in 5 MHz, 8 MHz, and 10 MHz versions.

Compatible with the 8086, 8088, 80186, and 80188.

It adds 68 new instructions to the instruction set of the 8086.

How it works

8087 instructions may be interleaved with 8086 instructions in the same program. It is the task of the 8086 to identify the 8087 instructions in the program and pass them to the 8087 for execution. After the execution cycle completes, the result may be passed back to the CPU.

Operation of 8087 does not require any software support from the system software or operating system.

Architecture of 8087

Two major sections:

1) Control unit

2) Numeric Execution unit

Control Unit

Functions:

It interfaces the coprocessor to the microprocessor system data bus.

It monitors the instruction stream. If the instruction is an ESCape (coprocessor) instruction, the coprocessor executes it; if not, the microprocessor executes it.

It receives and decodes instructions, reads and writes memory operands, and executes the 8087 control instructions.
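Recognizing an ESCape instruction comes down to a bit pattern check. On the 8086 family the ESC opcodes occupy the byte range D8h to DFh, that is, the top five bits are 11011. The sketch below tests only a single opcode byte and ignores any prefix bytes; the function name is illustrative.

```python
def is_escape_opcode(opcode_byte):
    """True if the byte is in the ESC opcode range D8h to DFh (11011xxx)."""
    return (opcode_byte & 0xF8) == 0xD8


print(is_escape_opcode(0xD9))  # True: an ESC opcode, handled by the coprocessor
print(is_escape_opcode(0x90))  # False: NOP, handled by the microprocessor
```

Exactly eight of the 256 possible opcode bytes match this pattern.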

Numeric Execution Unit (NEU)

Functions:

It executes all the numeric processor instructions.

It has a stack of eight 80-bit registers that holds the operands for arithmetic instructions and the results.

Instructions either address data in specific stack registers or use a push and pop mechanism to store and retrieve data.
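The two ways of addressing the register stack can be modeled in a few lines. This is a simplified sketch, not a description of the 8087 hardware: the class and method names are hypothetical, Python floats stand in for the 80-bit registers, and stack overflow and underflow checks are left out. It assumes that a push moves the top-of-stack pointer down by one (wrapping from 0 to 7) and a pop moves it up by one.

```python
class RegisterStack:
    SIZE = 8  # eight registers

    def __init__(self):
        self.regs = [None] * self.SIZE
        self.top = 0  # register 0 is the top of the stack after initialization

    def push(self, value):
        """Move the top pointer down by one (wrapping) and store the value."""
        self.top = (self.top - 1) % self.SIZE
        self.regs[self.top] = value

    def pop(self):
        """Return the value at the top and move the top pointer up by one."""
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) % self.SIZE
        return value

    def st(self, i):
        """Read register i counted from the current top of the stack."""
        return self.regs[(self.top + i) % self.SIZE]


stack = RegisterStack()
stack.push(1.5)
stack.push(2.5)
print(stack.st(0), stack.st(1))  # 2.5 1.5
print(stack.pop())               # 2.5
```

`st(i)` corresponds to addressing a specific stack register relative to the top, while `push` and `pop` correspond to the load and store-and-pop style of access.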

Control Word Register of 8087

Coprocessor Control Instructions

The coprocessor has control instructions for initialization, exception handling, and task switching.

All control instructions have two forms.


FINIT/FNINIT

Performs a reset (initialize) operation on the arithmetic coprocessor.

When reset or initialized, the coprocessor operates with projective closure (unsigned infinity), rounds to nearest or even, and uses extended precision.

It also sets register 0 as the top of the stack.
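The settings named above are fields of the control word, and they can be decoded with a few shifts and masks. This sketch assumes the usual x87 control word layout: exception masks in bits 0 to 5, precision control in bits 8 and 9, rounding control in bits 10 and 11, and infinity control in bit 12. The example value 0x03FF is chosen here because it matches the described state; it is an assumption of this example and is not taken from the slides.

```python
PRECISION = {
    0b00: "24-bit (single)",
    0b01: "reserved",
    0b10: "53-bit (double)",
    0b11: "64-bit (extended)",
}
ROUNDING = {
    0b00: "nearest or even",
    0b01: "down",
    0b10: "up",
    0b11: "toward zero (chop)",
}


def decode_control_word(cw):
    """Split a 16-bit control word into its main fields."""
    return {
        "exception_masks": cw & 0x3F,
        "precision": PRECISION[(cw >> 8) & 0b11],
        "rounding": ROUNDING[(cw >> 10) & 0b11],
        "infinity": "affine" if (cw >> 12) & 1 else "projective",
    }


for field, value in decode_control_word(0x03FF).items():
    print(field, "=", value)
```

For 0x03FF this reports extended precision, rounding to nearest or even, and projective infinity, with all six exception mask bits set.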


FSETPM

Changes the coprocessor to the protected-addressing mode.

Used when the microprocessor is in protected mode.

Protected mode can only be exited by a hardware reset or, in the 80386 through Pentium 4, with a change to the control register.


FLDCW

Loads the control register with the word addressed by the operand.

FSTCW

Stores the control register into the word-sized memory operand.


FSTSW AX

Copies the contents of the status register to the AX register.

Not available on the 8087.

FCLEX

Clears the error flags in the status register and also the busy flag.

Graphics Coprocessor

A high-speed display adapter that is dedicated to graphics operations such as line drawing and plotting.

A coprocessor used to accelerate the display of graphics, significantly speeding up the updating of images on a screen and freeing the CPU to take care of other tasks.

A graphics coprocessor may be incorporated into a graphics accelerator, or may be part of a separate subsystem. Also called a graphics processor.
