basics and architectures. 2 risc and cisc processor

Basics and Architectures

2

RISC and CISC Processor

3

IntroductionThere are two fundamentally different ways of

designing CPUsThe CPU can be designed to have an instruction set

with:very basic instructions ORa wide range of complex instructions

Exercise – List typical instructions for each case add, move-data etc multiply, dsp orientated instructions etc

4

CISC Processor• Complex Instruction Set Computer (CISC).• The primary goal of CISC architecture is to complete a task in as few lines of assembly as

possible.• This is achieved by building processor hardware that is capable of understanding and

executing a series of operations.• For this particular task, a CISC processor would come prepared with a specific instruction

(we'll call it "MULT"). When executed, this instruction loads the two values into separate registers, multiplies the operands in the execution unit, and then stores the product in the appropriate register. Thus, the entire task of multiplying two numbers can be completed with one instruction:

MULT data1, data2

• MULT is what is known as a "complex instruction." It operates directly on the computer's memory banks and does not require the programmer to explicitly call any loading or storing functions. It closely resembles a command in a higher level language. For instance, if we let "a" represent the value of “data1” and "b" represent the value of “data2”, then this command is identical to the C statement "a = a * b."

5

• One of the primary advantages of this system is that the compiler has to do very little work to translate a high-level language statement into assembly. Because the length of the code is relatively short, very little RAM is required to store instructions.

6

RISC Processor• Reduce Instruction Set Computer (RISC).• RISC processors only use simple instructions that can be executed within

one clock cycle. • Thus, the "MULT" command described above could be divided into three

separate commands: "LOAD," which moves data from the memory bank to a register, "PROD," which finds the product of two operands located within the registers, and "STORE," which moves data from a register to the memory banks. In order to perform the exact series of steps described in the CISC approach, a programmer would need to code four lines of assembly:

LOAD A, data1LOAD B, data2PROD A, BSTORE data1, A

7

CISC v/s RISC

CISC RISC• Emphasis on hardware. • Emphasis on software.

• Includes multi-clock complex instructions

• Single-clock, reduced instruction only

• Memory-to-memory:"LOAD" and "STORE"incorporated in instructions

• Register to register:"LOAD" and "STORE"are independent instructions

• Small code sizes,high cycles per second

• Low cycles per second,large code sizes

• Transistors used for storingcomplex instructions

• Spends more transistorson memory registers

8

Conclusion CISC v/s RISC

• At first, this may seem like a much less efficient way of completing the operation. Because there are more lines of code, more RAM is needed to store the assembly level instructions. The compiler must also perform more work to convert a high-level language statement into code of this form.

• However, the RISC strategy also brings some very important advantages. Because each instruction requires only one clock cycle to execute, the entire program will execute in approximately the same amount of time as the multi-cycle "MULT" command.

• These RISC "reduced instructions" require less transistors of hardware space than the complex instructions, leaving more room for general purpose registers. Because all of the instructions execute in a uniform amount of

time (i.e. one clock), pipelining is possible.

9

Performance Equation

The following equation is commonly used for expressing a computer's performance ability:

The CISC approach attempts to minimize the number of instructions per program, sacrificing the number of cycles per instruction.

RISC does the opposite, reducing the cycles per instruction at the cost of the number of instructions per program.

10

The Overall RISC Advantage • Today, the Intel x86 is arguable the only chip which retains CISC

architecture. This is primarily due to advancements in other areas of computer technology.

• The price of RAM has decreased dramatically. In 1977, 1MB of DRAM cost about $5,000. By 1994, the same amount of memory cost only $6 (when adjusted for inflation).

• Compiler technology has also become more sophisticated, so that the RISC use of RAM and emphasis on software has become ideal.

PIPELINING CONCEPT

Pipelining Concept

DecodeFetch Execute

• The instruction is fetched from memory

• The instruction is decoded and the data path

• control signals prepared for the next cycle

• The operands are read from the register bank, shifted, combined in the ALU and

• The result written back

3 stage Pipelining

• That is basically 3 stage pipelining, – Fetch– Decode– Execute

How it works????

3 Basic Architecture

1. Super Scalar Architecture/Von-Neuman Architecture

2. Very Long Instruction word Architecture (VLIW)/ HARWARD Architecture

3. Single Instruction Multiple data (SIMD)

Superscalar Architecture

• Superscalar processing is the ability to initiate multiple instructions during the same clock cycle.

• A superscalar architecture consists of a number of pipelines that are working in parallel.

• Depending on the number and kind of parallel units available, a certain number of instructions can be executed in parallel.

• It is known as Instruction level parallelism (ILP).

Degree of Superscalar Architecture

Superscalar Execution

Advantage of Superscalar Architecture

• Maximum Performance • Propagation delay is less.

Example of Superscalar Architecture• Athlon(AMD) processor has these type of Architecture

Disadvantage of Superscalar Architecture

• Problem in Scheduling.

Very Long Instruction Word (VLIW) Architecture/ HARWARD Architecture

• Problem in Superscalar Architecture :– Problem of Scheduling which operation and which one have to wait

and has done sequentially after others.

• A typical VLIW (very long instruction word) machine has instruction words hundreds of bits in length.

• Every VLIW contains enough bits to specify not just one

machine operation but several to be performed simultaneously.

How it can be achieved ?

• It’s name implies: a machine language instruction format that is fixed in length(as in RISC Arch.) but much longer than 32 bit to 64 bits.

• The format of VLIW word has fixed length and each instructions are scheduled to execute. Now compiler has to more work than a control unit.

• It is compiler’s job to analyze the program for data and resources and pack the slots of each VLIW with as many concurrently executable operations as possible

Ope

ratio

n 1

Ope

rand

1A

Ope

rand

1B

Ope

rand

1C

Ope

ratio

n 2

Ope

rand

2A

Ope

rand

2B

Ope

rand

2C

Ope

ratio

n 3

Ope

rand

3A

Ope

rand

3B

Ope

rand

3C

The Word format of VLIW Architecture

Advantages of VLIW

• Reducing delay by Scheduling.• Allow to work the system at higher frequency.• It can execute more operation per cycle.• Increases CPU Clock speed.

• Poor code density ( Number of bits required to store each instruction).

• Wasted bit fields.• Take up more space.

Disadvantages of VLIW

SIMD Architecture

• SIMD stands for single instruction, multiple data.

• It is the ability to do the same instruction on multiple pieces of data. This can lead to significantly better performance, since it is using less cycles with the same amount of data.

• This is done by packing the numbers into a vector. For instance, to add {1,2,3,4} to {5,6,7,8} you could add 1 to 5, 2 to 6, and so on. Or you could use SIMD and add 1,2,3,4 to 5,6,7,8, and the result gets stored, when you can then unpack it and store it in a register. This is why it is sometimes called vector math.

• SIMD has far-reaching applications; although the bulk and focus has been on multimedia. Why? Because it is an area of computing that needs as much computing power as possible, is popular, and in most cases, it is necessary to compute a lot of data at once.

• SIMD works by performing an instruction on multiple pieces of data at the same time by packing a vector with data and sending each data parallel to one another.

Application of SIMD

• Two such real world examples of usage of SIMD are Fast-Fourier Transform (FFT) and three-dimensional transform.

• Fast Fourier Transform is used primarily with applications dealing with waveforms, such as Digital Signal Processors (DSPs), radar, sonar, and more popularly, audio/mpeg encoding (MP3s).

basics and architectures. 2 risc and cisc processor

Documents