Processing Unit
CS206T
Microprocessors
• The density of elements on processor chips continued to rise
– More and more elements were placed on each chip so that fewer and fewer chips were needed to construct a single computer processor
Microprocessor Speed
• Techniques built into contemporary processors include
• Branch prediction – Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next
• Data flow analysis – Processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions
• Speculative execution – Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping execution engines as busy as possible
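Branch prediction can be illustrated with a 2-bit saturating counter, one common textbook scheme. The slides do not say which scheme any particular processor uses, so the class and function names below are hypothetical and the sketch is only an illustration.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes that the predictor guessed correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop branch taken 9 times, then not taken once on exit, repeated 3 times.
loop_branch = ([True] * 9 + [False]) * 3
print(f"accuracy: {accuracy(loop_branch):.2f}")
```

After a short warm-up the predictor only misses the loop exit, which is why loop-heavy code predicts well.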
Improved Performance
• Organizational enhancements to the processor can improve performance, such as
1. Use of multiple registers
2. Cache memory
3. Multicore: placing multiple processors on the same chip.
4. Pipelining: executing more than one instruction at a time. Pipelining breaks instruction execution down into several stages
Improved Performance
• Pipelining: overlapping fetch and execute speeds up processing, but the instruction rate is not doubled:
– Fetch is usually shorter than execution (cf. reading and storing operands)
• Prefetching more than one instruction?
– Any jump or branch means that prefetched instructions are not the required instructions
• Add more stages to improve performance
Pipelining (six stages)
1. Fetch instruction
2. Decode instruction
3. Calculate operands (i.e., EAs)
4. Fetch operands
5. Execute instructions
6. Write result
• Overlap these operations
Timing Diagram for Instruction Pipeline Operation (assuming independence)
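The timing diagram can be reproduced with a short sketch. It assumes every stage takes one time unit and that instructions are independent (no branches or resource conflicts). The two-letter stage abbreviations and the function names are labels chosen here for illustration.

```python
# Abbreviations for the six stages listed above (labels are an assumption).
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def timing_diagram(num_instructions, stages=STAGES):
    """Row i shows which stage instruction i occupies in each time unit."""
    total_time = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = ["  "] * total_time
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s at time i + s
        rows.append(row)
    return rows


def pipelined_time(n, k=len(STAGES)):
    """Time units for n instructions on an ideal k-stage pipeline."""
    return k + (n - 1)


def sequential_time(n, k=len(STAGES)):
    """Time units for n instructions with no overlap at all."""
    return n * k


for i, row in enumerate(timing_diagram(4), start=1):
    print(f"I{i}: " + " ".join(row))
print(pipelined_time(9), "time units pipelined vs", sequential_time(9), "sequential")
```

Under these ideal assumptions nine instructions finish in 14 time units instead of 54; real pipelines lose some of this gain to branches and dependencies.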
Intel Evolution (1)
• 8080 – first general purpose microprocessor
– 8 bit data path
– Used in first personal computer – Altair
• 8086 – much more powerful
– 16 bit
– instruction cache, prefetch few instructions
– 8088 (8 bit external bus) used in first IBM PC
• 80286 – 16 Mbyte memory addressable
– up from 1 Mbyte
• 80386 – 32 bit
– Support for multitasking (running multiple programs at the same time)
Intel Evolution (2)
• 80486
– Sophisticated, powerful cache and instruction pipelining
– Built-in maths co-processor
• Pentium
– Superscalar
– Multiple instructions executed in parallel
• Pentium Pro
– Increased superscalar organization
– Aggressive register renaming
– Branch prediction
– Data flow analysis
– Speculative execution
Intel Evolution (3)
• Pentium II – MMX technology, designed to process graphics, video & audio
• Pentium III – Additional floating point instructions to support 3D graphics S/W
• Pentium 4 – Further floating point and multimedia enhancements
• Core – Dual core
– Implements two processors on one chip
• Core 2 Quad – Provides four processors on one chip
Processor organization
• Basic Elements of Processor
1. Registers
2. Control Unit
3. ALU
4. Internal data paths
5. External data paths
CPU Function
• CPU must:
– Fetch instructions: the processor reads an instruction from memory.
– Interpret/decode instructions: the instruction is decoded to determine what action is required.
– Fetch data: the execution of an instruction may require reading data from memory or an I/O module
– Process data: the execution of an instruction may require performing some arithmetic or logical operation on data
– Write data: the result of an execution may require writing data to memory or an I/O module.
CPU With Systems Bus
Registers
• CPU must have some working space (temporary storage) - registers
• Number and function vary between processor designs - one of the major design decisions
• Top level of memory hierarchy
• The registers in the processor perform two roles:
1. User-visible registers
2. Control and status registers
User Visible Registers
• General Purpose
• Data
• Address
• Condition Codes
General Purpose Registers (1)
• May be true general purpose (can contain the operand for any opcode).
• May be restricted (e.g., dedicated registers for floating-point and stack operations).
• May be used for data or addressing
• Data registers: used to hold data only, e.g., accumulator (AC)
General Purpose Registers (2)
• Addressing registers: may be general purpose or devoted to a particular addressing mode. They include:
1. Segment pointer (cf. virtual memory)
2. Index register
3. Stack pointer (points to top of stack, cf. implicit addressing)
General Purpose Registers (3)
• Make them general purpose
– Increased flexibility and programmer options
– Increased instruction size & complexity, addressing
• Make them specialized
– Smaller (faster) instructions, but more of them are needed
– Less flexibility, addresses implicit in opcode
How Many GP Registers?
• Typically between 8 and 32
• Fewer registers = more memory references
• More takes up processor real estate
• See also RISC
How big?
• Large enough to hold full address
• Large enough to hold full data types
• But often possible to combine two data registers or two address registers by using more complex addressing (e.g., page and offset)
Control & Status Registers(1)
• Control registers: control the operation of the processor. Generally not visible to the user
1. Program Counter (PC): contains the address of an instruction to be fetched.
2. Instruction Register (IR): contains the instruction most recently fetched.
Control & Status Registers(2)
3. Memory Address Register (MAR) – connects to address bus and contains the address of a location in memory.
4. Memory Buffer Register (MBR) – connects to data bus, feeds other registers. It contains a word of data to be written to memory or the word most recently read.
These registers are used for the movement of data between the processor and memory.
Condition Code Registers – Flags
• Bits set by the processor as the result of operations.
– e.g., result of last operation was zero
• Can be read by programs
– e.g., Jump if zero – simplifies branch taking
• Cannot (usually) be altered by the programmer.
• Usually form part of a control register
Program Status Word (PSW)
• A set of bits
• Condition Codes:
– Sign (of last result): contains the sign bit of the result of the last arithmetic operation.
– Zero (last result): set when the result is 0
– Carry (multiword arithmetic): set if an operation resulted in a carry (addition) into or borrow (subtraction) out of a high-order bit.
Program Status Word (PSW)
– Equal (two latest results): set if logical compare result is equality.
– Overflow: used to indicate arithmetic overflow.
• Interrupts enabled/disabled: used to enable or disable interrupts.
• Supervisor/user mode
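As a rough illustration of how the condition codes are derived, the sketch below adds two 8-bit values and reports the sign, zero, carry and overflow bits. The 8-bit width and the function name are assumptions made for the example; real processors differ in width and in which flags they provide.

```python
def add8_flags(a, b):
    """Add two 8-bit values; return (result, flags) in the style of condition codes."""
    raw = (a & 0xFF) + (b & 0xFF)
    result = raw & 0xFF
    sign_a, sign_b, sign_r = a & 0x80, b & 0x80, result & 0x80
    flags = {
        "sign": bool(sign_r),      # sign bit of the result
        "zero": result == 0,       # result is 0
        "carry": raw > 0xFF,       # carry out of the high-order bit
        # Signed overflow: operands share a sign and the result's sign differs.
        "overflow": sign_a == sign_b and sign_r != sign_a,
    }
    return result, flags


print(add8_flags(0x7F, 0x01))  # largest positive value plus one
print(add8_flags(0xFF, 0x01))  # wraps around to zero
```

The first call overflows in signed arithmetic without producing a carry; the second produces a carry and a zero result without signed overflow, which is why the two flags are kept separate.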
Instruction Cycle
• Two steps:
– Fetch: processor reads instructions from memory one at a time
– Execute
Fetch Sequence
• Address of next instruction is in PC
• Address is copied from PC to MAR and placed on the address bus
• Control unit issues READ command
• Result (data from memory) appears on data bus
• Data from data bus copied into MBR
• PC incremented by 1 (in parallel with data fetch from memory)
• Data (instruction) moved from MBR to IR
• MBR is now free for further data fetches
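The fetch sequence can be traced in a few lines of code. This is a minimal sketch: memory is modelled as a Python list of hypothetical instruction words, the buses are plain local variables, and the steps run one after another even though hardware can increment the PC in parallel with the memory read.

```python
class CPU:
    """Toy model of the four registers involved in an instruction fetch."""

    def __init__(self, memory):
        self.memory = memory  # list of instruction words, indexed by address
        self.pc = 0           # Program Counter
        self.mar = 0          # Memory Address Register
        self.mbr = 0          # Memory Buffer Register
        self.ir = 0           # Instruction Register

    def fetch(self):
        self.mar = self.pc                   # address of next instruction: PC -> MAR
        address_bus = self.mar               # MAR drives the address bus
        data_bus = self.memory[address_bus]  # READ command: memory puts the word on the data bus
        self.mbr = data_bus                  # data bus -> MBR
        self.pc += 1                         # PC incremented (sequential here, parallel in hardware)
        self.ir = self.mbr                   # MBR -> IR; MBR is free again
        return self.ir


memory = [0x1A05, 0x2B06, 0x3C07]  # hypothetical instruction words
cpu = CPU(memory)
first = cpu.fetch()
second = cpu.fetch()
print(hex(first), hex(second), "PC =", cpu.pc)
```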
Data Flow (Fetch Diagram)
Fetch - 4 Registers
• Memory Address Register (MAR) – Connected to address bus – Specifies address for read or write op
• Memory Buffer Register (MBR) – Connected to data bus – Holds data to write or last data read
• Program Counter (PC) – Holds address of next instruction to be fetched
• Instruction Register (IR) – Holds last instruction fetched
Execute Cycle
• May take many forms, depends on instruction being executed
• Processor-memory – data transfer between CPU and main memory
• Processor I/O – Data transfer between CPU and I/O module
• Data processing – Some arithmetic or logical operation on data
• Control – Alteration of sequence of operations – e.g. jump
• Combination of above
Control Unit
• Functions of Control Unit: two basic tasks
1. Sequencing
– Causing the CPU to step through a series of micro-operations based on the program being executed.
2. Execution
– Causing the performance of each micro-op
• This is done using Control Signals
Functional Requirements
• Define basic elements of processor
• Describe micro-operations processor performs
• Determine functions control unit must perform
Control Signals - input
• Clock
– One micro-instruction (or set of parallel micro-instructions) per clock cycle. Sometimes referred to as the processor cycle time or clock cycle time.
• Instruction register
– Op-code for current instruction is used to determine which micro-instructions are performed
• Flags
– State of CPU
– Results of previous operations
• From control bus
– Interrupts
– Acknowledgements
Control Signals - output
• Within CPU: two types
– Cause data movement between registers
– Activate specific functions
• Via control bus: two types
– To memory
– To I/O modules
Model of Control Unit
Block diagram of the control unit
Micro-Operations
• A processor executes a program
• Fetch/execute cycle
• Each cycle has a number of steps
– see pipelining
• Called micro-operations
• Each step does very little
• Atomic operation of CPU
Control unit Implementation
• A wide variety of techniques have been used; most of them fall into one of two categories:
1. Hardwired Implementation
2. Microprogrammed Implementation
• In a hardwired implementation, the control unit is essentially a state machine circuit: its input logic signals are transformed into a set of output logic signals.
Hardwired Implementation (1)
• Control unit inputs are instruction register, clock, flags and control bus signals.
– Each bit means something
• Instruction register
– Op-code causes different control signals for each different instruction
– Unique logic input for each op-code (performed by a decoder)
– Decoder takes encoded input and produces single output
– n binary inputs and 2^n outputs
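The decoder's behaviour is easy to show in code: an n-bit op-code selects exactly one of 2^n output lines. The function name and the 3-bit example op-code are assumptions for illustration.

```python
def decode(opcode, n_bits):
    """n-to-2^n decoder: return 2**n_bits output lines with exactly one set to 1."""
    if not 0 <= opcode < 2 ** n_bits:
        raise ValueError("opcode does not fit in n_bits")
    return [1 if line == opcode else 0 for line in range(2 ** n_bits)]


lines = decode(0b101, 3)
print(lines)  # output line 5 is the only one set
```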
Hardwired Implementation (2)
• Clock
– Issues a repetitive sequence of pulses
– Useful for measuring duration of micro-ops
– The period must be long enough to allow signal propagation through data path and processor circuitry.
– Different control signals at different times within instruction cycle
– Need a counter with different control signals for t1, t2 etc.
Control Unit with Decoded Inputs(Hardwired)
Problems With Hard Wired Designs
• Complex sequencing & micro-operation logic
• Difficult to design and test
• Inflexible design
• Difficult to add new instructions
Micro-programmed Control
• The control unit is implemented using the concept of microprogramming.
• Uses sequences of micro-instructions, known as a micro-program or firmware
Implementation (1)
• All the control unit does is generate a set of control signals for each micro-operation.
• Each control signal is on or off
• Represent each control signal by a bit
• Have a control word for each micro-operation
• Have a sequence of control words for each machine code instruction
• Add an address to specify the next micro-instruction, depending on conditions
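A control word can be sketched as a bit pattern with one bit per control signal. The signal names and the three-step fetch micro-program below are hypothetical, and the next address is unconditional here to keep the example short.

```python
SIGNALS = ["PC_TO_MAR", "MEM_READ", "INC_PC", "MBR_TO_IR"]  # hypothetical signal names
BIT = {name: 1 << i for i, name in enumerate(SIGNALS)}      # one bit per control signal


def control_word(*active):
    """Build a control word with the named signals switched on."""
    word = 0
    for name in active:
        word |= BIT[name]
    return word


def active_signals(word):
    """List the signals that a control word switches on."""
    return [name for name in SIGNALS if word & BIT[name]]


# One (control word, next micro-instruction address) pair per micro-operation.
FETCH_MICROPROGRAM = [
    (control_word("PC_TO_MAR"), 1),
    (control_word("MEM_READ", "INC_PC"), 2),
    (control_word("MBR_TO_IR"), 0),
]

for address, (word, next_address) in enumerate(FETCH_MICROPROGRAM):
    print(address, format(word, "04b"), active_signals(word), "next:", next_address)
```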
Implementation (2)
• Today’s large microprocessors have
– Many instructions and associated register-level hardware
– Many control points to be manipulated
• This results in control memory that
– Contains a large number of words
• corresponding to the number of instructions to be executed
– Has a wide word width
• Due to the large number of control points to be manipulated
Control Unit Function
1. To execute an instruction, the sequencing logic unit issues a read command to the control memory.
2. Word specified in control address register is read into control buffer register
3. Control buffer register contents generate control signals and next address information
4. Sequencing logic unit loads new address into control address register based on next address information from control buffer register and ALU flags
• All this happens during one clock cycle
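The four steps can be simulated with a small loop, one iteration per clock cycle. The control memory contents, signal names and the single zero-flag condition are assumptions made up for this sketch.

```python
# Each control memory word: (control signals, next address, address used if zero flag is set or None).
CONTROL_MEMORY = [
    (["PC_TO_MAR"], 1, None),
    (["MEM_READ", "INC_PC"], 2, None),
    (["MBR_TO_IR"], 3, None),
    (["ALU_OP"], 0, 4),       # go to address 4 when the ALU zero flag is set
    (["LOAD_PC"], 0, None),
]


def run(control_memory, zero_flags, cycles):
    """Simulate `cycles` clock cycles; zero_flags[i] is the ALU zero flag during cycle i."""
    car = 0  # control address register
    trace = []
    for cycle in range(cycles):
        cbr = control_memory[car]            # steps 1-2: word at CAR is read into the CBR
        signals, next_addr, zero_addr = cbr  # step 3: CBR supplies signals and next-address info
        trace.append((car, signals))
        # Step 4: sequencing logic loads the CAR using next-address info and ALU flags.
        if zero_addr is not None and zero_flags[cycle]:
            car = zero_addr
        else:
            car = next_addr
    return trace


for address, signals in run(CONTROL_MEMORY, [False, False, False, True, False], 5):
    print(address, signals)
```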
Control Unit
Micro-program Word Length
• Based on 3 factors
– Maximum number of simultaneous micro-operations supported
– The way control information is represented or encoded
– The way in which the next micro-instruction address is specified
Micro-instruction Types
• Vertical micro-programming
– Each micro-instruction specifies single (or few) micro-operations to be performed
• Horizontal micro-programming
– Each micro-instruction specifies many different micro-operations to be performed in parallel.
Vertical Micro-programming
• Width is narrow
• n control signals encoded into log2(n) bits
• Limited ability to express parallelism
• Considerable encoding of control information requires external memory word decoder to identify the exact control line being manipulated
Horizontal Micro-programming
• Wide memory word
• High degree of parallel operations possible
• Little encoding of control information
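The difference in word width can be estimated with a small calculation. This sketch assumes the two extreme cases: a horizontal word with one bit per control signal, and a fully encoded vertical field that selects exactly one of n signals. Next-address bits are ignored, and the function names are hypothetical.

```python
def horizontal_width(n_signals):
    """One bit per control signal, so any combination can be active at once."""
    return n_signals


def vertical_width(n_signals):
    """Bits needed to encode one of n signals: ceil(log2(n)), using integer arithmetic."""
    return (n_signals - 1).bit_length()


for n in (16, 64, 100):
    print(n, "signals:", horizontal_width(n), "bits horizontal,", vertical_width(n), "bits vertical")
```

The narrow vertical field saves control memory, but it can name only one signal at a time and needs a decoder, which is the limited parallelism noted above.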
Advantages and Disadvantages of Microprogramming
• Simplifies design of control unit
– Cheaper
– Less error-prone
• Slower
Arithmetic & Logic Unit
• Does the calculations (performs arithmetic and logical operations on data).
• Everything else in the computer is there to service this unit
• Data are presented to the ALU in registers, and the results of an operation are stored in registers.
• Sets flags as the result of an operation.
• Handles integers
• May handle floating point (real) numbers
• Control unit provides signals that control the operation of the ALU and data movement.
ALU Inputs and Outputs