CS2422 Assembly Language and System Programming
IA-32 Processor Architecture
Department of Computer Science, National Tsing Hua University
Assembly Language for Intel-Based Computers, 5th Edition
Chapter 2: IA-32 Processor Architecture
(c) Pearson Education, 2006-2007. All rights reserved. You may modify and copy this slide show for your personal use, or for use in the classroom, as long as this copyright statement, the author's name, and the title are not changed.
Slides prepared by the author
Revision date: June 4, 2006
Kip Irvine
Chapter Overview
Goal: Understand IA-32 architecture
Basic Concepts of Computer Organization
  Instruction execution cycle
  Basic computer organization
  Data storage in memory
  How programs run
IA-32 Processor Architecture
IA-32 Memory Management
Components of an IA-32 Microcomputer
Input-Output System
Recall: Computer Model for ASM
[Figure: a CPU containing registers (AX, BX, ...), an ALU (+, -), and a PC, connected to a memory that holds the program (MOV AX, a; ADD AX, b; MOV x, AX) and the data words labeled a, b, and x. The program computes x ← a + b.]
Meanings of the Code (assumed)
Assembly code    Machine code          Meaning
MOV AX, a        01 0000001 1000010    Take the data stored in memory address ‘a’, and move it to register AX
ADD AX, b        11 0000001 1000110    Take the data stored in memory address ‘b’, and add it to register AX
MOV x, AX        01 1000011 0000001    Take the data stored in register AX, and move it to memory address ‘x’
In the first instruction, 01 is the code for MOV, 0000001 is the register address of AX, and 1000010 is the memory address of a. In the second instruction, 11 is the code for ADD.
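Another Computer Model for ASM
Stored program architecture
[Figure: a processor (registers AX and BX, ALU, PC, IR) connected to memory by address and data lines. Memory holds the program MOV AX, a; ADD AX, b; MOV x, AX as the words 01 0000001 1000010, 11 0000001 1000110, 01 1000011 0000001, together with the data words a, b, and x.]
PC: program counter
IR: instruction register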
Step 1: Fetch (MOV AX, a)
[Figure: PC holds 0000111; the instruction word at that address, 01 0000001 1000010, is copied from memory into IR.]
Step 2: Decode (MOV AX, a)
[Figure: the controller, paced by the clock, decodes the instruction held in IR (01 0000001 1000010).]
Step 3: Execute (MOV AX, a)
[Figure: the controller reads the value of a, 00000000 00000001, from memory and places it in AX.]
Step 1: Fetch (ADD AX, b)
[Figure: PC holds 0001000; the instruction word 11 0000001 1000110 is copied from memory into IR. AX still holds 00000000 00000001.]
Step 2: Decode (ADD AX, b)
[Figure: the controller, paced by the clock, decodes the instruction held in IR (11 0000001 1000110). AX holds 00000000 00000001.]
Step 3a: Execute (ADD AX, b)
[Figure: the ALU adds AX (00000000 00000001) and the value of b read from memory (00000000 00000010), producing 00000000 00000011.]
Step 3b: Write Back (ADD AX, b)
[Figure: the ALU result 00000000 00000011 is written back into AX.]
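The fetch, decode, execute, and write-back steps walked through above can be tried out with a small simulator. The following is only a sketch in Python under the slides' assumed encoding (01 = MOV, 11 = ADD, AX = 0000001, and a, b, x at addresses 1000010, 1000110, 1000011). Treating any operand whose top bit is set as a memory address is an assumption made for this illustration, and the names `encode`, `is_memory`, and `run` are hypothetical.

```python
# Toy model of the slides' assumed 16-bit encoding:
#   2-bit opcode | 7-bit destination | 7-bit source
# Assumption (not from the slides): an operand whose top bit is set is a
# memory address; otherwise it is a register number.
MOV, ADD = 0b01, 0b11
AX = 0b0000001
A, B, X = 0b1000010, 0b1000110, 0b1000011
PROGRAM_START = 0b0000111


def encode(opcode, dst, src):
    """Pack one instruction into a 16-bit word."""
    return (opcode << 14) | (dst << 7) | src


def is_memory(operand):
    return bool(operand & 0b1000000)


def run(memory, pc, count):
    """Run `count` instructions starting at address `pc`."""
    regs = {AX: 0}

    def read(operand):
        return memory[operand] if is_memory(operand) else regs[operand]

    def write(operand, value):
        if is_memory(operand):
            memory[operand] = value
        else:
            regs[operand] = value

    for _ in range(count):
        ir = memory[pc]                  # fetch: copy the instruction into IR
        pc += 1                          # PC now points at the next instruction
        opcode = ir >> 14                # decode: split IR into its fields
        dst = (ir >> 7) & 0x7F
        src = ir & 0x7F
        if opcode == MOV:                # execute
            result = read(src)
        elif opcode == ADD:
            result = (read(dst) + read(src)) & 0xFFFF   # ALU, 16-bit result
        else:
            raise ValueError(f"unknown opcode {opcode:02b}")
        write(dst, result)               # write back
    return regs, pc


memory = {
    PROGRAM_START: encode(MOV, AX, A),
    PROGRAM_START + 1: encode(ADD, AX, B),
    PROGRAM_START + 2: encode(MOV, X, AX),
    A: 1,
    B: 2,
    X: 0,
}
regs, pc = run(memory, PROGRAM_START, 3)
print(f"AX = {regs[AX]:016b}, x = {memory[X]:016b}")
```

With a = 1 and b = 2 as in the figures, both AX and x end up holding 00000000 00000011.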
Basic Computer Organization
Clock synchronizes CPU operations
Control unit (CU) coordinates execution sequence
ALU performs arithmetic and bitwise processing
Clock
Operations in a computer are triggered, and thus synchronized, by a clock
Clock tells “when” (no need to ask each other!):
  When to put data on output lines
  When to read data from input lines
Clock cycle measures the time of a single operation
  Must be long enough to allow signal propagation
[Figure: square-wave clock signal alternating between 0 and 1, with one cycle marked.]
Instruction/Data for Operations
Where do the instructions needed for computer operations come from?
  Stored-program architecture: the whole program is stored in main memory, including program instructions (code) and data
  CPU loads the instructions and data from memory for execution
  Don’t worry about the disk for now
Where are the data needed for execution?
  Registers (inside the CPU, discussed later)
  Memory
  Constants encoded in the instructions
Memory
Organized like mailboxes, numbered 0, 1, 2, 3, …, 2^n - 1
  Each box can hold 8 bits (1 byte)
  So it is called byte-addressing
Address of mailboxes:
  16-bit address is enough for up to 64K
  20-bit for 1M
  32-bit for 4G
  Most servers need more than 4G!
  That’s why we need 64-bit CPUs like Alpha (DEC/Compaq/HP) or Merced (Intel)
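The relation between address width and memory size is just a power of two: n address bits can number 2^n mailboxes. A small Python sketch (the function name `addressable_bytes` is made up for this illustration) reproduces the 64K, 1M, and 4G figures:

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


for bits in (16, 20, 32):
    print(f"{bits}-bit address: {addressable_bytes(bits):,} bytes")
```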
Storing Data in Memory
Character string:
  So how are strings like “Hello, World!” stored in memory?
  ASCII code! (or Unicode, etc.)
  Each character is stored as a byte
  Review: how is “1234” stored in memory?
Integer:
  A byte can hold an integer number:
  ‒ between 0 and 255 (unsigned) or
  ‒ between –128 and 127 (2’s complement)
  How to store a bigger number?
  Review: how is 1234 stored in memory?
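The two review questions can be explored with a short Python sketch. It shows the string "1234" as four ASCII bytes, and how one byte pattern reads as either an unsigned or a 2's complement value. The helper name `as_signed_byte` is hypothetical.

```python
# The string "1234": one ASCII byte per character.
text_bytes = "1234".encode("ascii")
print(text_bytes.hex(" "))


def as_signed_byte(value):
    """Interpret an unsigned byte (0..255) as a 2's complement number."""
    return value - 256 if value >= 128 else value


# The same bit pattern, read two ways.
print(0xD2, as_signed_byte(0xD2))

# The integer 1234 needs more bits than one byte can hold.
print((1234).bit_length())
```

The string takes four bytes (31 32 33 34 in hexadecimal), while the integer 1234 needs 11 bits and so does not fit in a single byte.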
Big or Little Endian?
Example: 1234 is stored in 2 bytes.
  1234 = 100 1101 0010 in binary = 04 D2 in hexadecimal
Do you store 04 or D2 first?
  Big Endian: 04 first
  Little Endian: D2 first (Intel’s choice)
  Reason: more consistent for variable length (e.g., 2 bytes, 4 bytes, 8 bytes, etc.)
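The two byte orders can be checked with Python's built-in `int.to_bytes`. This sketch also illustrates the "consistent for variable length" remark: in little-endian order, the 2-byte and 4-byte encodings of 1234 begin with the same bytes.

```python
value = 1234

big = value.to_bytes(2, "big")          # most significant byte first
little = value.to_bytes(2, "little")    # least significant byte first
print(big.hex(" "), "|", little.hex(" "))

# Widening the little-endian value only appends zero bytes at the end,
# so the low-order bytes stay at the same addresses.
little4 = value.to_bytes(4, "little")
print(little4.hex(" "))
```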
Cache Memory
High-speed expensive static RAM both inside and outside the CPU
  Level-1 cache: inside the CPU chip
  Level-2 cache: often outside the CPU chip
Cache hit: when data to be read is already in cache memory
Cache miss: when data to be read is not in cache memory
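Hits and misses can be counted with a toy model. The slides define only the two terms, so the direct-mapped organization, the line count, and the block size used here are assumptions chosen to keep the sketch short. `simulate_direct_mapped` is a hypothetical name.

```python
def simulate_direct_mapped(addresses, num_lines=4, block_size=16):
    """Count cache hits and misses for a sequence of byte addresses.

    Assumed organization: each memory block maps to exactly one cache line
    (block number modulo the number of lines).
    """
    lines = [None] * num_lines      # which block each line currently holds
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_lines
        if lines[index] == block:
            hits += 1               # data already in the cache
        else:
            misses += 1             # fetch the block, replacing the old one
            lines[index] = block
    return hits, misses


print(simulate_direct_mapped([0, 4, 8, 64, 0]))
```

In this run, addresses 4 and 8 hit because they share a block with address 0. Address 64 maps to the same line and evicts that block, so the final access to 0 misses again.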
How Does a Program Run?
Load and Execute Process
OS searches for program’s filename in current directory and then in directory path
If found, OS reads information from directory
OS loads file into memory from disk
OS allocates memory for program information
OS executes a branch to cause CPU to execute the program. A running program is called a process
Process runs by itself. OS tracks execution and responds to requests for resources
When the process ends, its handle is removed and memory is released
How? The OS is only a program!
Multitasking
OS can run multiple programs at the same time
Multiple threads of execution within the same program
Scheduler utility assigns a given amount of CPU time to each running program
Rapid switching of tasks
  Gives illusion that all programs are running at the same time
  Processor must support task switching
What support is needed from hardware?
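The idea of giving each running program a slice of CPU time in turn can be sketched with Python generators standing in for tasks. This is only an illustration of round-robin switching: a real OS preempts tasks with the help of hardware such as a timer interrupt, whereas these toy tasks hand control back voluntarily. The names `task` and `round_robin` are made up for the example.

```python
from collections import deque


def task(name, steps):
    """A toy task that performs `steps` units of work, one per resumption."""
    for i in range(steps):
        yield f"{name}{i}"


def round_robin(tasks, time_slice=2):
    """Run tasks in turn, giving each at most `time_slice` steps per turn."""
    ready = deque(tasks)
    order = []
    while ready:
        current = ready.popleft()
        for _ in range(time_slice):
            try:
                order.append(next(current))
            except StopIteration:
                break                   # task finished: do not requeue it
        else:
            ready.append(current)       # slice used up: back of the queue
    return order


print(round_robin([task("A", 3), task("B", 2)]))
```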
What's Next
General Concepts
IA-32 Processor Architecture
  Modes of operation
  Basic execution environment
  Floating-point unit
  Intel microprocessor history
IA-32 Memory Management
Components of an IA-32 Microcomputer
Input-Output System
Modes of Operation
Protected mode
  native mode (Windows, Linux)
  Programs are given separate memory areas named segments
Real-address mode
  native MS-DOS
System management mode
  power management, system security, diagnostics
Virtual-8086 mode
  hybrid of protected mode
  each program has its own 8086 computer
Basic Execution Environment
Address space:
  Protected mode
    4 GB
    32-bit address
  Real-address and Virtual-8086 modes
    1 MB space
    20-bit address
Basic Execution Environment
Program execution registers: named storage locations inside the CPU, optimized for speed
  32-bit General-Purpose Registers: EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI
  16-bit Segment Registers: CS, SS, DS, ES, FS, GS
  EIP
  EFLAGS
[Figure: the processor model from the earlier slides (register, memory, PC, IR, ALU, clock, controller), with a box labeled ZN added.]
General Purpose Registers
Used for arithmetic and data movement
Addressing:
  AX, BX, CX, DX: 16 bits
  Split into H and L parts, 8 bits each
  Extended into E?X to become 32-bit registers (i.e., EAX, EBX, etc.)
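The way AX, AH, and AL overlay the low bits of EAX can be sketched with bit masks. This Python fragment only models the layout. `split_register` and `write_al` are hypothetical helpers, not processor operations.

```python
def split_register(eax):
    """Return the AX, AH, and AL views of a 32-bit EAX value."""
    ax = eax & 0xFFFF           # low 16 bits
    ah = (ax >> 8) & 0xFF       # high byte of AX
    al = ax & 0xFF              # low byte of AX
    return ax, ah, al


def write_al(eax, value):
    """Replace only the low byte; the other 24 bits of EAX are unchanged."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


ax, ah, al = split_register(0x12345678)
print(f"AX={ax:04X} AH={ah:02X} AL={al:02X}")
print(f"{write_al(0x12345678, 0xFF):08X}")
```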
Index and Base Registers
Some registers have only a 16-bit name for their lower half:
  ESI (SI), EDI (DI), EBP (BP), ESP (SP)
Some Specialized Register Uses
General purpose registers
  EAX: accumulator, automatically used by multiplication and division instructions
  ECX: loop counter
  ESP: stack pointer
  ESI, EDI: index registers (source, destination) for memory transfer, e.g. a[i,j]
  EBP: frame pointer to reference function parameters and local variables on stack
EIP: instruction pointer (i.e. program counter)
Some Specialized Register Uses
Segment registers
  In real-address mode: indicate base addresses of preassigned memory areas named segments
  In protected mode: hold pointers to segment descriptor tables
  CS: code segment
  DS: data segment
  SS: stack segment
  ES, FS, GS: additional segments
EFLAGS
  Status and control flags (single binary bits)
  Control the operation of the CPU or reflect the outcome of some CPU operation
Status Flags (EFLAGS)
Reflect the outcomes of arithmetic and logical operations performed by the CPU
  Carry: unsigned arithmetic out of range
  Overflow: signed arithmetic out of range
  Sign: result is negative
  Zero: result is zero
  Auxiliary Carry: carry from bit 3 to bit 4
  Parity: sum of 1 bits is an even number
[Figure: the processor model from the earlier slides (register, memory, PC, IR, ALU, clock, controller), with a box labeled ZN.]
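The flag definitions listed above can be tried out on an 8-bit addition. This is a Python sketch of the rules, not a model of any particular instruction. `add8_flags` is a hypothetical helper that assumes both inputs are in the range 0 to 255.

```python
def add8_flags(a, b):
    """Add two unsigned bytes and derive the six status flags from the result."""
    raw = a + b
    result = raw & 0xFF
    flags = {
        "CF": raw > 0xFF,                                 # unsigned out of range
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,  # signed out of range
        "SF": (result & 0x80) != 0,                       # result is negative
        "ZF": result == 0,                                # result is zero
        "AF": ((a & 0x0F) + (b & 0x0F)) > 0x0F,           # carry from bit 3 to 4
        "PF": bin(result).count("1") % 2 == 0,            # even number of 1 bits
    }
    return result, flags


print(add8_flags(0x7F, 0x01))   # 127 + 1 overflows the signed range
print(add8_flags(0xFF, 0x01))   # 255 + 1 overflows the unsigned range
```

The two calls show why Carry and Overflow are separate flags: 0x7F + 0x01 sets Overflow but not Carry, while 0xFF + 0x01 sets Carry but not Overflow.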
System Registers
Application programs cannot access system registers
  IDTR (Interrupt Descriptor Table Register)
  GDTR (Global Descriptor Table Register)
  LDTR (Local Descriptor Table Register)
  Task Register
  Debug Registers
  Control registers CR0, CR2, CR3, CR4
  Model-Specific Registers
Floating-Point, MMX, XMM Reg.
Eight 80-bit floating-point data registers
  ST(0), ST(1), . . . , ST(7)
  arranged in a stack
  used for all floating-point arithmetic
Eight 64-bit MMX registers
Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations
Intel Microprocessors
Early microprocessors:
  Intel 8080:
  ‒ 64K addressable RAM, 8-bit registers
  ‒ CP/M operating system
  ‒ S-100 BUS architecture
  ‒ 8-inch floppy disks!
  Intel 8086/8088
  ‒ IBM-PC used 8088
  ‒ 1 MB addressable RAM, 16-bit registers
  ‒ 16-bit data bus (8-bit for 8088)
  ‒ separate floating-point unit (8087)
This is where “real-address mode” comes from!
Intel Microprocessors
The IBM-AT
  Intel 80286
  ‒ 16 MB addressable RAM
  ‒ Protected memory
  ‒ Introduced IDE bus architecture
  ‒ 80287 floating point unit
Intel IA-32 Family
  Intel386: 4 GB addressable RAM, 32-bit registers, paging (virtual memory)
  Intel486: instruction pipelining
  Pentium: superscalar, 32-bit address bus, 64-bit internal data path
Intel Microprocessors
Intel P6 Family
  Pentium Pro: advanced optimization techniques in microcode
  Pentium II: MMX (multimedia) instruction set
  Pentium III: SIMD (streaming extensions) instructions
  Pentium 4 and Xeon: Intel NetBurst micro-architecture, tuned for multimedia
What’s Next
General Concepts of Computer Architecture
IA-32 Processor Architecture
IA-32 Memory Management
  Real-address mode
  Calculating linear addresses
  Protected mode
  Multi-segment model
  Paging
  Understand it from the viewpoint of the processor
Components of an IA-32 Microcomputer
Input-Output System
Real-Address Mode
Programs can address at most 1 MB of RAM, using 20-bit addresses
Application programs can access any area of the 1 MB memory
Single tasking: one program at a time, but CPU can momentarily interrupt that program to process requests (called interrupts) from peripherals
MS-DOS runs in real-address mode
Ancient History
IBM PC XT (Intel 8088/8086) is a so-called 16-bit machine
  Each register has 16 bits
  2^16 = 65536 = 64K
  Want to use more memory (640K, 1M)…
How to hold 20-bit addresses with 16-bit registers in the 8086 processor?
Solution: segmented memory
  All of memory is divided into 64KB units called segments
  16 segments in total
  One 16-bit register for segment value and another for 16-bit offset within the segment
Segmented Memory
Segmented memory addressing: absolute (linear) address is a combination of a 16-bit segment value (in CS, DS, SS, or ES) added to a 16-bit offset
[Figure: linear addresses running along memory, with one segment highlighted; a location is represented as segment value 8000 and offset 0000.]
Calculating Linear Addresses
Given a segment address, multiply it by 16 (add a hexadecimal zero), and add it to the offset
  all done by the processor
Example: convert 08F1:0100 to a linear address
  Adjusted segment value:  0 8 F 1 0
  Add the offset:            0 1 0 0
  Linear address:          0 9 0 1 0
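The calculation above is a shift and an add, which can be checked with a short Python sketch. The function name `linear_address` is made up for this illustration, and both inputs are assumed to be 16-bit values.

```python
def linear_address(segment, offset):
    """Real-address mode: segment * 16 (one hex digit shift) plus offset."""
    return (segment << 4) + offset


print(f"{linear_address(0x08F1, 0x0100):05X}")   # the example from the slide
print(f"{linear_address(0x8000, 0x0000):05X}")
print(f"{linear_address(0x08F0, 0x0110):05X}")   # same byte, different pair
```

The last line shows that different segment:offset pairs can name the same linear address.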
Protected Mode
Designed for multitasking
Each process (running program) is assigned a total of 4GB of addressable RAM
Two parts:
  Segmentation: provides a mechanism of isolating individual code, data, and stack so that multiple programs can run without interfering with one another
  Paging: provides demand-paged virtual memory where sections of a program’s execution environment are moved into physical memory as needed
    Gives segmentation the illusion that it has 4GB of physical memory
Segmentation in Protected Mode
Segment: a logical unit of storage (not the same as the “segment” in real-address mode)
  e.g., code/data/stack of a program, system data structures
  Variable size
  Processor hardware provides protection
  All segments in the system are in the processor’s linear address space (physical space if without paging)
Need to specify: base address, size, type, …
  segment descriptor & descriptor table
  linear address = base address + offset
Flat Segment Model
Use a single global descriptor table (GDT)
All segments (at least 1 code and 1 data) mapped to entire 32-bit address space
Multi-Segment Model
Local descriptor table (LDT) for each program
  One descriptor for each segment
  located in a system segment of LDT type
Segmentation Addressing
Program references a memory location with a logical address: segment selector + offset
  Segment selector: provides an offset into the descriptor table
  CS/DS/SS point into the descriptor table for the code/data/stack segment
Convert Logical to Linear Address
Segment selector points to a segment descriptor, which contains base address of the segment. The 32-bit offset from the logical address is added to the segment’s base address, generating a 32-bit linear address
[Figure: the selector part of a logical address (selector + offset) picks a segment descriptor from the descriptor table, whose base address is held in GDTR/LDTR; the descriptor's base address is added (+) to the offset to give the linear address.]
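The conversion described above can be sketched in Python. Several things here are simplifications made for illustration and are not taken from the slides: the selector is used as a plain list index (a real selector also carries table-indicator and privilege bits), the descriptor keeps only a base and a limit, and an out-of-range offset raises a Python exception where real hardware would raise a protection fault. The names `SegmentDescriptor` and `to_linear` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    base: int    # linear address where the segment starts
    limit: int   # highest valid offset within the segment (simplified)


def to_linear(descriptor_table, selector_index, offset):
    """Look up the descriptor, check the offset, then add base and offset."""
    descriptor = descriptor_table[selector_index]
    if offset > descriptor.limit:
        raise ValueError("offset outside segment")
    return (descriptor.base + offset) & 0xFFFFFFFF


table = [
    SegmentDescriptor(base=0x00000000, limit=0xFFFFFFFF),   # flat segment
    SegmentDescriptor(base=0x00400000, limit=0x00000FFF),   # small segment
]
print(f"{to_linear(table, 1, 0x10):08X}")
```

The first entry mirrors the flat model, where a segment covers the entire 32-bit address space and the linear address equals the offset.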
Paging
Supported directly by the processor
Divides each segment into 4096-byte blocks called pages
Part of running program is in memory, part is on disk
Sum of all programs can be larger than physical memory
Virtual memory manager (VMM): an OS utility that manages loading and unloading of pages
Page fault: issued by processor when a page must be loaded from disk
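With 4096-byte pages, the low 12 bits of a linear address are the offset within a page and the remaining bits are the page number. The Python sketch below uses a single dictionary as the page table, which is a simplification chosen for illustration rather than the processor's actual table layout. `split_address`, `translate`, and `PageFault` are hypothetical names.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised when the requested page is not in physical memory."""


def split_address(linear):
    """Split a linear address into (page number, offset within the page)."""
    return linear // PAGE_SIZE, linear % PAGE_SIZE


def translate(page_table, linear):
    """Map a linear address to a physical one, or signal a page fault."""
    page, offset = split_address(linear)
    if page not in page_table:
        raise PageFault(page)       # the OS would now load the page from disk
    return page_table[page] * PAGE_SIZE + offset


page_table = {0x403: 0x12}          # page 0x403 resides in physical frame 0x12
print(f"{translate(page_table, 0x00403ABC):X}")
```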
What's Next
General Concepts
IA-32 Processor Architecture
IA-32 Memory Management
Components of an IA-32 Microcomputer
  Skipped …
Input-Output System
What's Next
General Concepts
IA-32 Processor Architecture
IA-32 Memory Management
Components of an IA-32 Microcomputer
Input-Output System
  How to access I/O systems?
Different Access Levels of I/O
Call an HLL library function (C++, Java)
  easy to do; abstracted from hardware
  slowest performance
Call an operating system function
  specific to one OS; device-independent
  medium performance
Call a BIOS (basic input-output system) function
  may produce different results on different systems
  knowledge of hardware required
  usually good performance
Communicate directly with the hardware
  May not be allowed by some operating systems
Displaying a String of Characters
When an HLL program displays a string of characters, the following steps take place:
  Calls an HLL library function to write the string to standard output
  Library function (Level 3) calls an OS function, passing a string pointer
  OS function (Level 2) calls a BIOS subroutine, passing ASCII code and color of each character
  BIOS subroutine (Level 1) maps the character to a system font, and sends it to a hardware port attached to the video controller card
  Video controller card (Level 0) generates timed hardware signals to video display to display pixels
Summary
Central Processing Unit (CPU)
Arithmetic Logic Unit (ALU)
Instruction execution cycle
Multitasking
Floating Point Unit (FPU)
Complex Instruction Set
Real mode and Protected mode
Motherboard components
Memory types
Input/Output and access levels