
www.bookspar.com | Website for students | VTU NOTES

Microprocessor

A microprocessor (abbreviated as µP or uP) is an electronic computer central processing unit

(CPU) made from miniaturized transistors and other circuit elements on a single semiconductor

integrated circuit (IC).

Before the advent of microprocessors, electronic CPUs were made from discrete (separate) TTL

integrated circuits; before that, individual transistors; and before that, from vacuum tubes. There

have even been designs for simple computing machines based on mechanical parts such as

gears, shafts, levers, Tinkertoys, etc. Leonardo da Vinci made one such design, although none

were possible to construct using the manufacturing techniques of the time.

The evolution of microprocessors has been known to follow Moore's Law

when it comes to steadily increasing performance over the years. This suggests that computing

power will double every eighteen months, a process that has been generally followed since the

early 1970s, a surprise to everyone involved. From humble beginnings as the drivers for

calculators, the continued increase in power has led to the dominance of microprocessors over

every other form of computer; every system from the largest mainframes to the smallest handheld

computers now uses a microprocessor at its core.

History

The first chips

As with many advances in technology, the microprocessor was an idea whose time had come.

Three projects arguably delivered a complete microprocessor at about the same time, Intel's 4004,

Texas Instruments' TMS 1000, and Garrett AiResearch's Central Air Data Computer.

In 1968 Garrett was invited to produce a digital computer to compete with electromechanical

systems then under development for the main flight control computer in the US Navy's new F-14

Tomcat fighter. The design was complete by 1970, and used a MOS-based chipset as the core

CPU. The design was smaller and much more reliable than the mechanical systems it competed

against, and was used in all of the early Tomcat models. However, the system was considered so

advanced that the Navy refused to allow publication of the design, and continued to refuse until

1997. For this reason the CADC, and the MP944 chipset it used, are little known even today.

TI developed the 4-bit TMS 1000 and stressed pre-programmed embedded applications,

introducing a version called the TMS1802NC on September 17, 1971, which implemented a

calculator on a chip. The Intel chip was the 4-bit 4004, released on November 15, 1971, developed

by Federico Faggin.

TI filed for the patent on the microprocessor. Gary Boone was awarded a patent for the single-chip

microprocessor architecture on September 4, 1973. It may never be known which company

actually had the first working microprocessor running on the lab bench. In both 1971 and 1976,

Intel and TI entered into broad patent cross-licensing agreements, with Intel paying royalties to TI

for the microprocessor patent. A nice history of these events is contained in court documentation

from a legal dispute between Cyrix and Intel, with TI as owner of the microprocessor

patent.

Interestingly, a third party claims to have been awarded a patent which might cover the

"microprocessor". The claim names an inventor pre-dating both TI and Intel and describes

a "microcontroller", which may or may not count as a "microprocessor".

A computer-on-a-chip is a variation of a microprocessor which combines the microprocessor

core (CPU), some memory, and I/O (input/output) lines, all on one chip. The computer-on-a-chip

patent, called the microcomputer patent at the time, was awarded to Gary Boone and Michael J.

Cochran of TI. Aside from this patent the proper meaning of microcomputer is a computer using a

(number of) microprocessor(s) as its CPU(s), while the concept of the patent is somewhat more

similar to a microcontroller.

According to A History of Modern Computing, (MIT Press), pp. 220–21,

Intel entered into a contract with Computer Terminals Corporation, later called Datapoint, of San

Antonio TX, for a chip for a terminal they were designing. Datapoint later decided not to use the

chip, and Intel marketed it as the 8008 in April, 1972. This was the world's first 8-bit

microprocessor. It was the basis for the famous "Mark-8" computer kit advertised in the magazine

Radio-Electronics. The 8008 and its successor, the world-famous 8080, opened up the

microprocessor component marketplace.

Notable 8-bit designs

The 4004 was later followed by the 8008, the world's first 8-bit microprocessor. These processors

are the precursors to the very successful Intel 8080, Zilog Z80, and derivative Intel 8-bit

processors. The competing Motorola 6800 architecture was cloned and improved in the MOS

Technology 6502, rivaling the Z80 in popularity during the 1980s.

Both the Z80 and 6502 concentrated on low overall cost, through a combination of small

packaging, simple computer bus requirements, and the inclusion of circuitry that would normally

have to be provided in a separate chip (for instance, the Z80 included a memory controller). It was

these features that allowed the home computer "revolution" to take off in the early 1980s,

eventually delivering semi-usable machines that sold for US$99.

Motorola trumped the entire 8-bit world by introducing the MC6809, arguably one of the most

powerful, orthogonal, and clean 8-bit microprocessor designs ever fielded – and also one of the

most complex hardwired logic designs that ever made it into production for any microprocessor.

Microcoding replaced hardwired logic at about this point in time for all designs more powerful

than the MC6809 – specifically because the design requirements were getting too complex for

hardwired logic.

Another early 8-bit microprocessor was the Signetics 2650, which enjoyed a brief flurry of interest

due to its innovative and powerful instruction set architecture.

A seminal microprocessor in the world of spaceflight was RCA's RCA 1802 (aka CDP1802, RCA

COSMAC) which was used in NASA's Voyager and Viking spaceprobes of the 1970s, and onboard

the Galileo probe to Jupiter (launched 1989, arrived 1995). The CDP1802 was used because it

could be run at very low power, and because its production process (Silicon on Sapphire)

ensured much better protection against cosmic radiation and electrostatic discharges than that of

any other processor of the era; thus, the 1802 is said to be the first radiation-hardened

microprocessor.

16-bit

The first multi-chip 16-bit microprocessor was the National Semiconductor IMP-16,

introduced in early 1973. An 8-bit version of the chipset was introduced in 1974 as the IMP-8. In 1975,

National introduced the first 16-bit single-chip microprocessor, the PACE, which was later followed

by an NMOS version, the INS8900.

Other early multi-chip 16-bit microprocessors include one used by Digital Equipment Corporation

(DEC) in the LSI-11 OEM board set and the packaged PDP 11/03 minicomputer, and the Fairchild

Semiconductor MicroFlame 9440, both of which were introduced in the 1975 to 1976 timeframe.

Another early single-chip 16-bit microprocessor was TI's TMS 9900, which was also compatible

with their TI 990 line of minicomputers. The 9900 was used in the TI 990/4 minicomputer, the TI-

99/4A home computer, and the TM990 line of OEM microcomputer boards. The chip was packaged

in a large ceramic 64-pin DIP package, while most 8-bit microprocessors such as the

Intel 8080 used the more common, smaller, and less expensive 40-pin DIP. A follow-on chip, the

TMS 9980, was designed to compete with the Intel 8080, had the full TI 990 16-bit instruction set,

used a plastic 40-pin package, moved data 8 bits at a time, but could only address 16 KB. A third

chip, the TMS 9995, was a new design. The family later expanded to include the 99105 and 99110.

Intel followed a different path, having no minicomputers to emulate, and instead "upsized" their

8080 design into the 16-bit Intel 8086, the first member of the x86 family which powers most

modern PC type computers. Intel introduced the 8086 as a cost effective way of porting software

from the 8080 lines, and succeeded in winning much business on that premise. Following up their

8086 and 8088, Intel released the 80186, 80286 and, in 1985, the 32-bit 80386, cementing their PC

market dominance with the processor family's backwards compatibility.

The integrated microprocessor memory management unit (MMU) was developed by Childs et al.

of Intel, and awarded US patent number 4,442,484.

32-bit designs

16-bit designs were in the market only briefly when full 32-bit implementations started to appear.

The world's first single-chip 32-bit microprocessor was the AT&T Bell Labs BELLMAC-32A, with

first samples in 1980, and general production in 1982. After the divestiture of AT&T in 1984, it

was renamed the WE 32000 (WE for Western Electric), and had two follow-on generations, the WE

32100 and WE 32200. These microprocessors were used in the AT&T 3B5 and 3B15

minicomputers; in the 3B2, the world's first desktop supermicrocomputer; in the "Companion",

the world's first 32-bit laptop computer; and in "Alexander", the world's first book-sized

supermicrocomputer, featuring ROM-pack memory cartridges similar to today's gaming consoles.

All these systems ran the original Bell Labs UNIX Operating System, which included the first

Windows-type software called xt-layers.

The most famous of the 32-bit designs is the MC68000, introduced in 1979. The 68K, as it was

widely known, had 32-bit registers but used 16-bit internal data paths, and a 16-bit external data

bus to reduce pin count. Motorola generally described it as a 16-bit processor, though it clearly

has 32-bit architecture. The combination of high speed, large (16 megabyte) memory space and

fairly low costs made it the most popular CPU design of its class. The Apple Lisa and Macintosh

designs made use of the 68000, as did a host of other designs in the mid-1980s, including the

Atari ST and Commodore Amiga.

Intel's first 32-bit microprocessor was the iAPX 432, which was introduced in 1981 but was not a

commercial success. It had an advanced capability-based object-oriented architecture, but poor

performance compared to other competing architectures such as the Motorola 68000.

Motorola's success with the 68000 led to the MC68010, which added virtual memory support. The

MC68020, introduced in 1985, added full 32-bit data and address busses. The 68020 became

hugely popular in the Unix supermicrocomputer market, and many small companies (e.g., Altos,

Charles River Data Systems) produced desktop-size systems. Following this with the MC68030,

which added the MMU into the chip, the 68K family became the processor for everything that

wasn't running DOS. The continued success led to the MC68040, which included a FPU for better

math performance. A 68050 failed to achieve its performance goals and was not released, and

the follow-up MC68060 was released into a market saturated by much faster RISC designs. The

68K family faded from the desktop in the early 1990s.

Other large companies designed the 68020 and follow-ons into embedded equipment. At one

point, there were more 68020s in embedded equipment than there were Intel Pentiums in PCs.

The ColdFire processor cores are

derivatives of the venerable 68020.

During this time (early to mid 1980s), National Semiconductor introduced a very similar 16-bit

pinout, 32-bit internal microprocessor called the NS 16032 (later renamed 32016), the full 32-bit

version named the NS 32032, and a line of 32-bit industrial OEM microcomputers. By the mid-

1980s, Sequent introduced the first symmetric multiprocessor (SMP) server-class computer using

the NS 32032. This was one of the design's few wins, and it disappeared in the late 1980s.

Other designs included the interesting Zilog Z8000, which arrived too late to market to stand a

chance and disappeared quickly.

In the late 1980s, "microprocessor wars" started killing off some of the microprocessors.

Apparently, with only one major design win, Sequent, the NS 32032 just faded out of existence,

and Sequent switched to Intel microprocessors.

64-bit microchips

Though RISC (see below) based designs featured the first crop of 64-bit processors long before

the current mainstream PC microchips from AMD & Intel, they were limited to proprietary OSes.

However, with AMD's introduction of the 64-bit Athlon 64, followed by Intel's own 64-bit

chips, the 64-bit race has truly begun. Both processors are also backward compatible, meaning

they can run 32 bit legacy apps as well as the new 64 bit software. With 64 bit Windows XP and

Linux that runs on 64 bits, the software too is geared to utilise the full power of such processors.

RISC

In the mid-1980s to early-1990s, a crop of new high-performance RISC (reduced instruction set

computer) microprocessors appeared, which were initially used in special purpose machines and

Unix workstations, but have since become almost universal in all roles except the Intel-standard

desktop.

The first commercial design was released by MIPS Technology, the 32-bit R2000 (the R1000 was

not released). The R3000 made the design truly practical, and the R4000 introduced the world's

first 64-bit design. Competing projects would result in the IBM POWER and Sun SPARC systems,

respectively. Soon every major vendor was releasing a RISC design, including the AT&T CRISP,

AMD 29000, Intel i860 and Intel i960, Motorola 88000, DEC Alpha and the HP-PA.

Market forces have "weeded out" many of these designs, leaving the POWER and the derived

PowerPC as the main desktop RISC processor, with the SPARC being used in Sun designs only.

MIPS continues to supply some SGI systems, but is primarily used as an embedded design,

notably in Cisco routers. The rest of the original crop of designs have either disappeared, or are

about to. Other companies have attacked niches in the market, notably ARM, originally intended

for home computer use but since focused on the embedded processor market. Today RISC

designs based on the MIPS, ARM or PowerPC core power the vast majority of computing devices.

Of course, in the IBM-compatible PC world, Intel, AMD, and now VIA of Taiwan all make x86-

compatible microprocessors. In 64-bit computing, the DEC(-Intel) ALPHA, the AMD 64, and the

HP-Intel Itanium are the most popular designs as of late 2004.

x86 or 80x86 is the generic name of a microprocessor architecture first developed and

manufactured by Intel.

The architecture is called x86 because the earliest processors in this family were identified only

by numbers ending in the sequence "86": the 8086, the 80186, the 80286, the 386, and the 486.

Because one cannot trademark numbers, Intel and most of its competitors began to use

trademarkable names such as Pentium for subsequent generations of processors, but the earlier

naming scheme has stuck as a term for the entire family. Intel now refers to x86 as IA-32, an

abbreviation for Intel Architecture, 32-bit.

Intel 8085

The Intel 8085 is an 8-bit microprocessor made by Intel in the mid-1970s. It was binary compatible

with the more-famous Intel 8080 but required less supporting hardware, thus allowing simpler and

less expensive microcomputer systems to be built.

The "5" in the model number came from the fact that the 8085 required only a 5-volt power supply

rather than the +5 V, -5 V and +12 V supplies the 8080 needed. Both processors were sometimes used in

computers running the CP/M operating system, and the 8085 later saw use as a microcontroller

(largely by virtue of its reduced component count). Both designs were later eclipsed by the

compatible but more capable Zilog Z80, which

took over most of the CP/M computer market as well as taking a large share of the booming home

computer market in the early-to-mid-1980s.

The 8085 can address 65,536 (2^16) individual memory locations over its 16-bit address bus, but

transfers only one byte at a time, because it is an eight-bit microprocessor with an eight-bit data bus.

Unlike some other microprocessors of its era, it has a separate address space for up to 256 I/O

ports. It also has a built-in register array whose registers are usually labeled A, B, C, D, E, H and L. The

microprocessor also has three maskable hardware interrupt inputs, found on pins 7, 8

and 9, which are called RST 7.5, RST 6.5, and RST 5.5 respectively. The non-maskable TRAP input,

rather than RST 7.5, is the one usually reserved for emergencies such as power failure.

8-bit CPUs normally use an 8-bit data bus and a 16-bit address bus which means that their

address space is limited to 64 kilobytes; this is not a "natural law", however, and thus there are

exceptions.

The first widely adopted 8-bit microprocessor was the Intel 8080, being used in many hobbyist

computers of the late 1970s and early 1980s, often running CP/M. The Zilog Z80 (compatible

with the 8080) and the Motorola 6800 were also used in similar computers. The Z80 and the MOS

Technology 6502 8-bit CPUs were widely used in home computers and game consoles of the 1970s

and 1980s. Many 8-bit CPUs or microcontrollers are the basis of today's ubiquitous embedded

systems.

There are 2^8 (256) possible combinations of 8 bits.
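The 256 and 64-kilobyte figures above both follow from counting bit patterns. The short Python sketch below shows the arithmetic; the function and variable names are illustrative only and assume the usual 8-bit data bus and 16-bit address bus described in the text.

```python
def distinct_values(bits: int) -> int:
    """Return how many distinct bit patterns fit in `bits` bits (2 to the power `bits`)."""
    return 2 ** bits


# Assumed bus widths for a typical 8-bit CPU, as described in the text.
data_bus_bits = 8
address_bus_bits = 16

values_per_byte = distinct_values(data_bus_bits)           # patterns one byte can hold
addressable_locations = distinct_values(address_bus_bits)  # distinct memory addresses
address_space_kb = addressable_locations // 1024           # one byte per location

print(values_per_byte, addressable_locations, address_space_kb)
```

Widening the address bus by one bit doubles the number of addressable locations, which is why the exceptions mentioned above (8-bit CPUs with wider address buses) can reach more than 64 kilobytes.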

Address space

In computing, an address space defines a context in which an address makes sense.

Two addresses may be numerically the same, but refer to different things, if they belong to

different address spaces.

Some example address spaces include:

Main memory (physical memory)

Virtual memory

I/O port space

IP address

house numbers on street addresses

In general, things in one address space are physically in a different location than things in

another address space. For example, "house number 101 South" on one particular southward

street is completely different from any house number (not just the 101st house) on a different

southward street.
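As a rough illustration of the idea that the same number can mean different things in different address spaces, the Python sketch below models two independent spaces, loosely following the 8085's separate 64 K memory space and 256-port I/O space. The AddressSpace class and its methods are hypothetical names invented for this example.

```python
class AddressSpace:
    """A toy address space: a name, a size, and the values stored at each address."""

    def __init__(self, name: str, size: int):
        self.name = name
        self.size = size
        self.cells = {}  # address -> stored value; unwritten addresses read as 0

    def _check(self, address: int) -> None:
        # An address only makes sense inside the range of its own space.
        if not 0 <= address < self.size:
            raise IndexError(f"{address:#x} is outside the {self.name} space")

    def write(self, address: int, value: int) -> None:
        self._check(address)
        self.cells[address] = value

    def read(self, address: int) -> int:
        self._check(address)
        return self.cells.get(address, 0)


memory = AddressSpace("memory", 65536)   # 16-bit memory addresses
io_ports = AddressSpace("I/O port", 256)  # 8-bit port numbers

# The same numeric address, 0x10, refers to two unrelated things.
memory.write(0x10, 0xAB)
io_ports.write(0x10, 0x01)
print(memory.read(0x10), io_ports.read(0x10))
```

Address 0x10 in the memory space and port 0x10 in the I/O space hold different values, and an address that is valid in one space (such as 0x100) may not exist at all in the other.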

However, sometimes different address spaces overlap (some physical location exists in both

address spaces). When overlapping address spaces are not aligned, translation is necessary.

For example, virtual-to-physical address translation is necessary to translate addresses in the

virtual memory address space to addresses in physical address space -- one physical address,

and one or more numerically different virtual addresses, all refer to the same physical byte of RAM.

Many programmers prefer to use a flat memory model, in which there is no distinction

between code space, data space, and virtual memory -- in other words, numerically identical

pointers refer to exactly the same byte of RAM in all three address spaces.

Unfortunately, many early computers did not support a flat memory model -- in particular,

Harvard architecture machines force program storage to be completely separate from data

storage.

Many modern DSPs (such as the Motorola 56000) have 3 separate storage areas -- program

storage, coefficient storage, and data storage. Some commonly-used instructions fetch from

all three areas simultaneously --

fewer storage areas (even if there were the same or more total bytes of storage) would make

those instructions run slower. Whether three storage areas are merely a type of Harvard architecture, or whether

"Harvard" implies exactly two storage areas, is a question of terminology.

In the Linux kernel, address spaces include:

Kernel memory

User memory, accessed through copy_to_user(), copy_from_user() and similar

functions

I/O memory, accessed through readb(), writel(), memcpy_toio(), etc.

Primary storage

Primary storage is a category of computer storage, often called main memory. Confusingly, the

term primary storage has recently been used in a few contexts to refer to online storage (hard

disk), which is usually classified as secondary storage.

Primary storage is used to store data that is likely to be in active use, so it is usually faster than

long-term secondary storage. Today, many computers have cache memory located in between the

central processing unit and primary storage in order to further increase speed.

A particular location in storage is selected by its physical memory address. That address remains

the same, no matter how the particular value stored there changes.

Over the history of computing, a variety of technologies have been used for primary storage.

Today, we are most familiar with random access memory (RAM) made out of many small

integrated circuits. Some early computers used mercury delay lines, in which a series of acoustic

pulses were sent along a tube filled with mercury. When the pulse reached the end of the tube, the

circuitry detected whether the pulse represented a binary 1 or 0 and caused the oscillator at the

beginning of the line to repeat the pulse. Other early computers stored RAM on high-speed

magnetic drums.

Modern primary storage devices include:

Random access memory (RAM) - includes VRAM, WRAM, NVRAM

Read-only memory (ROM)

Before the use of integrated circuits for memory became widespread, primary storage was

implemented in many different forms:

Williams tube

Delay line memory

Drum memory

Core memory

Twistor memory

Bubble memory

Virtual memory

Virtual memory is a computer design feature that permits software to use more main memory (the

memory which the CPU can read and write to directly) than the computer actually physically

possesses.

Most computers possess four kinds of memory: registers in the CPU; caches both inside and

adjacent to the CPU; physical memory, generally in the form of RAM, which the CPU can read and

write to directly and reasonably quickly; and disk storage, which is much slower but also much

larger. Many applications require access to more information (code as well as data) than can be

stored in physical memory. This is especially true when the operating system is one that wishes

to allow multiple processes/applications to run seemingly in parallel. The obvious response to the

problem of the maximum size of the physical memory being less than that required for all running

programs is for the application to keep some of its information on the disk, and move it back and

forth to physical memory as needed, but there are a number of ways to do this.

One option is for the application software itself to be responsible both for deciding which

information is to be kept where, and also for moving it back and forth. The programmer would do

this by determining which sections of the program (and also its data) were mutually exclusive and

then arranging for loading and unloading the appropriate sections from physical memory, as

needed. The disadvantage of this approach is that each application's programmer must spend

time and effort on designing, implementing, and debugging this mechanism, instead of focusing

on their application; this hampered programmers' efficiency. Also, if any programmer could truly

choose which of their items of data to store in the physical memory at any one time, they could

easily conflict with the decisions made by another programmer, who also wanted to use all the

available physical memory at that point.

The alternative is to use virtual memory, in which a combination of special hardware and

operating system software makes use of both kinds of memory to make it look as if the computer

has a much larger main memory than it actually does. It does this in a way that is invisible to the

rest of the software running on the computer. It usually provides the ability to simulate a main

memory of almost any size (as limited by the size of the addresses being used by the operating

system and CPU; the total size of the virtual memory can be 2^32 bytes for a 32-bit system, or

approximately 4 gigabytes, while newer 64-bit chips and operating systems use 64- or 48-bit

addresses and can index much more virtual memory).
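To make these limits concrete, the sketch below computes the size of a byte-addressed virtual address space for the address widths mentioned above. It is plain arithmetic; the helper name and the binary unit constants are chosen for this example.

```python
GIB = 2 ** 30  # bytes in one gibibyte
TIB = 2 ** 40  # bytes in one tebibyte
EIB = 2 ** 60  # bytes in one exbibyte


def address_space_bytes(address_bits: int) -> int:
    """Size in bytes of a byte-addressed space using `address_bits`-bit addresses."""
    return 2 ** address_bits


size_32 = address_space_bytes(32) // GIB  # 32-bit addresses, in GiB
size_48 = address_space_bytes(48) // TIB  # 48-bit addresses, in TiB
size_64 = address_space_bytes(64) // EIB  # 64-bit addresses, in EiB

print(size_32, "GiB;", size_48, "TiB;", size_64, "EiB")
```

Each additional address bit doubles the space, so moving from 32-bit to 48-bit addresses multiplies the addressable virtual memory by 2^16.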

This makes the job of the application programmer much simpler. No matter how much memory

the application needs, it can act as if it has access to a main memory of that size. The

programmer can also completely ignore the need to manage the moving of data back and forth

between the different kinds of memory.

In technical terms, virtual memory allows software to run in a memory address space whose size

and addressing are not necessarily tied to the computer's physical memory. While conceivably

virtual memory could be implemented solely by operating system software, in practice its

implementation almost universally uses a combination of hardware and operating system

software.

Basic operation

When virtual memory is used and a main memory location is read or written to by the CPU,

hardware within the computer translates the address of the memory location generated by the

software (the virtual memory address) into either:

the address of a real memory location (the physical memory address) which is assigned

within the computer's physical memory to hold that memory item, or

an indication that the desired memory item is not currently resident in main memory (a so-

called virtual memory exception)

In the former case, the memory reference operation is completed, just as if the virtual memory

were not involved. In the latter case, the operating system is invoked to handle the situation,

since the actions needed before the program can continue are usually quite complex.

The effect of this is to swap sections of information between the physical memory and the disk;

the area of the disk which holds the information which is not currently in physical memory is

called the swap file, page file, or swap partition (on some operating systems it is a dedicated

partition of a disk).
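The two outcomes described above can be sketched in a few lines of Python. This is a toy model under simplifying assumptions, not any real MMU or operating system interface: it tracks individual addresses rather than blocks of memory, never runs out of physical memory, and all names (ToyMachine, hardware_translate, os_handle_exception) are hypothetical.

```python
class VirtualMemoryException(Exception):
    """Signals that the requested item is not currently resident in main memory."""


class ToyMachine:
    def __init__(self):
        self.mapping = {}   # virtual address -> physical address (resident items only)
        self.physical = {}  # physical address -> stored value
        self.swap = {}      # virtual address -> value currently held on disk
        self.next_free = 0  # next unused physical address (never exhausted here)

    def hardware_translate(self, vaddr: int) -> int:
        """Hardware step: return the physical address, or raise an exception."""
        if vaddr in self.mapping:
            return self.mapping[vaddr]
        raise VirtualMemoryException(vaddr)

    def os_handle_exception(self, vaddr: int) -> None:
        """OS step: allocate physical memory, bring the item in, update the mapping."""
        paddr = self.next_free
        self.next_free += 1
        self.physical[paddr] = self.swap.pop(vaddr, 0)
        self.mapping[vaddr] = paddr

    def read(self, vaddr: int) -> int:
        try:
            paddr = self.hardware_translate(vaddr)
        except VirtualMemoryException:
            self.os_handle_exception(vaddr)
            paddr = self.hardware_translate(vaddr)  # retry the original reference
        return self.physical[paddr]


machine = ToyMachine()
machine.swap[0x1000] = 42
print(machine.read(0x1000))  # first access goes through the exception path
print(machine.read(0x1000))  # second access is translated directly
```

The program calling read() sees the same value either way; only the path taken inside the machine differs.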

Details

The translation from virtual to physical addresses is implemented by an MMU. This may be either

a module of the CPU, or an auxiliary, closely coupled chip.

The operating system is responsible for deciding which parts of the program's simulated main

memory are kept in physical memory. The operating system also maintains the translation tables

which provide the mappings between virtual and physical addresses, for use by the MMU. Finally,

when a virtual memory exception occurs, the operating system is responsible for allocating an

area of physical memory to hold the missing information, bringing the relevant information in

from the disk, updating the translation tables, and finally resuming execution of the software that

incurred the virtual memory exception.

In most computers, these translation tables are stored in physical memory. Therefore, a virtual

memory reference might actually involve two or more physical memory references: one or more

to retrieve the needed address translation from the page tables, and a final one to actually do the

memory reference.

To minimize the performance penalty of address translation, most modern CPUs include an on-

chip MMU, and maintain a table of recently used virtual-to-physical translations, called a

Translation Lookaside Buffer, or TLB. Addresses with entries in the TLB require no additional

memory references (and therefore time) to translate. However, the TLB can only maintain a fixed

number of mappings between virtual and physical addresses; when the needed translation is not

resident in the TLB, action will have to be taken to load it in.
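A minimal sketch of this behaviour follows. It assumes a single-level translation table held in a Python dict and a least-recently-used policy for choosing which TLB entry to discard; real TLBs vary in both respects, and the class and function names here are hypothetical.

```python
from collections import OrderedDict


class TLB:
    """A fixed-capacity cache of virtual page -> physical frame translations."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # ordered from least to most recently used

    def lookup(self, vpage: int):
        """Return the cached frame, or None when the translation is not resident."""
        if vpage in self.entries:
            self.entries.move_to_end(vpage)
            return self.entries[vpage]
        return None

    def insert(self, vpage: int, frame: int) -> None:
        if vpage in self.entries:
            self.entries.move_to_end(vpage)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # discard the least recently used entry
        self.entries[vpage] = frame


def translate_page(vpage: int, tlb: TLB, page_table: dict, stats: dict) -> int:
    """Translate a virtual page number, counting TLB hits and table reads."""
    frame = tlb.lookup(vpage)
    if frame is not None:
        stats["tlb_hits"] += 1
        return frame
    stats["tlb_misses"] += 1
    stats["table_reads"] += 1  # the extra memory reference a TLB hit avoids
    frame = page_table[vpage]
    tlb.insert(vpage, frame)
    return frame


page_table = {0: 5, 1: 9, 2: 7}
tlb = TLB(capacity=2)
stats = {"tlb_hits": 0, "tlb_misses": 0, "table_reads": 0}
frames = [translate_page(p, tlb, page_table, stats) for p in (0, 0, 1, 2, 0)]
print(frames, stats)
```

With only two entries, the translation for page 0 is pushed out by pages 1 and 2 and has to be fetched from the table again on the last access.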

On some processors, this is performed entirely in hardware; the MMU has to do additional

memory references to load the required translations from the translation tables, but no other

action is needed. In other processors, assistance from the operating system is needed; an

exception is raised, and on this exception, the operating system replaces one of the entries in the

TLB with an entry from the translation table, and the instruction which made the original memory

reference is restarted.

The hardware that supports virtual memory almost always supports memory protection

mechanisms as well. The MMU may have the ability to vary its operation according to the type of

memory reference (for read, write or execution), as well as the privilege mode of the CPU at the

time the memory reference was made. This allows the operating system to protect its own code

and data (such as the translation tables used for virtual memory) from corruption by an erroneous

application program and to protect application programs from each other and (to some extent)

from themselves (e.g. by preventing writes to areas of memory which contain code).

Paging and virtual memory

Virtual memory is usually (but not necessarily) implemented using paging. In paging, the low

order bits of the binary representation of the virtual address are preserved, and used directly as

the low order bits of the actual physical address; the high order bits are treated as a key to one or

more address translation tables, which provide the high order bits of the actual physical address.

For this reason a range of consecutive addresses in the virtual address space whose size is a

power of two will be translated in a corresponding range of consecutive physical addresses. The

memory referenced by such a range is called a page. The page size is typically in the range of 512

to 8192 bytes (with 4K currently being very common), though page sizes of 4 megabytes or larger

may be used for special purposes. (Using the same or a related mechanism, contiguous regions

of virtual memory larger than a page are often mappable to contiguous physical memory for

purposes other than virtualization, such as setting access and caching control bits.)
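The address split described above can be written out directly. The sketch below assumes a 4096-byte page and a single-level page table represented as a Python dict from virtual page number to physical frame number; both are simplifications chosen for illustration.

```python
PAGE_SIZE = 4096                           # assumed page size; must be a power of two
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12 low-order bits for a 4096-byte page
OFFSET_MASK = PAGE_SIZE - 1


def split_address(vaddr: int):
    """Split a virtual address into (virtual page number, offset within the page)."""
    return vaddr >> OFFSET_BITS, vaddr & OFFSET_MASK


def translate(vaddr: int, page_table: dict) -> int:
    """Replace the high-order bits via the table; keep the low-order bits unchanged."""
    vpage, offset = split_address(vaddr)
    frame = page_table[vpage]  # a missing key stands in for an unavailable page
    return (frame << OFFSET_BITS) | offset


page_table = {0x12: 0x7}  # hypothetical mapping: virtual page 0x12 -> frame 0x7
print(hex(translate(0x12345, page_table)))
```

Because the offset passes through untouched, every address inside virtual page 0x12 lands at the same offset inside frame 0x7, which is why a page maps to a contiguous block of physical memory.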

The operating system stores the address translation tables, the mappings from virtual to physical

page numbers, in a data structure known as a page table.

If a page is marked as unavailable (perhaps because it is not present in physical memory, but

instead is in the swap area), then when the CPU tries to reference a memory location in that page, the

MMU responds by raising an exception (commonly called a page fault) with the CPU, which then

jumps to a routine in the operating system. If the page is in the swap area, this routine invokes an

operation called a page swap, to bring in the required page.

The page swap operation involves a series of steps. First it selects a page in memory, for

example, a page that has not been recently accessed and (preferably) has not been modified

since it was last read from disk or the swap area. (See page replacement algorithms for details.) If

the page has been modified, the process writes the modified page to the swap area. The next step

in the process is to read in the information in the needed page (the page corresponding to the

virtual address the original program was trying to reference when the exception occurred) from

the swap file. When the page has been read in, the tables for translating virtual addresses to

physical addresses are updated to reflect the revised contents of the physical memory. Once the

page swap completes, it exits, and the program is restarted and continues on as if nothing had

happened, returning to the point in the program that caused the exception.

It is also possible that a virtual page was marked as unavailable because the page was never

previously allocated. In such cases, a page of physical memory is allocated and filled with zeros,

the page table is modified to describe it, and the program is restarted as above.
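The steps above can be sketched as a small simulation. It is a simplified model, not how any particular operating system implements paging: the victim is chosen by plain least-recently-used order (the preference for unmodified pages is reduced to skipping the write-back when the victim is clean), the swap area is a dict, and the Pager class and its members are hypothetical names.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size


class Pager:
    def __init__(self, num_frames: int):
        self.num_frames = num_frames
        self.resident = OrderedDict()  # virtual page -> contents, least recently used first
        self.dirty = set()             # resident pages modified since they were loaded
        self.swap = {}                 # virtual page -> contents saved in the swap area
        self.faults = 0

    def _page_fault(self, vpage: int) -> None:
        self.faults += 1
        if len(self.resident) >= self.num_frames:
            # Select a victim: the least recently used resident page.
            victim, contents = self.resident.popitem(last=False)
            if victim in self.dirty:
                self.swap[victim] = bytes(contents)  # write a modified page back
                self.dirty.discard(victim)
        if vpage in self.swap:
            self.resident[vpage] = bytearray(self.swap[vpage])  # read from swap
        else:
            self.resident[vpage] = bytearray(PAGE_SIZE)  # never allocated: zero-fill

    def _touch(self, vpage: int) -> bytearray:
        if vpage not in self.resident:
            self._page_fault(vpage)
        self.resident.move_to_end(vpage)  # mark as most recently used
        return self.resident[vpage]

    def read(self, vaddr: int) -> int:
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        return self._touch(vpage)[offset]

    def write(self, vaddr: int, value: int) -> None:
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        self._touch(vpage)[offset] = value
        self.dirty.add(vpage)


pager = Pager(num_frames=2)
pager.write(0, 11)         # page 0: zero-filled, then modified
pager.write(4096, 22)      # page 1: zero-filled, then modified
print(pager.read(8192))    # page 2: evicts page 0, which is written to swap
print(pager.read(0))       # page 0 is read back from swap with its value intact
print(pager.faults)
```

From the program's point of view each read simply returns the right value; the evictions and reloads are visible only in the fault counter.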

Additional details

One additional advantage of virtual memory is that it allows a computer to multiplex its CPU and

memory between multiple programs without the need to perform expensive copying of the

programs' memory images. If the combination of virtual memory system and operating system

supports swapping, then the computer may be able to run simultaneous programs whose total

size exceeds the available physical memory. Since most programs have a small subset (active

set) of pages that they reference over significant periods of their execution, the performance

penalty is less than that which might be expected. If too many programs are run at once, or if a

single program continuously accesses widely scattered memory locations, then page swapping

becomes excessively frequent and overall system performance will become unacceptably slow.

This is often called thrashing (since the disk is being excessively overworked, or thrashed) or a

paging storm. The slowdown arises because accessing the swap medium is at least three orders of

magnitude slower than accessing main memory.

Note that virtual memory is not a requirement for precompilation of software, even if the software

is to be executed on a multiprogramming system. Precompiled software loaded by the operating

system has the opportunity to carry out address relocation at load time. This suffers by

comparison with virtual memory in that a copy of a program relocated at load time cannot be moved to a

different address once it has started execution.

It is possible to avoid the overhead of address relocation using a process called rebasing, which

uses metadata in the executable image header to guarantee to the run-time loader that the image

will only run within a certain virtual address space. This technique is used on the system libraries

on Win32 platforms, for example.

In embedded systems, swapping is typically not supported.

Systems with a large amount of RAM can create a virtual hard disk within the RAM itself. This

does block some of the RAM from being available for other system tasks but it does considerably

speed up access to the swap file itself.

Processor register

In computer architecture, a processor register is a small amount of very fast computer memory

used to speed the execution of computer programs by providing quick access to commonly used

values—typically, the values being in the midst of a calculation at a given point in time.

These registers are the top of the memory hierarchy, and are the fastest way for the system to

manipulate data. Registers are normally measured by the number of bits they can hold, for

example, an "8-bit register" or a "32-bit register". Registers are now usually implemented as a

register file, but they have also been implemented using individual flip-flops, high speed core

memory, thin film memory, and other ways in various machines.

The term is often used to refer only to the group of registers that can be directly indexed for input

or output of an instruction, as defined by the instruction set. More properly, these are called the

"architected registers". For instance, the x86 instruction set defines a set of eight 32-bit registers,

but a CPU that implements the x86 instruction set will contain many more hardware registers than

just these eight.

There are several other classes of registers:

Data registers are used to store integer numbers (see also Floating Point Registers, below).

In some simple/older CPUs, a special data register is the accumulator, used for arithmetic

calculations.

Address registers hold memory addresses and are used to access memory. In some

simple/older CPUs, a special address register is the index register (one or more of these may be

present)

General Purpose registers (GPRs) can store both data and addresses, i.e., they are

combined Data/Address registers.

Floating Point registers (FPRs) are used to store floating point numbers.

Constant registers hold read-only values (e.g., zero, one, pi, ...).

Vector registers hold data for vector processing done by SIMD instructions (Single

Instruction, Multiple Data).

Special Purpose registers store internal CPU data, like the program counter (aka

instruction pointer), stack pointer, and status register (aka processor status word).

In some architectures, model-specific registers (also called machine-specific registers)

store data and settings related to the processor itself. Because their meanings are attached to the

design of a specific processor, they cannot be expected to remain standard between processor

generations.

Memory segment

On the Intel x86 architecture, a memory segment is the portion of memory which may be

addressed by a single index register without changing a 16-bit segment selector. In real mode or

protected mode on the 80286 processor (or V86 mode on the 80386 and later processors), a

segment is 64 kilobytes in size (using 16-bit index registers). In 32-bit protected mode, available in

80386 and subsequent processors, a segment is 4 gigabytes (due to 32-bit index registers).

In 16-bit mode, enabling applications to make use of multiple memory segments (in order to

access more memory than available in any one 64K-segment) was quite complex, but was viewed

as a necessary evil for all but the smallest tools (which could do with less memory). The root of

the problem was that no appropriate address-arithmetic instructions suitable for flat addressing

of the entire memory range were available. Flat addressing is possible by applying multiple

instructions, which however leads to slower programs.

The introduction of 32-bit operating systems and the more comfortable 32-bit flat memory model

had almost eliminated the use of segmented addressing by the end of the

1990s. However, using the flat memory model has resulted in the 4 gigabyte limit not being far

from everyday use. Segmentation allows operating systems to make the limit a per-process

virtual address space issue, utilizing up to a maximum of 64 gigabytes of system memory, but the

reluctance to eventually return to segmentation is often cited as motivation to move towards 64-

bit processors.

Computer bus

In computer architecture, a bus is a subsystem that transfers data or power between computer

components inside a computer or between computers. Unlike a point-to-point connection, a bus

can logically connect several peripherals over the same set of wires.

Early computer buses were literally parallel electrical buses with multiple connections, but the

term is now used for any physical arrangement that provides the same logical functionality as a

parallel electrical bus. Modern computer buses can use both parallel and bit-serial connections,

and can be wired in either a multidrop (electrical parallel) or daisy chain topology, or connected

by switched hubs, as in the case of USB.

Intel 8086

The 8086 is a 16-bit microprocessor chip designed by Intel in 1978, which gave rise to the x86

architecture. Shortly afterwards the Intel 8088 was introduced with an external 8-bit bus, allowing the

use of cheap chipsets. The 8086 was based on the design of the 8080 and 8085 (it was assembly

language source-compatible with the 8080) with a similar register set, but was expanded to 16 bits. The Bus Interface

Unit fed the instruction stream to the Execution Unit through a 6 byte prefetch queue, so fetch

and execution were concurrent – a primitive form of pipelining (8086 instructions varied from 1 to

4 bytes).

It featured four 16-bit general registers, which could also be accessed as eight 8-bit registers, and

four 16-bit index registers (including the stack pointer). The data registers were often used

implicitly by instructions, complicating register allocation for temporary values. It featured 64K 8-

bit I/O (or 32K 16 bit) ports and fixed vectored interrupts. Most instructions could only access one

memory location, so one operand had to be a register. The result was stored in one of the

operands.

There were also four segment registers that could be set from index registers. The segment

registers allowed the CPU to access one megabyte of memory in an odd way. Rather than just

supplying missing bytes, as in most segmented processors, the 8086 shifted the segment register

left 4 bits and added it to the address. As a result segments overlapped, which most people

consider to have been poor design. Although this was largely acceptable (and even useful) for

assembly language, where control of the segments was complete, it caused confusion in

languages which make heavy use of pointers (such as C). It made efficient representation of

pointers difficult, and made it possible to have two pointers with different values pointing to the

same location. Worse, this scheme made expanding the address space to more than one

megabyte difficult. Effectively, it was expanded by changing the addressing scheme in the 80286.

The processor runs at clock speeds between 4.77 (in the original IBM PC) and 10 MHz.

Typical execution times in cycles (estimates):

addition: 3–4 (register), 9+EA–25+EA (memory access)

multiplication: 70–118 (register), 76+EA–143+EA (memory access)

move: 2 (register), 8+EA–14+EA (memory access)

near jump: 11–15, 18+EA (memory access)

far jump: 15, 24+EA (memory access)

EA: time to compute effective address, ranging from 5 to 12 cycles

The 8086 did not contain any floating point instructions, but could be connected to a

mathematical coprocessor to add this capability. The Intel 8087 was the standard version, but

manufacturers like Weitek soon offered higher performance alternatives.

Microcomputers using the 8086

The first commercial microcomputer built on the basis of the 8086 was the Mycron 2000.

The IBM Displaywriter word processing machine also used the 8086. The most influential

microcomputer of all, the IBM PC, used the 8-bit variant, the Intel 8088.

History

The x86 architecture first appeared inside the Intel 8086 CPU in 1978; the 8086 was a development

of the 8080 processor (which itself followed the 4004 and 8008). It was adopted (in the simpler 8088 version)

three years later as the standard CPU of the IBM PC. The ubiquity of the PC platform has resulted

in the x86 becoming one of the most successful CPU architectures ever.

Other companies also manufacture or have manufactured CPUs conforming to the x86

architecture: examples include Cyrix (now owned by VIA Technologies), NEC Corporation, IBM,

IDT, and Transmeta. The most successful of the clone manufacturers has been AMD, whose

Athlon series is a close second to the Pentium series for popularity.

The 8086 was a 16-bit processor; the architecture remained 16-bit until 1985, when the 32-bit

80386 was developed. Subsequent processors represented refinements of the 32-bit architecture,

introducing various extensions, until in 2003 AMD developed a 64-bit extension to the architecture

in the form of the AMD64 standard, introduced with the Opteron processor family, which was also

adopted a few years later (under a different name) in a new generation of Intel Pentiums.

Note that Intel also introduced a separate 64-bit architecture used in its Itanium processors which

it calls IA-64 or more recently IPF (Itanium Processor Family). IA-64 is a completely new system

that bears no resemblance whatsoever to the x86 architecture; it should not be confused with IA-

32, which is essentially synonymous with x86.

Design

The x86 architecture is essentially CISC with variable instruction length. Word sized memory

access is allowed to unaligned memory addresses. Words are stored in the little-endian order.

Backwards compatibility has always been a driving force behind the development of the x86

architecture (the design decisions this has required are often criticised, particularly by

proponents of competing processors, who are frustrated by the continued success of an

architecture widely perceived as quantifiably inferior). Modern x86 processors translate the x86

instruction set to more RISC-like micro-instructions upon which modern micro-architectural

techniques can be applied.

Note that the names for instructions and registers (mnemonics) that appear in this brief review

are the ones specified in Intel documentation and used by Intel and compatible assemblers (e.g. Microsoft's

MASM, Borland's TASM, CAD-UL's as386, etc.). An instruction that is specified in the

Intel syntax by mov al, 30h is equivalent to AT&T-syntax movb $0x30, %al, and both translate to

the two bytes of machine code B0 30 (hexadecimal). You can see that there is no trace left in this

code of either "mov" or "al", which are the original Intel mnemonics. If we wanted, we could write

an assembler that would produce the same machine code from the command "move immediate

byte hexadecimally encoded 30 into low half of the first register". However, the convention is to

stick to Intel's original mnemonics.
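
To make the point about mnemonics concrete, the following sketch assembles the one instruction form discussed above, "move an immediate byte into an 8-bit register". The function name encode_mov_r8_imm8 is hypothetical. The text only gives the bytes B0 30 for AL; the rule that the opcode is B0h plus the register number, and the register numbering itself, come from Intel's published encoding and are not stated in these notes.

```python
# Register numbers for the 8-bit registers in Intel's encoding
# (background knowledge, not given in the notes).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def encode_mov_r8_imm8(register, value):
    """Return the machine code for 'mov <8-bit register>, <immediate byte>'."""
    if not 0 <= value <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    # The opcode byte is B0h plus the register number; the immediate follows.
    return bytes([0xB0 + REG8[register.lower()], value])


print(encode_mov_r8_imm8("al", 0x30).hex())  # the two bytes for: mov al, 30h
```

Any other spelling of the instruction, including the long English sentence suggested above, could be mapped onto the same two bytes by changing only the front end of such an assembler.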

The x86 assembly language is discussed in more detail in the x86 assembly language article.

Real mode

The Intel 8086 and 8088 had 14 16-bit registers. Four of them (AX, BX, CX, DX) were general purpose

(although each had also an additional purpose; for example only CX can be used as a counter

with the loop instruction). Each could be accessed as two separate bytes (thus BX's high byte can

be accessed as BH and low byte as BL). In addition to them, there are 4 segment registers (CS,

DS, SS and ES). They are used to form a memory address. There are 2 pointer registers (SP, which

points to the top of the stack, i.e. the most recently pushed word at its lowest address, and BP, which can be used to point at some other place in the

stack or the memory). There are two index registers (SI and DI) which can be used to point inside

an array. Finally, there are the flag register (containing flags such as carry, overflow, zero and so

on), and the instruction pointer (IP), which points at the next instruction to be executed.
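
The way a 16-bit general register such as BX doubles as two byte registers (BH and BL) can be modelled in a few lines. The class name Reg16 and its attribute names are invented for this sketch; the point is only that the byte views share storage with the 16-bit value.

```python
class Reg16:
    """A 16-bit register whose high and low bytes are separately accessible."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF

    @property
    def high(self):
        # Upper byte, e.g. BH for BX.
        return self.value >> 8

    @high.setter
    def high(self, byte):
        self.value = ((byte & 0xFF) << 8) | (self.value & 0x00FF)

    @property
    def low(self):
        # Lower byte, e.g. BL for BX.
        return self.value & 0xFF

    @low.setter
    def low(self, byte):
        self.value = (self.value & 0xFF00) | (byte & 0xFF)


bx = Reg16(0x1234)
bx.low = 0xFF          # like: mov bl, 0FFh
print(hex(bx.value))   # BX now holds the old high byte and the new low byte
```

Writing one half leaves the other half untouched, which is why the four general registers could also serve as eight 8-bit registers.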

In real mode, memory access is segmented. This is done by shifting the segment address left by 4

bits and adding an offset in order to receive a final 20-bit address. Thus the total address space in

real mode is 2^20 bytes, or 1 MB, quite an impressive figure for 1978. There are two addressing

modes: near and far. In far mode, both the segment and the offset are specified. In near mode,

only the offset is specified, and the segment is taken from the appropriate register. For data the

register is DS, for code it is CS, and for the stack it is SS. For example, if DS is A000h and SI is 5677h,

DS:SI will point at the absolute address DS × 16 + SI = A5677h.

In this scheme, two different segment/offset pairs can point at a single absolute location. Thus, if

DS is A111h and SI is 4567h, DS:SI will point at the same A5677h as above. In addition to this

aliasing, the scheme also makes it impossible to address more than 4 segments at once. Moreover,

CS, DS and SS are vital for the correct functioning of the program, so that only ES can be used to

point somewhere else. This scheme, which was intended as a compatibility measure with the Intel

8085, has caused no end of grief to programmers.
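
The real-mode address calculation and the aliasing it produces can be checked with a short sketch. The function name physical_address is made up for the example, and the final masking to 20 bits models the 8086's 20 address lines (addresses past 1 MB wrap around), a detail not spelled out in the text above.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only the 20 bits that the 8086 can drive onto its address bus.
    return ((segment << 4) + offset) & 0xFFFFF


# The two segment:offset pairs from the text name the same byte of memory.
first = physical_address(0xA000, 0x5677)
second = physical_address(0xA111, 0x4567)
print(hex(first), hex(second), first == second)
```

Because many segment:offset pairs map to one location, two far pointers cannot be compared for equality without first normalising them in this way.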

In addition, the 8086 had 64K of 8-bit (or alternatively 32K of 16-bit) I/O

space, and a 64K (one segment) stack in memory supported by hardware. Only words (2 bytes)

can be pushed to the stack. The stack grows downwards, with SS:SP pointing at the most recently pushed word (its lowest address).

There are 256 interrupts, which can be created by both hardware and software. The interrupts can

cascade, using the stack to store the return address.

16-bit protected mode

The Intel 80286 could run 8086 real mode 16-bit software without any changes. However, it

also supported another mode of operation called protected mode, which expanded addressable

physical memory to 16MB and addressable virtual memory to 1GB. This was done by using the

segment registers only for storing an index to a segment table. There were two such tables, the

GDT and the LDT, each holding up to 8192 segment descriptors, each segment giving access to

up to 64 KB of memory. The segment table provided a 24-bit base address, which could then be

added to the desired offset to create an absolute address. In addition, each segment could be

given one of four privilege levels (called the rings).
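
The 16MB and 1GB figures follow directly from the numbers given above, as this sketch shows. The helper linear_address and the dictionary standing in for a descriptor table are hypothetical simplifications: a real descriptor also carries access rights, and the size check here is an assumption added for the example.

```python
TABLES = 2                      # GDT and LDT
DESCRIPTORS_PER_TABLE = 8192
MAX_SEGMENT_BYTES = 64 * 1024   # 16-bit offsets
BASE_ADDRESS_BITS = 24

# Virtual space: every descriptor in both tables, each up to 64 KB.
virtual_bytes = TABLES * DESCRIPTORS_PER_TABLE * MAX_SEGMENT_BYTES
# Physical space: whatever a 24-bit base address can reach.
physical_bytes = 2 ** BASE_ADDRESS_BITS

print(virtual_bytes // 2**30, "GB virtual,", physical_bytes // 2**20, "MB physical")


def linear_address(descriptor_table, index, offset):
    """Look up a segment by index and add the offset to its 24-bit base."""
    base, size = descriptor_table[index]
    if offset >= size:
        # Assumed behaviour for this sketch: reject offsets outside the segment.
        raise ValueError("offset outside segment")
    return base + offset


# Hypothetical table with one 64 KB segment based at 123456h.
table = {5: (0x123456, MAX_SEGMENT_BYTES)}
print(hex(linear_address(table, 5, 0x0010)))
```

Unlike real mode, the value held in a segment register no longer has any arithmetic relation to the physical address; it only selects an entry in the table.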

Although these additions were an improvement, they were not widely used because a protected

mode operating system could not run existing real mode software as processes. Such capability

only appeared with the virtual 8086 mode of the subsequent 80386 processor.

In the meantime, operating systems like OS/2 tried to ping-pong the processor between protected

and real modes. This was both slow and unsafe, as in real mode a program could easily crash the

computer. OS/2 also defined restrictive programming rules which allowed a Family API or bound

program to run either in real mode or in protected mode. This was however about running

programs originally designed for protected mode, not vice-versa. By design, protected mode

programs did not assume any relation between selector values and physical

addresses. It is sometimes mistakenly believed that problems with running real mode code in 16-

bit protected mode resulted from IBM having chosen to use Intel reserved interrupts for BIOS

calls. It is actually related to such programs using arbitrary selector values and performing

"segment arithmetic" described above on them.

This problem also appeared with Windows 3.0. Ideally, this release would run programs in

16-bit protected mode, whereas previously they had run in real mode. Theoretically, if a

Windows 1.x or 2.x program was written "properly" and avoided segment arithmetic it would run

equally well in both real and protected modes. Windows programs generally avoided segment

arithmetic because Windows implemented a software virtual memory scheme and moved

program code and data in memory when programs were not running, so manipulating absolute

addresses was dangerous; programs were supposed to only keep handles to memory blocks

when not running, and such handles were quite similar to protected-mode selectors already.

Starting an old program while Windows 3.0 was running in protected mode triggered a warning

dialog, suggesting either running Windows in real mode (it could presumably still use expanded

memory, possibly emulated with EMM386 on 80386 machines, so it was not limited to 640KB) or

obtaining an updated version from the vendor. Well-behaved programs could be "blessed" using a

special tool to avoid this dialog. It was not possible to have some GUI programs running in 16-bit

protected mode and other GUI programs running in real mode, probably because this would

require having two separate environments and (on 80286) would be subject to the previously

mentioned ping-ponging of the processor between modes. In version 3.1 real mode disappeared.

32-bit protected mode

The Intel 80386 introduced, perhaps, the greatest leap so far in the x86 architecture. With the

notable exception of the Intel 80386SX, which was 32-bit yet only had 24-bit addressing (and a 16-

bit data bus), it was all 32-bit - all the registers, instructions, I/O space and memory. To work with

the latter, it used a 32-bit extension of Protected Mode. As it was in the 286, segment registers

were used to index inside a segment table that described the division of memory. Unlike the 286,

however, inside each segment one could use 32-bit offsets, which allowed every application to

access up to 4GB without segmentation and even more if segmentation was used. In addition, 32-

bit protected mode supported paging, a mechanism which made it possible to use virtual

memory.

No new general-purpose registers were added. All 16-bit registers except the segment ones were

expanded to 32 bits. Intel represented this by adding "E" to the register mnemonics (thus the

expanded AX became EAX, SI became ESI and so on). Since there was a greater number of

registers, instructions and operands, the machine code format was expanded as well. In order to

provide backwards compatibility, the segments which contain executable code can be marked as

containing either 16 or 32 bit instructions. In addition, special prefixes can be used to include 32-

bit instructions in a 16-bit segment and vice versa.

Paging and segmented memory access were both required in order to support a modern

multitasking operating system. Linux, 386BSD, Windows NT and Windows 95 were all initially

developed for the 386, because it was the first CPU that made it possible to reliably support the

separation of programs' memory space (each into its own address space) and the preemption of

them in the case of necessity (using rings). The basic architecture of the 386 became the basis of

all further development in the x86 series.

The Intel 80387 math co-processor was integrated into the next CPU in the series, the Intel 80486.

The new FPU could be used to make floating point calculations, important for scientific

calculation and graphic design.

MMX and beyond

1996 saw the appearance of the MMX (Matrix Math Extensions, though sometimes incorrectly

referred to as Multi-Media Extensions) technology by Intel. While the new technology has been

advertised widely and vaguely, its essence is very simple: MMX added eight 64-bit SIMD registers,

overlaid onto the FPU stack, to the Intel Pentium CPU design. Unfortunately, these instructions

were not easily mappable to the code generated by ordinary C compilers, and Microsoft, the

dominant compiler vendor, was slow to support them even as intrinsics. MMX also is limited to

integer operations. These technical shortcomings caused MMX to have little impact in its early

existence. Nowadays, MMX is typically used for some 2D video applications.

3DNow!

In 1997 AMD introduced 3DNow!, a set of SIMD floating point instruction enhancements to

MMX (targeting the same MMX registers). While this did not solve the compiler difficulties, the

introduction of this technology coincided with the rise of 3D entertainment applications in the PC

space. 3D video game developers and 3D graphics hardware vendors used 3DNow! to help

enhance their performance on AMD's K6 and Athlon series of processors.

SSE

In 1999 Intel introduced the SSE instruction set which added 8 new 128 bit registers (not

overlaid with other registers). These instructions were analogous to AMD's 3DNow! in that they

primarily added floating point SIMD.

SSE2

In 2001 Intel introduced the SSE2 instruction set which added 1) a complete complement of

integer instructions (analogous to MMX) to the original SSE registers and 2) 64-bit SIMD floating

point instructions to the original SSE registers. The first addition made MMX almost obsolete, and

the second allowed the instructions to be realistically targeted by conventional compilers.

SSE3

Introduced in 2004 along with the Prescott revision of the Pentium 4 processor, SSE3 added

specific memory and thread-handling instructions to boost the performance of Intel's

HyperThreading technology. AMD later licensed the SSE3 instruction set for its latest (E) revision

Athlon 64 processors. The SSE3 instruction set included on the new Athlons is only lacking a

couple of the instructions that Intel designed for HyperThreading, since the Athlon 64 doesn't

support HyperThreading; however SSE3 is still recognized in software as being supported on the

platform.

64-bit

By 2002, the x86 architecture had begun to reach some design limits due to the 32-bit word

length. This makes it more difficult to handle massive information stores larger than 4 GB, such

as those found in databases or video editing.

Intel had originally decided to completely drop x86 compatibility with the 64-bit generation, by

introducing a new architecture called IA-64. IA-64 technology is the basis for its Itanium line of

processors. IA-64 is not software compatible with x86 software natively; it uses various forms of

emulation to run x86 software.

AMD took the initiative of extending the 32-bit x86 (also known as IA-32) to 64 bits. It came up with an

architecture, called AMD64 (it was called x86-64 until being rebranded), and the first products

based on this technology were the Opteron and Athlon 64 family of processors. Due to the

success of the AMD64 line of processors, Intel adopted the AMD64 instruction set and added

some new extensions of their own, rebranding it the EM64T architecture (apparently not wishing

to acknowledge that the instruction set came from its main rival).

This was the first time that a major upgrade of the x86 architecture was initiated and originated by

a manufacturer other than Intel. Perhaps more importantly, it was the first time that Intel actually

accepted technology of this nature from an outside source.

Virtualization

x86 virtualization is difficult because the architecture does not meet the Popek and Goldberg

virtualization requirements. Nevertheless, there are several commercial x86

virtualization products, such as VMware and Microsoft Virtual PC. Intel and AMD have both

announced that future x86 processors will have new enhancements to facilitate more efficient

virtualization. Intel's code names for their virtualization features are "Vanderpool" and

"Silvervale"; AMD uses the code name "Pacifica".

8086/88 Device Specifications

Both are packaged in DIP (Dual In-Line Packages).

8086: 16-bit microprocessor with a 16-bit data bus

8088: 16-bit microprocessor with an 8-bit data bus.

Both are 5V parts:

8086: Draws a maximum supply current of 360mA.

8088: Draws a maximum supply current of 340mA.

80C86/80C88: CMOS version draws 10mA with temp spec -40 to 225degF.

Input/Output current levels:

Yields a 350mV noise immunity for logic 0 (Output max can be as high as 450mV while input max

can be no higher than 800mV).

This limits the loading on the outputs.

8086/88 Pinout

Pin functions:

AD15-AD0

Multiplexed address(ALE=1)/data bus(ALE=0).

A19/S6-A16/S3 (multiplexed)

High order 4 bits of the 20-bit address OR status bits S6-S3.

M/IO

Indicates if address is a Memory or IO address.

RD

When 0, data bus is driven by memory or an I/O device.

WR

Microprocessor is driving data bus to memory or an I/O device. When 0, data bus contains valid data.

ALE (Address latch enable)

When 1, address data bus contains a memory or I/O address.

DT/R (Data Transmit/Receive)

Data bus is transmitting/receiving data.

DEN (Data bus Enable)

Activates external data bus buffers.

S7, S6, S5, S4, S3, S2, S1, S0

S7: Logic 1, S6: Logic 0.

S5: Indicates condition of IF flag bit.

S4-S3: Indicate which segment is accessed during current bus cycle.

S2, S1, S0: Indicate function of current bus cycle (decoded by 8288).

INTR

When 1 and IF=1, microprocessor prepares to service interrupt.

INTA becomes active after current instruction completes.

INTA

Interrupt Acknowledge generated by the microprocessor in response to INTR. Causes the

interrupt vector to be put onto the data bus.

NMI

Non-maskable interrupt. Similar to INTR except IF flag bit is not consulted and interrupt is vector 2.

CLK

Clock input must have a duty cycle of 33% (high for 1/3 and low for 2/3).

VCC/GND

Power supply (5V) and GND (0V).

MN/MX

Select minimum (5V) or maximum mode (0V) of operation.

BHE

Bus High Enable. Enables the most significant data bus bits (D15-D8) during a read or write

operation.

READY

Used to insert wait states (controlled by memory and IO for reads/writes) into the microprocessor.

RESET

Microprocessor resets if this pin is held high for 4 clock periods.

Instruction execution begins at FFFF0H and IF flag is cleared.

TEST

An input that is tested by the WAIT instruction.

Commonly connected to the 8087 coprocessor.

HOLD

Requests a direct memory access (DMA). When 1, microprocessor stops and places address, data

and control bus in high-impedance state.

HLDA (Hold Acknowledge)

Indicates that the microprocessor has entered the hold state.

RQ/GT1 and RQ/GT0

Request/grant pins request/grant direct memory accesses (DMA) during maximum mode

operation.

LOCK

Lock output is used to lock peripherals off the system. Activated by using the LOCK: prefix on

any instruction.

QS1 and QS0

The queue status bits show status of internal instruction queue. Provided for access by the

numeric coprocessor (8087).

8284A Clock Generator

Basic functions:

Clock generation.

RESET synchronization.

READY synchronization.

Peripheral clock signal.

Connection of the 8284 and the 8086.

Clock generation:

Crystal is connected to X1 and X2.

XTAL OSC generates square wave signal at crystal's frequency which feeds:

An inverting buffer (output OSC) which is used to drive the EFI input of other 8284As.

2-to-1 MUX

F/C selects XTAL or EFI external input.

The MUX drives a divide-by-3 counter (15MHz to 5MHz).

This drives:

The READY flipflop (READY synchronization).

A second divide-by-2 counter (2.5MHz clk for peripheral components).

The RESET flipflop.

CLK which drives the 8086 CLK input.
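
The divider chain above amounts to two divisions, sketched here. The function name derive_clocks is hypothetical; the 15MHz input is the example from the list, while the 14.31818MHz crystal in the second call is the value used in the original IBM PC, brought in as outside background to show where its 4.77MHz clock comes from.

```python
def derive_clocks(crystal_hz):
    """Return (CLK, PCLK) in Hz for a given crystal or EFI frequency."""
    clk = crystal_hz / 3    # divide-by-3 counter feeds the 8086 CLK input
    pclk = clk / 2          # second divide-by-2 counter feeds peripherals
    return clk, pclk


clk, pclk = derive_clocks(15_000_000)
print(clk / 1e6, "MHz CLK,", pclk / 1e6, "MHz peripheral clock")

# Assumed background: the IBM PC used a 14.31818 MHz crystal.
pc_clk, _ = derive_clocks(14_318_180)
print(round(pc_clk / 1e6, 2), "MHz")
```

Dividing by three is also a convenient way of producing the 33% duty cycle that the 8086 clock input requires.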

RESET:

Negative edge-triggered flipflop applies the RESET signal to the 8086 on the falling edge.

The 8086 samples the RESET pin on the rising edge.

Correct reset timing requires that the RESET input to the microprocessor becomes a logic 1 NO

LATER than 4 clocks after power up and stays high for at least 50us.

BUS Buffering and Latching

Demultiplexing the Buses:

Computer systems have three buses:

Address

Data

Control

The Address and Data bus are multiplexed (shared) due to pin limitations on the 8086.

The ALE pin controls a set of latches.

All signals MUST be buffered.

Latches buffer A0-A15.

Control and A16-A19 are buffered separately.

Data bus buffers must be bi-directional buffers (BB).

BHE: Selects the high-order memory bank.

BUS Timing

Writing:

Dump address on address bus.

Dump data on data bus.

Issue a write (WR) and set M/IO to 1.

Reading:

Dump address on address bus.

Issue a read (RD) and set M/IO to 1.

Wait for memory access cycle.

Bus Timing:

During T1:

The address is placed on the Address/Data bus.

Control signals M/IO, ALE and DT/R specify memory or I/O, latch the address onto the

address bus and set the direction of data transfer on data bus.

During T2:

8086 issues the RD or WR signal, DEN, and, for a write, the data.

DEN enables the memory or I/O device to receive the data for writes and the 8086 to receive the

data for reads.

During T3:

This cycle is provided to allow memory to access data.

READY is sampled at the end of T2.

If low, T3 becomes a wait state.

Otherwise, the data bus is sampled at the end of T3.

During T4:

All bus signals are deactivated, in preparation for next bus cycle.

Data is sampled for reads, writes occur for writes.

Timing:

Each BUS CYCLE on the 8086 equals four system clocking periods (T states).

The clock rate is 5MHz, therefore one Bus Cycle is 800ns.

The transfer rate is 1.25MHz.

Memory specs (memory access time) must match constraints of system timing.

For example, bus timing for a read operation shows almost 600ns are needed to read data.

However, memory must access faster due to setup times, e.g. Address setup and data setup.

This subtracts off about 150ns.

Therefore, memory must respond within 450ns, minus another 30-40ns guard band for buffers

and decoders.

420ns DRAM required for the 8086.
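
The timing figures above can be reproduced with a few lines of arithmetic. This is a sketch using the round numbers from the list: the constant and function names are made up, and the 600ns, 150ns and 30ns values are the approximate figures quoted in these notes, not datasheet limits.

```python
T_STATES_PER_BUS_CYCLE = 4


def bus_cycle_ns(clock_hz):
    """Length of one bus cycle in nanoseconds (four T states)."""
    t_state_ns = 1e9 / clock_hz
    return T_STATES_PER_BUS_CYCLE * t_state_ns


def transfers_per_second(clock_hz):
    """Maximum number of bus cycles, and so transfers, per second."""
    return clock_hz / T_STATES_PER_BUS_CYCLE


CLOCK_HZ = 5_000_000
print(bus_cycle_ns(CLOCK_HZ), "ns per bus cycle")
print(transfers_per_second(CLOCK_HZ) / 1e6, "million transfers per second")

# Memory access budget, using the approximate values quoted in the notes.
READ_WINDOW_NS = 600    # time the bus timing leaves for a read
SETUP_NS = 150          # address and data setup times
GUARD_BAND_NS = 30      # lower end of the 30-40 ns buffer/decoder allowance
required_access_ns = READ_WINDOW_NS - SETUP_NS - GUARD_BAND_NS
print(required_access_ns, "ns memory needed")
```

Memory that cannot meet this budget has to ask for extra clock periods through the READY input.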

READY:

An input to the 8086 that causes wait states for slower memory and I/O components.

A wait state (TW) is an extra clock period inserted between T2 and T3 to lengthen the bus cycle.

For example, this extends a 460ns memory access time (at 5MHz clock) to 660ns.

Text discusses role of 8284A and timing requirements for the 8086.
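
The effect of wait states on the time available to memory can be worked through as follows. Both function names are hypothetical, and the 460ns default is simply the figure quoted above for a 5MHz clock; each wait state adds one full clock period.

```python
import math


def access_time_with_waits(base_access_ns, clock_hz, wait_states):
    """Time available to memory once wait states have been inserted."""
    clock_period_ns = 1e9 / clock_hz
    return base_access_ns + wait_states * clock_period_ns


def wait_states_needed(memory_access_ns, clock_hz, base_access_ns=460):
    """Smallest number of wait states that lets a slow memory respond in time."""
    clock_period_ns = 1e9 / clock_hz
    shortfall_ns = memory_access_ns - base_access_ns
    return max(0, math.ceil(shortfall_ns / clock_period_ns))


print(access_time_with_waits(460, 5_000_000, 1))   # one wait state at 5 MHz
print(wait_states_needed(700, 5_000_000))          # a 700 ns device
```

A device that is only slightly too slow still costs a whole clock period, because READY can only stretch the bus cycle in units of one T state.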

MIN and MAX Mode

Controlled through the MN/MX pin.

Minimum mode is cheaper since all control signals for memory and I/O are generated by the

microprocessor.

Maximum mode is designed to be used when a coprocessor (8087) exists in the system.

Some of the control signals must be generated externally, due to redefinition of certain control

pins on the 8086.

The following pins are lost when the 8086 operates in Maximum mode .

This requires an external bus controller: The 8288 Bus Controller .

8288 Bus Controller

Separate signals are used for I/O ( IORC and IOWC ) and memory ( MRDC and MWTC ).

Also provided are advanced memory ( AMWC ) and I/O ( AIOWC ) write strobes plus INTA .

MAX Mode 8086 System

Intel 8088

The Intel 8088 is an Intel microprocessor based on the 8086, with 16-bit registers and an 8-bit

external data bus. The processor was used in the original IBM PC.

The 8088 was targeted at economical systems by allowing the use of 8-bit designs. Large bus

width circuit boards were still fairly expensive when it was released. The prefetch queue of the

8088 is 4 bytes, as opposed to the 8086's 6 bytes. The descendants of the 8088 include the 80188,

80288 (obsolete), and 80388 microcontrollers which are still in use today.

The most influential microcomputer to use the 8088 was, by far, the IBM PC. The original PC

processor ran at a clock frequency of 4.77 MHz.

Apparently IBM's own engineers wanted to use the Motorola 68000, and it was used later in the

forgotten IBM Instruments 9000 Laboratory Computer, but IBM already had rights to manufacture

the 8086 family, in exchange for giving Intel the rights to its bubble memory designs. A factor for

using the 8-bit Intel 8088 version was that it could use existing Intel 8085-type components, and

allowed the computer to be based on a modified 8085 design. 68000 components were not widely

available at the time, though it could use Motorola 6800 components to an extent. Intel bubble

memory was on the market for a while, but fierce competition from Japanese corporations, which

could undercut Intel on cost, led Intel to leave the memory market and focus on

processors.

A compatible replacement chip, the V20, was produced by NEC for an approximate 20 percent

improvement in computing power.

Assembly language

Assembly language or simply assembly is a human-readable notation for the machine language

that a specific computer architecture uses. Machine language, a pattern of bits encoding machine

operations, is made readable by replacing the raw values with symbols called mnemonics.

For example, a computer with the appropriate processor will understand this x86/IA-32 machine

instruction:

10110000 01100001

For programmers, however, it is easier to remember the equivalent assembly language

representation:

mov al, 0x61

which means to move the hexadecimal value 61 (97 decimal) into the processor register with the

name "al". The mnemonic "mov" is short for "move", and a comma-separated list of arguments or

parameters follows it; this is a typical assembly language statement.
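To make the link between the bit pattern and the mnemonic concrete, here is a minimal sketch of a decoder for this one instruction form. It relies on the x86 encoding in which the upper five bits 10110 mean "move an 8-bit immediate into a register" and the lower three bits select the register; the type and function names are made up for this example.

```c
#include <stdint.h>

/* 8-bit register names in x86 encoding order (000 = al ... 111 = bh). */
static const char *const REG8[8] = {
    "al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"
};

typedef struct {
    int valid;        /* 1 if the opcode byte is "MOV r8, imm8" */
    const char *reg;  /* destination register name */
    uint8_t imm;      /* immediate value */
} MovImm8;

/* Decode the two bytes of a "MOV r8, imm8" instruction. */
static MovImm8 decode_mov_r8_imm8(uint8_t opcode, uint8_t imm)
{
    MovImm8 result = {0, "", 0};

    if ((opcode & 0xF8) == 0xB0) {      /* upper five bits are 10110 */
        result.valid = 1;
        result.reg = REG8[opcode & 0x07];
        result.imm = imm;
    }
    return result;
}
```

Feeding it the bytes 10110000 (0xB0) and 01100001 (0x61) yields register "al" and the immediate 0x61, matching the assembly statement above.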

Unlike in high-level languages, there is usually a 1-to-1 correspondence between simple assembly

statements and machine language instructions. Transforming assembly into machine language is

accomplished by an assembler, and the reverse by a disassembler.

Every computer architecture has its own machine language, and therefore its own assembly

language. Computers differ by the number and type of operations that they support. They may

also have different sizes and numbers of registers, and different representations of data types in

storage. While all general-purpose computers are able to carry out essentially the same

functionality, the way they do it differs, and the corresponding assembly language must reflect

these differences.

In addition, multiple sets of mnemonics or assembly-language syntax may exist for a single

instruction set. In these cases, the most popular one is usually that used by the manufacturer in

their documentation.

Machine instructions

Instructions in assembly language are generally very simple, unlike in a high-level language. Any

instruction that references memory (for data or as a jump target) will also have an addressing

mode to determine how to calculate the required memory address. More complex operations must

be built up out of these simple operations. Some operations available in most instruction sets

include:

moving

set a register (a temporary "scratchpad" location in the CPU itself) to a fixed constant

move data from a memory location to a register, or vice versa. This is done to obtain

the data to perform a computation on it later, or to store the result of a computation.

read and write data from hardware devices

computing

add, subtract, multiply, or divide the values of two registers, placing the result in a

register

perform bitwise operations, taking the conjunction/disjunction (and/or) of

corresponding bits in a pair of registers, or the negation (not) of each bit in a register

compare two values in registers (for example, to see if one is less, or if they are equal)

affecting program flow

jump to another location in the program and execute instructions there

jump to another location if a certain condition holds

jump to another location, but save the location of the next instruction as a point to

return to (a call)

Specific instruction sets will often have single, or a few instructions for common operations

which would otherwise take many instructions. Examples:

saving many registers on the stack at once

moving large blocks of memory

complex and/or floating-point arithmetic (sine, cosine, square root, etc.)

applying a simple operation (for example, addition) to a vector of values

Addressing mode

In computer programming, addressing modes are primarily of interest to compiler writers and to

those (few nowadays) who use assembly language. Some computer science students may also

need to learn about addressing modes as part of their studies. Those involved with CPU design or

computer architecture should already know this and a lot more.

Addressing modes form part of the instruction set architecture for some particular type of CPU.

Some machine languages will need to refer to (addresses of) operands in memory. An addressing

mode specifies how to calculate the effective memory address of an operand by using

information held in registers and/or

constants contained within a machine instruction.
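As an illustration, the sketch below computes an effective address in the style of the 8086: a base register, an index register and a constant displacement are added with 16-bit wrap-around, and the result is then combined with a segment register to form a 20-bit physical address. The 8086 specifics (segment shifted left by four bits, 20-bit address space) are not stated in this section and are brought in here as an assumption for the example; the function names are illustrative.

```c
#include <stdint.h>

/* Effective address = base + index + displacement, wrapping at 16 bits. */
static uint16_t effective_address(uint16_t base, uint16_t index, uint16_t disp)
{
    return (uint16_t)(base + index + disp);
}

/* 8086 physical address = segment * 16 + offset, truncated to 20 bits. */
static uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For example, a base of 0x1000, an index of 0x0020 and a displacement of 0x0004 give the effective address 0x1024; with a segment of 0x1234 and offset 0x0010 the physical address is 0x12350.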

Opcode

Microprocessors perform operations using binary bits (on/off, 1 or 0).

Eight bits make a byte (four bits make a nibble), and two bytes make a word.

As an example, let's design a crude 4-bit microprocessor.

All registers, the ALU, the counter and the address path are 4 bits wide,

and every instruction must fit in a 3-bit op-code.

Op-code   Mnemonic   Operation
000       ADD        add A to B and store in B
001       MOV        move A to B
010       JMP        jump to the value in A
011       XORA       XOR A with the next op-code and store in B
100       CLRA       clear A
101       RETURN     return to pointer
110       COUNTER    counter value
111       END        end program

When the op-code values are active at the decoder's logic inputs, the desired operations are

performed.
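A software model of the decoder for this crude processor might look like the sketch below. Only the op-codes whose meaning is unambiguous in the table (ADD, MOV, CLRA and END) are modelled; the others stop execution. The register structure and the run() function are hypothetical names introduced for this example.

```c
#include <stdint.h>

typedef struct {
    uint8_t a;  /* 4-bit register A */
    uint8_t b;  /* 4-bit register B */
} Regs;

/* Execute op-codes until END or an op-code not modelled here.
   Returns the number of instructions executed. */
static int run(const uint8_t *program, int length, Regs *r)
{
    int executed = 0;
    int pc;

    for (pc = 0; pc < length; pc++) {
        switch (program[pc] & 0x07) {         /* op-codes are 3 bits wide */
        case 0:                               /* 000 ADD  */
            r->b = (uint8_t)((r->a + r->b) & 0x0F);
            break;
        case 1:                               /* 001 MOV  */
            r->b = r->a;
            break;
        case 4:                               /* 100 CLRA */
            r->a = 0;
            break;
        default:                              /* 111 END, or not modelled */
            return executed;
        }
        executed++;
    }
    return executed;
}
```

Because the data path is only 4 bits wide, results are masked with 0x0F: adding 9 and 8 leaves 1 in B.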

A microprocessor can perform a number of operations, each of which is assigned a numeric

code called an opcode. To assist in the use of these numeric codes, mnemonics are used as

textual abbreviations. It's much easier to remember ADD than 05, for example.

Opcodes operate on registers, values in memory, values stored on the stack, I/O ports, the bus,

etc. They are used to perform arithmetic operations and move and change values. Operands are

the things that opcodes operate on.

Mnemonic

A mnemonic is a memory aid. Mnemonics are

often verbal, are sometimes in verse form, and are often used to remember lists. Mnemonics rely

not only on repetition to remember facts, but also on associations between easy-to-remember

constructs and lists of data, based on the principle that the human mind much more easily

remembers data attached to spatial, personal or otherwise meaningful information than that

occurring in meaningless sequences. The word mnemonic shares etymology with Mnemosyne,

the name of the titan who personified Memory in Greek mythology.

Techniques

A mnemonic technique is one of many memory aids that is used to create associations among

facts that make it easier to remember these facts. Popular mnemonic techniques include mind

mapping and peg lists. These techniques make use of the power of the visual cortex to simplify

the complexity of memories. Thus simpler memories can be stored more efficiently. For example,

a number can be remembered as a picture. This makes it easier to retrieve it from memory.

Mnemonic techniques should be used in conjunction with active recall to actually be beneficial.

For example, it is not enough to look at a mind map; one needs to actively reconstruct it in one's

memory.

Instruction set

An instruction set, or instruction set architecture (ISA), describes the aspects of a computer

architecture visible to a programmer, including the native datatypes, instructions, registers,

addressing modes, memory architecture, interrupt and exception handling, and external I/O (if any).

An ISA is a specification of the set of all binary codes (opcodes) that are the native form of

commands implemented by a particular CPU design. The set of opcodes for a particular ISA is

also known as the machine language for the ISA.

"Instruction set architecture" is sometimes used to distinguish this set of characteristics from the

microarchitecture, which is the set of processor design techniques used to implement the

instruction set (including microcode, pipelining, cache systems, and so forth). Computers with

different microarchitectures can share a common instruction set. For example, the Intel Pentium

and the AMD Athlon implement nearly identical versions of the x86 instruction set, but have

radically different internal designs. This concept can be extended to unique ISAs like TIMI present

in the IBM System/38 and IBM AS/400. TIMI is an ISA that is implemented as low-level software

and functionally resembles what is now referred to as a virtual machine. It was designed to

increase the longevity of the platform and applications written for it, allowing the entire platform

to be moved to very different hardware without having to modify any software except that which

comprises TIMI itself. This allowed IBM to move the AS/400 platform from an older CISC

architecture to the newer POWER architecture without having to rewrite any parts of the OS or

software associated with it.

When designing microarchitectures, engineers use Register Transfer Language (RTL) to define

the operation of each instruction of an ISA.

An ISA can also be emulated in software by an interpreter. Due to the additional translation needed

for the emulation, this is usually slower than directly running programs on the hardware

implementing that ISA. Today, it is common practice for vendors of new ISAs or

microarchitectures to make software emulators available to software developers before the

hardware implementation is ready.

Assembly language directives

In addition to codes for machine instructions, assembly languages have extra directives for

assembling blocks of data, and assigning address locations for instructions or code.

They usually have a simple symbolic capability for defining values as symbolic expressions

which are evaluated at assembly time, making it possible to write code that is easier to read and

understand.

Like most computer languages, comments can be added to the source code which are ignored by

the assembler.

They also usually have an embedded macro language to make it easier to generate complex

pieces of code or data.

In practice, the absence of comments and the replacement of symbols with actual numbers

makes the human interpretation of disassembled code considerably more difficult than the

original source would be.

Usage of assembly language

There is some debate over the usefulness of assembly language. It is often said that modern

compilers can render higher-level languages into codes that run as fast as hand-written

assembly, but counter-examples can be made, and there is no clear consensus on this topic. It is

reasonably certain that, given the increase in complexity of modern processors, effective hand-

optimization is increasingly difficult and requires a great deal of knowledge.

However, some discrete calculations can still be rendered into faster running code with assembly,

and some low-level programming is simply easier to do with assembly. Some system-dependent

tasks performed by operating systems simply cannot be expressed in high-level languages. In

particular, assembly is often used in writing the low level interaction between the operating

system and the hardware, for instance in device drivers. Many compilers also render high-level

languages into assembly first before fully compiling, allowing the assembly code to be viewed for

debugging and optimization purposes.

It's also common, especially in relatively low-level languages such as C, to be able to embed

assembly language into the source code with special syntax. Programs using such facilities, such

as the Linux kernel, often construct abstractions where different assembly is used on each

platform the program supports, but it is called by portable code through a uniform interface.

Many embedded systems are also programmed in assembly to obtain the absolute maximum

functionality out of what is often very limited computational resources, though this is gradually

changing in some areas as more powerful chips become available for the same minimal cost.

Another common area of assembly language use is in the system BIOS of a computer. This low-

level code is used to initialize and test the system hardware prior to booting the OS and is stored

in ROM. Once a certain level of hardware initialization has taken place, code written in higher level

languages can be used, but almost always the code running immediately after power is applied is

written in assembly language. This is usually due to the fact that system RAM may not yet be

initialized at power-up and assembly language can execute without explicit use of memory,

especially in the form of a stack.

Assembly language is also valuable in reverse engineering, since many programs are distributed

only in machine code form, and machine code is usually easy to translate into assembly language

and carefully examine in this form, but very difficult to translate into a higher-level language.

Tools such as the Interactive Disassembler make extensive use of disassembly for such a

purpose.

Interrupt

In computer science, an interrupt is a signal from a device which typically results in a context

switch: that is, the processor sets aside what it's doing and does something else.

Digital computers usually provide a way to start software routines in response to asynchronous

electronic events. These events are signaled to the processor via interrupt requests (IRQ). The

processor and interrupt code make a context switch into a specifically written piece of software to

handle the interrupt. This software is called the interrupt service routine, or interrupt handler. The

addresses of these handlers are termed interrupt vectors and are generally stored in a table in

RAM, allowing them to be modified if required.

Interrupts were originated to avoid wasting the computer's valuable time in software loops (called

polling loops) waiting for electronic events. Instead, the computer was able to do other useful

work while the event was pending. The interrupt would signal the computer when the event

occurred, allowing efficient accommodation for slow mechanical devices.

Interrupts allow modern computers to respond promptly to electronic events, while other work is

being performed. Computer architectures also provide instructions to permit processes to initiate

software interrupts or traps. These can be used, for instance, to implement co-operative

multitasking.

A well-designed interrupt mechanism arranges the design of the computer bus, software and

interrupting device so that if some single part of the interrupt sequence fails, the interrupt restarts

and runs to completion. Usually there is an electronic request, an electronic response, and a

software operation to turn off the device's interrupt, to prevent another request.

Interrupt Types

Typical interrupt types include:

timer interrupts

disk interrupts

power-off interrupts

Other interrupts exist to transfer data bytes using UARTs, or Ethernet, sense key-presses, control

motors, or anything else the equipment must do.

A classic timer interrupt just interrupts periodically from a counter or the power-line. The software

(usually part of an operating system) counts the interrupts to keep time. The timer interrupt may

also be used to reschedule the priorities of running processes. Counters are popular, but some

older computers used the power line because power companies control the power-line frequency

with an atomic clock.

A disk interrupt signals the completion of a data transfer from or to the disk peripheral. A process

waiting to read or write a file starts up again.

A power-off interrupt predicts or requests a loss of power. It allows the computer equipment to

perform an orderly shutdown.

Interrupts are also used in typeahead features for buffering events like keystrokes.

Interrupt routines generally have a short execution time. Most interrupt routines do not allow

themselves to be interrupted, because they store saved context on a stack, and if interrupted

many times, the stack could overflow. An interrupt routine frequently needs to be able to respond

to a further interrupt from the same source. If the interrupt routine has significant work to do in

response to an interrupt, and it is not critical that the work be performed immediately, then often

the routine will do nothing but schedule the work for some later time and return as soon as

possible. Some processors support a hierarchy of interrupt priorities, allowing higher-priority

interrupts to occur while lower-priority interrupts are being processed.

Processors also often have a mechanism referred to as interrupt disable which allows software to

prevent interrupts from interfering with communication between interrupt-code and non-interrupt

code. See mutual exclusion.

Typically, the user can configure the machine using hardware registers so that different types of

interrupts are enabled or disabled, depending on what the user wants. The interrupt signals are

ANDed with a mask, thus allowing only desired interrupts to occur. Some interrupts cannot be

disabled - these are referred to as non-maskable interrupts.
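The masking step can be sketched in C as a single AND. In this illustrative model a set bit in the enable mask lets the corresponding request through, and a non-maskable request bypasses the mask altogether; the names and the 8-source width are assumptions made for the example.

```c
#include <stdint.h>

/* Bit n of 'pending' is set when source n is requesting service.
   Bit n of 'enable_mask' is set when source n is allowed through. */
static uint8_t deliverable(uint8_t pending, uint8_t enable_mask)
{
    return (uint8_t)(pending & enable_mask);
}

/* A non-maskable interrupt is taken regardless of the mask. */
static int interrupt_taken(uint8_t pending, uint8_t enable_mask,
                           int nmi_pending)
{
    return nmi_pending || deliverable(pending, enable_mask) != 0;
}
```

With sources 1 and 3 pending (0x0A) and only source 1 enabled (0x02), only source 1 is delivered.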

Interrupt vector

When a processor receives an interrupt, the normal flow of whatever program it is running stops

and control is passed to another program (or a different part of the same program). A more low-

level description is to say that the CPU stops what it was doing, stores its status somewhere and

jumps to another area of memory and starts running whatever code is there.

The destination to which the CPU jumps for a given interrupt is termed the interrupt vector.

Generally, most computer system designs will incorporate a list of such vectors; this is termed

the interrupt vector table or dispatch table.
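In C, a vector table can be modelled as an array of function pointers indexed by the interrupt number. The following is a simplified software sketch, not how any particular CPU stores its vectors; the table size of 256 matches the 8086 family, and the handler and helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define VECTOR_COUNT 256

typedef void (*Handler)(void);

static Handler vector_table[VECTOR_COUNT];  /* all entries start out NULL */
static int timer_ticks;

/* Example handler: counts timer interrupts. */
static void timer_handler(void)
{
    timer_ticks++;
}

/* Install a handler for interrupt number n. */
static void set_vector(unsigned n, Handler h)
{
    assert(n < VECTOR_COUNT);
    vector_table[n] = h;
}

/* Look up the vector and jump to it.
   Returns 1 if a handler ran, 0 if the vector was empty. */
static int dispatch(unsigned n)
{
    assert(n < VECTOR_COUNT);
    if (vector_table[n] == NULL)
        return 0;
    vector_table[n]();
    return 1;
}
```

After set_vector(8, timer_handler), each dispatch(8) increments timer_ticks, while dispatching a vector that was never set does nothing.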

Interrupt handler

An Interrupt Handler is the modern progression of an interrupt service routine, a routine whose

execution is triggered by an interrupt.

In modern systems Interrupt Handlers are split into two parts: the First-Level Interrupt Handler

(FLIH) and the Second-Level Interrupt Handlers (SLIH).

The FLIH operates in the same way as the old interrupt routines did. In response to an interrupt

there is a context switch and the code for the interrupt is loaded and executed. The job of the

FLIH, however, is not to process the interrupt, but to schedule the execution of the SLIH, while

recording any critical information which is only available at the time of the interrupt.

The SLIH sits on the run queue of the operating system until it can be executed to perform the

processing for the interrupt when processor time is available.

It is worth noting that in many systems the FLIH and SLIH are referred to as upper halves and

lower halves, or a derivation of those names.
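The split between the two levels can be sketched with a small ring buffer: the first-level handler only records the data that is available at interrupt time, and the second-level handler drains the buffer later. This is a single-threaded illustration with hypothetical names; a real kernel would also need to protect the queue against concurrent access.

```c
#define QUEUE_SIZE 16

static unsigned char queue[QUEUE_SIZE];
static unsigned head;           /* next slot to read  */
static unsigned tail;           /* next slot to write */
static unsigned processed_sum;  /* stand-in for the "real" processing */

/* First-level handler: save the critical byte and return quickly.
   Returns 0 if the queue is full and the byte had to be dropped. */
static int flih(unsigned char data_byte)
{
    unsigned next = (tail + 1) % QUEUE_SIZE;

    if (next == head)
        return 0;
    queue[tail] = data_byte;
    tail = next;
    return 1;
}

/* Second-level handler: runs when processor time is available and
   processes everything queued so far. Returns the number of items handled. */
static int slih(void)
{
    int handled = 0;

    while (head != tail) {
        processed_sum += queue[head];
        head = (head + 1) % QUEUE_SIZE;
        handled++;
    }
    return handled;
}
```

Two interrupts delivering the bytes 3 and 4 are queued by flih(); a later call to slih() handles both and leaves processed_sum at 7.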

Non-Maskable interrupt

A non-maskable interrupt (or NMI) is a special type of interrupt used in most types of

microcomputer, for example the IBM PC and Apple II.

An NMI causes the CPU to stop what it was doing, change the program counter to point to a

particular address and continue executing code from that location. Programmers are unable to

program the CPU to ignore these interrupts, hence the term "non-maskable".

In practice, NMIs are particularly useful for two reasons.

One is for debugging faulty code, where it can be instantly suspended at any point and control

transferred to a special monitor program, from which the developer can inspect the machine's

memory and examine the internal state of the program as it rests in "suspended animation". The

Apple Macintosh's "programmers' button" worked in this way, as do certain key combinations on

SUN workstations.

A second is for leisure users and gamers. Devices which added a button to generate an NMI, such

as Romantic Robot's Multiface, were a popular accessory for 1980s 8-bit and 16-bit home

computers. These peripherals had a small amount of ROM and an NMI button. Pressing the button

transferred control to the software in the peripheral's ROM, allowing the suspended program to be

saved to disk (very useful for tape-based games with no disk support, but also for saving games

in progress), screenshots to be saved or printed, or values in memory to be manipulated -- a

cheating technique to acquire extra lives, for example.

Some floppy disk interfaces, such as the Miles Gordon Technology's DISCiPLE and PlusD for the

ZX Spectrum, also included an NMI button.

Intel 8087

The 8087 was the first math coprocessor designed by Intel and it was built to be paired with the

Intel 8088 and 8086 microprocessors. The purpose of the 8087, the first of the x87 family, was to

speed up computations on demanding applications involving floating point mathematics. The

performance enhancements went from 20% to 500% depending on the specific application.

This coprocessor introduced about 60 new instructions available to the programmer, all

beginning with "F" to differentiate them from the standard 8086/88 integer math instructions. For

example, in contrast to ADD/MUL, the 8087 provided FADD/FMUL.

The 8087 (and, in fact, the entire x87 family) does not provide a freely addressable, linear register set such as

the AX/BX/CX/DX registers of the 8086/88 and 80286 processors -- the x87 registers are structured

as a form of stack (although it is not exactly like a typical stack data structure) ranging from

ST0 to ST7. The floating point instructions of the 80x87 coprocessors operate by popping and

pushing values onto this stack.
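The register stack can be modelled in C as eight slots plus a "top" index, with ST0 always referring to the slot at the top. The sketch below is a simplified model using double instead of the 80-bit internal format and omitting tag words and exceptions; fld and faddp are named after the real x87 instructions they imitate, but the C functions themselves are illustrative.

```c
#include <assert.h>

typedef struct {
    double reg[8];   /* the eight physical registers */
    unsigned top;    /* index of the register currently called ST0 */
    unsigned depth;  /* how many values are on the stack */
} X87;

static void x87_init(X87 *f)
{
    f->top = 0;
    f->depth = 0;
}

/* Push: move the top down by one (modulo 8) and store the value there. */
static void fld(X87 *f, double value)
{
    assert(f->depth < 8);
    f->top = (f->top + 7) & 7;
    f->reg[f->top] = value;
    f->depth++;
}

/* Read ST(i), counted from the current top of stack. */
static double st(const X87 *f, unsigned i)
{
    assert(i < f->depth);
    return f->reg[(f->top + i) & 7];
}

/* Add ST0 to ST1, store the sum in ST1, then pop so the sum becomes ST0. */
static void faddp(X87 *f)
{
    double sum;

    assert(f->depth >= 2);
    sum = st(f, 1) + st(f, 0);
    f->top = (f->top + 1) & 7;
    f->depth--;
    f->reg[f->top] = sum;
}
```

Pushing 1.5 and then 2.25 leaves 2.25 in ST0 and 1.5 in ST1; after faddp() the stack holds the single value 3.75 in ST0.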

When Intel designed the 8087 it aimed to make a standard floating point format for future designs.

In fact, one of the most successful things from a historical perspective of this coprocessor was

the introduction of the first floating point standard for the x86 PCs: the IEEE 754. The 8087

provided two basic 32/64-bit floating point data types and an additional extended 80-bit internal

support to improve accuracy over large and complex calculations. Apart from this, the 8087

offered an 80-bit/18-digit packed BCD (binary coded decimal) format and 16-, 32- and 64-bit integer

data types.

The 8087, announced in 1980, was superseded by the 80287, 80387DX/SX and the 487SX. Intel

80486DX, Pentium and later processors include a built-in coprocessor on the CPU core.

Pin Diagram of Intel 8087

Peripheral

A peripheral is a type of computer hardware that is added to a host computer, in order to expand

its abilities. More specifically the term is used to describe those devices that are optional in

nature, as opposed to hardware that is either demanded, or always required in principle.

The term also tends to be applied to devices that are hooked up externally, typically through some

form of computer bus like USB. Typical examples include joysticks, printers and scanners.

Devices such as monitors and disk drives are not considered peripherals because they are not

truly optional, and video capture cards are typically not referred to as peripherals because they are

internal devices.

Programmable Peripheral Interface (82C55)

The 82C55 is a popular interfacing component, that can interface any TTL-compatible I/O device

to the microprocessor.

It is used to interface to the keyboard and a parallel printer port in PCs (usually as part of an

integrated chipset).

Requires insertion of wait states if used with a microprocessor clocked higher than 8 MHz.

PPI has 24 pins for I/O that are programmable in groups of 12 pins and has three distinct modes

of operation.

In the PC, an 82C55 or its equivalent is decoded at I/O ports 60H-63H.

Pinout of 82C55 PPI

Interfacing the 82C55 PPI

Programming the 82C55

82C55: Mode 0 Operation

Mode 0 operation causes the 82C55 to function as a buffered input device or as a latched output

device.

In previous example, both ports A and B are programmed as (mode 0) simple latched output

ports.

Port A provides the segment data inputs to display and port B provides a means of selecting one

display position at a time.

Different values are displayed in each digit via fast time multiplexing.

The values for the resistors and the type of transistors used are determined using the current

requirements (see text for details).

Textbook has the assembly code fragment demonstrating its use.

Examples of connecting LCD displays and stepper motors are also given.
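Selecting a mode is done by writing a command byte to the 82C55's command register. As a rough sketch, the helper below assembles that byte from the standard 8255 mode-set layout (bit 7 set, bits 6-5 group A mode, bit 4 port A direction, bit 3 upper port C direction, bit 2 group B mode, bit 1 port B direction, bit 0 lower port C direction, where a direction bit of 1 means input). That layout comes from the 8255 data sheet rather than from these notes, and the function and constant names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { PPI_OUTPUT = 0, PPI_INPUT = 1 };

/* Build the mode-set command byte for the 82C55.
   group_a_mode: 0, 1 or 2 (port A and upper port C)
   group_b_mode: 0 or 1    (port B and lower port C) */
static uint8_t ppi_mode_word(unsigned group_a_mode, int port_a_dir,
                             int port_c_upper_dir, unsigned group_b_mode,
                             int port_b_dir, int port_c_lower_dir)
{
    assert(group_a_mode <= 2 && group_b_mode <= 1);
    return (uint8_t)(0x80u                      /* bit 7: mode-set command */
        | (group_a_mode << 5)
        | ((port_a_dir ? 1u : 0u) << 4)
        | ((port_c_upper_dir ? 1u : 0u) << 3)
        | (group_b_mode << 2)
        | ((port_b_dir ? 1u : 0u) << 1)
        | (port_c_lower_dir ? 1u : 0u));
}
```

For the display example, with ports A and B (and port C) all programmed as mode 0 latched outputs, the helper returns 0x80; that byte would then be written to the command register.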


82C55: Mode 1 Strobed Input

Port A and/or port B function as latching input devices. External data is stored in the ports until

the microprocessor is ready.

Port C used for control or handshaking signals (cannot be used for data).

Signal definitions for Mode 1 Strobed Input


82C55: Mode 1 Strobed Input Example

Keyboard encoder debounces the key-switches, and provides a strobe whenever a key is

depressed.

DAV is activated on a key press strobing the ASCII-coded key code into Port A.

82C55: Mode 1 Strobed Output

Similar to Mode 0 output operation, except that handshaking signals are provided using port C.

Signal Definitions for Mode 1 Strobed Output


82C55: Mode 2 Bi-directional Operation

Only allowed with port A. Bi-directional bused data used for interfacing

two computers, GPIB interface etc.


Timing diagram is a combination of the Mode 1 Strobed Input and Mode 1 Strobed Output Timing

diagrams.

PIC 8259

What are Interrupts?

When receiving data and change in status from I/O Ports, we have two methods available to us.

We can Poll the port, which involves reading the status of the port at fixed intervals to determine

whether any data has been received or a change of status has occurred. If so, then we can branch

to a routine to service the port's requests.

As you could imagine, polling the port would consume quite some time, time which could be

used doing other things such as refreshing the screen, displaying the time etc. A better alternative

would be to use Interrupts. Here, the processor does your tasks such as refreshing the screen,

displaying the time etc, and when an I/O Port/Device needs attention because a byte has been received or

its status has changed, it sends an Interrupt Request (IRQ) to the processor.

Once the processor receives an Interrupt Request, it finishes its current instruction, places a few

things on the stack, and executes the appropriate Interrupt Service Routine (ISR) which can

remove the byte from the port and place it in a buffer. Once the ISR has finished, the processor

returns to where it left off.

Using this method, the processor doesn't have to waste time, looking to see if your I/O Device is

in need of attention, but rather the device will interrupt the processor when it needs attention.

Interrupts and Intel Architecture

Interrupts do not have to be entirely associated with I/O devices. The 8086 family of

microprocessors provides 256 interrupts; many of these are only for use as software interrupts,

which we do not attempt to explain in this document.

The 8086 series of microprocessors has an Interrupt Vector Table situated at 0000:0000 which extends for 1024 bytes. The Interrupt Vector Table holds the

addresses of the Interrupt Service Routines (ISRs), each four bytes in length. This gives us room for the 256 Interrupt Vectors.
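Because every entry is four bytes long, the table entry for interrupt n starts at offset n * 4. The sketch below computes that offset and reads an entry from a byte array standing in for the first 1024 bytes of memory. It assumes the usual 8086 layout of each entry (16-bit offset first, then 16-bit segment, both little-endian), which is not spelled out above; the names are illustrative.

```c
#include <stdint.h>

typedef struct {
    uint16_t offset;   /* IP value of the handler */
    uint16_t segment;  /* CS value of the handler */
} FarPointer;

/* Byte offset of the table entry for a given interrupt number. */
static uint32_t vector_entry_address(uint8_t vector)
{
    return (uint32_t)vector * 4u;
}

/* Read one entry from a copy of the 1024-byte Interrupt Vector Table. */
static FarPointer read_vector(const uint8_t *ivt, uint8_t vector)
{
    const uint8_t *entry = ivt + vector_entry_address(vector);
    FarPointer p;

    p.offset  = (uint16_t)(entry[0] | (entry[1] << 8));
    p.segment = (uint16_t)(entry[2] | (entry[3] << 8));
    return p;
}
```

For example, the entry for INT 08h starts at 0000:0020, and the last entry (INT FFh) starts at 0000:03FC, which is why the table occupies exactly 1024 bytes.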

INT (Hex)   IRQ                   Common Uses
00 - 01     Exception Handlers    -
02          Non-Maskable IRQ      Non-Maskable IRQ (Parity Errors)
03 - 07     Exception Handlers    -
08          Hardware IRQ0         System Timer
09          Hardware IRQ1         Keyboard
0A          Hardware IRQ2         Redirected
0B          Hardware IRQ3         Serial Comms. COM2/COM4
0C          Hardware IRQ4         Serial Comms. COM1/COM3
0D          Hardware IRQ5         Reserved/Sound Card
0E          Hardware IRQ6         Floppy Disk Controller
0F          Hardware IRQ7         Parallel Comms.
10 - 6F     Software Interrupts   -
70          Hardware IRQ8         Real Time Clock
71          Hardware IRQ9         Redirected IRQ2
72          Hardware IRQ10        Reserved
73          Hardware IRQ11        Reserved
74          Hardware IRQ12        PS/2 Mouse
75          Hardware IRQ13        Maths Co-Processor
76          Hardware IRQ14        Hard Disk Drive
77          Hardware IRQ15        Reserved
78 - FF     Software Interrupts   -

Table 1 : x86 Interrupt Vectors

The average PC has only 15 hardware IRQs plus one Non-Maskable IRQ. The rest of the interrupt

vectors are used for software interrupts and exception handlers. Exception handlers are routines,

like ISRs, which get called when an error results. One example is the first

Interrupt Vector, which holds the address of the Divide By Zero exception handler. When a divide

by zero occurs the microprocessor fetches the address at 0000:0000 and starts executing the

code at this address.

Hardware Interrupts

The Programmable Interrupt Controller (PIC) handles hardware interrupts. Most PCs have two

of them located at different addresses. One handles IRQs 0 to 7 and the other IRQs 8 to 15,

giving a total of 15 individual IRQ lines, as the second PIC is cascaded into the first, using IRQ2.

Most of the PIC's initialization is done by the BIOS, thus we only have to worry about two

instructions. The PIC has a facility where we can mask individual IRQs so that these

requests will not reach the processor. Thus the first instruction is to the Operation Control Word

1 (OCW1) to set which IRQs to mask and which not to.

As there are two PICs located at different addresses, we must first determine which PIC we need to use. The first PIC, located at base address 0x20,

controls IRQ 0 to IRQ 7. The bit format of PIC1's Operation Control Word 1 is shown below in Table 2.

Bit   Disable IRQ   Function
7     IRQ7          Parallel Port
6     IRQ6          Floppy Disk Controller
5     IRQ5          Reserved/Sound Card
4     IRQ4          Serial Port
3     IRQ3          Serial Port
2     IRQ2          PIC2
1     IRQ1          Keyboard
0     IRQ0          System Timer

Table 2 : PIC1 Operation Control Word 1 (0x21)

Note that IRQ 2 is connected to PIC2, thus if you mask this IRQ, then you will be disabling IRQ's 8

to 15.

The second PIC, located at a base address of 0xA0, controls IRQs 8 to 15. Below are the individual

bits required to make up its Operation Control Word.

Bit   Disable IRQ   Function
7     IRQ15         Reserved
6     IRQ14         Hard Disk Drive
5     IRQ13         Maths Co-Processor
4     IRQ12         PS/2 Mouse
3     IRQ11         Reserved
2     IRQ10         Reserved
1     IRQ9          Redirected IRQ2
0     IRQ8          Real Time Clock

Table 3 : PIC2 Operation Control Word 1 (0xA1)

As the tables above show the bits required to disable an IRQ, a bit must be cleared to 0 to enable the corresponding IRQ. For example, to enable IRQ 3 we could send the byte 0xF7 as OCW1 to PIC1. But writing that byte blindly would also mask every other IRQ on PIC1, including any that were already enabled.

Therefore we must first read the current mask, AND it with our value, and write the result back to the register, so as to cause the least upset to the other IRQs. Going back to our IRQ3 example, we could use outportb(0x21,(inportb(0x21) & 0xF7)); to enable IRQ3. Take note that OCW1 goes to the register at Base + 1.

The same procedure must be used to mask (disable) an IRQ once we are finished with it. However, this time we must OR the byte 0x08 with the contents of OCW1. An example of such code is

outportb(0x21,(inportb(0x21) | 0x08));
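The read-modify-write idea above can be generalized to any IRQ from 0 to 15. The following is only a sketch: the helper names (pic_mask_port, irq_bit, mask_with_irq_enabled, mask_with_irq_disabled) are hypothetical and not part of dos.h, and the functions only compute the port and the new mask byte so that they can be tried on any compiler without touching real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* OCW1 (mask) port of the PIC that owns this IRQ: Base + 1. */
uint16_t pic_mask_port(unsigned irq)
{
    assert(irq < 16);
    return (irq < 8) ? 0x21 : 0xA1;
}

/* Bit that represents this IRQ inside its own PIC's mask register. */
uint8_t irq_bit(unsigned irq)
{
    assert(irq < 16);
    return (uint8_t)(1u << (irq & 7u));
}

/* New mask with this IRQ enabled (bit cleared), other bits untouched. */
uint8_t mask_with_irq_enabled(uint8_t current_mask, unsigned irq)
{
    return (uint8_t)(current_mask & (uint8_t)~irq_bit(irq));
}

/* New mask with this IRQ disabled (bit set), other bits untouched. */
uint8_t mask_with_irq_disabled(uint8_t current_mask, unsigned irq)
{
    return (uint8_t)(current_mask | irq_bit(irq));
}
```

With the port functions used in this article, enabling IRQ3 would then read outportb(pic_mask_port(3), mask_with_irq_enabled(inportb(pic_mask_port(3)), 3)); which performs the same write as the 0xF7 example above.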

The other PIC instruction we have to worry about is the End of Interrupt (EOI). This is sent to the

PIC at the end of the Interrupt Service Routine so that the PIC can reset the In Service Register.

See The Programmable Interrupt Controller for more information. An EOI can be sent using

outportb(0x20,0x20); for PIC1 or outportb(0xA0,0x20); for PIC2

Implementing the Interrupt Service Routine (ISR)

In C you can implement your ISR using void interrupt yourisr() where yourisr is a far pointer,

pointing to the address that your Interrupt Service Routine will reside in memory. This is later

placed in the Interrupt Vector Table so that, it will be called when interrupted.

The following code is a basic implementation of an ISR.

void interrupt yourisr() /* Interrupt Service Routine (ISR) */
{
    disable();
    /* Body of ISR goes here */
    oldhandler();
    outportb(0x20,0x20); /* Send EOI to PIC1 */
    enable();
}

void interrupt yourisr() defines this function as an Interrupt Service Routine. disable(); clears the interrupt flag, so that no other hardware interrupts, except an NMI (Non-Maskable Interrupt), can occur. Otherwise, an interrupt with a higher priority than this one can interrupt the execution of this ISR. However, this is not really a problem in many cases, so the call is optional.

The body of your ISR will include the code you want to execute when this interrupt request is activated. Most ports/UARTs may interrupt the processor for a range of reasons, e.g. byte received, time-outs, FIFO buffer empty, overruns etc., so the nature of the interrupt has to be determined. This is normally achieved by reading the status registers of the port you are using. Once it has been established, you can service its requests.

If you read any data from a port, it is common practice to place it in a buffer rather than immediately writing it to the screen, which would hold up the processing of further interrupts. Most ports these days have FIFO buffers which can contain more than one byte, so repeat your read routine until the FIFO is empty, then exit your ISR.
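One common way to build such a buffer is a small ring buffer that the ISR fills and the main program empties. The sketch below is an illustration only: the type rx_ring, the functions rx_ring_put and rx_ring_get, and the size RX_BUF_SIZE are hypothetical names chosen for this example, and it assumes a single ISR writing and a single main loop reading.

```c
#include <stdint.h>

#define RX_BUF_SIZE 64u   /* hypothetical size; must be a power of two */

struct rx_ring {
    volatile uint8_t data[RX_BUF_SIZE];
    volatile unsigned head;   /* advanced by the ISR only */
    volatile unsigned tail;   /* advanced by the main program only */
};

/* Called from the ISR. Returns 1 if stored, 0 if the buffer was full. */
int rx_ring_put(struct rx_ring *r, uint8_t byte)
{
    unsigned next = (r->head + 1u) & (RX_BUF_SIZE - 1u);
    if (next == r->tail)
        return 0;             /* full: the byte is dropped */
    r->data[r->head] = byte;
    r->head = next;
    return 1;
}

/* Called from the main program. Returns 1 and fills *out, or 0 if empty. */
int rx_ring_get(struct rx_ring *r, uint8_t *out)
{
    if (r->tail == r->head)
        return 0;
    *out = r->data[r->tail];
    r->tail = (r->tail + 1u) & (RX_BUF_SIZE - 1u);
    return 1;
}
```

Inside the ISR you would call rx_ring_put() once for every byte read from the port until the FIFO is empty, while the body of the program polls rx_ring_get() and does the slow work such as writing to the screen.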

In some cases it may be appropriate to chain the old ISR to this one. Such an example would be

the Clock Interrupt. Other TSR or resident programs may also be using it, thus if you intercept the

interrupt and keep it all for yourself, the other ISR's can no longer function possibly causing some

side effects. However for Serial/Parallel Ports this is not a problem. To chain the old ISR to your

new ISR, you can call it using oldhandler(); where oldhandler points to your old ISR.

Before we can return from the interrupt, we must tell the Programmable Interrupt Controller that we are ending the interrupt by sending an EOI (End of Interrupt, 0x20) to it. As there are two PICs, you must first establish which one to send it to. Use outportb(0x20,0x20); for PIC 1 (IRQ 0 - 7) or outportb(0xA0,0x20); for PIC 2 (IRQ 8 - 15).

Note: If using PIC2, then an EOI has to be sent to both PIC1 and PIC2.
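The rule in the note can be wrapped in one helper that decides which PICs need an EOI. This is a sketch under stated assumptions: send_eoi, sim_outportb and sim_port are hypothetical names, the port write is passed in as a function pointer so the example can run against a simulated I/O space instead of real hardware, and sending the EOI to PIC2 before PIC1 is an assumed ordering that the text does not prescribe.

```c
#include <assert.h>
#include <stdint.h>

#define PIC1_BASE 0x20   /* command port of PIC1 (IRQ 0 - 7)  */
#define PIC2_BASE 0xA0   /* command port of PIC2 (IRQ 8 - 15) */
#define PIC_EOI   0x20   /* non-specific End of Interrupt      */

/* Simulated I/O space (hypothetical) so the sketch runs on any host. */
uint8_t sim_port[0x100];

void sim_outportb(uint16_t port, uint8_t value)
{
    sim_port[port & 0xFFu] = value;
}

typedef void (*port_write_fn)(uint16_t port, uint8_t value);

/* Send the EOI(s) required for the given IRQ. */
void send_eoi(unsigned irq, port_write_fn write_port)
{
    assert(irq < 16);
    if (irq >= 8)
        write_port(PIC2_BASE, PIC_EOI);  /* request came through PIC2 */
    write_port(PIC1_BASE, PIC_EOI);      /* PIC1 is always involved */
}
```

In a real DOS program you would pass a small wrapper around outportb() instead of sim_outportb.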

Using your new Interrupt Service Routine

Now that we have written our new interrupt service routine, we can start looking at how to

implement it. The following code segment shows the basic usage of your new ISR. For this

example we have chosen to use IRQ 3.

#include <dos.h>

#define INTNO 0x0B /* Interrupt Number - See Table 1 */

void interrupt (*oldhandler)(); /* Saved old interrupt vector */

void main(void)
{
    oldhandler = getvect(INTNO); /* Save Old Interrupt Vector */
    setvect(INTNO, yourisr); /* Set New Interrupt Vector Entry */
    outportb(0x21,(inportb(0x21) & 0xF7)); /* Un-Mask (Enable) IRQ3 */
    /* Set Card - Port to Generate Interrupts */
    /* Body of Program Goes Here */
    /* Reset Card - Port as to Stop Generating Interrupts */
    outportb(0x21,(inportb(0x21) | 0x08)); /* Mask (Disable) IRQ3 */
    setvect(INTNO, oldhandler); /* Restore old Interrupt Vector Before Exit */
}

Before you can place the address of your new ISR in the interrupt vector table, you must first save the old interrupt vector, so that you can restore it once you exit your program. This is done using oldhandler = getvect(INTNO); where INTNO is the number of the interrupt vector you wish to retrieve. Before oldhandler can be used, you must first declare it using void interrupt (*oldhandler)();

Once the old interrupt vector is stored, we can now install your new ISR into the interrupt vector

table. This is done using the line setvect(INTNO, yourisr); where yourisr points to your interrupt

service routine.

The IRQ which you are using must now be unmasked. We have already discussed this earlier. See

Hardware Interrupts.

Most Ports/UARTs will need some initialization to be able to generate interrupts. For Example, The

Standard Parallel Port (SPP) will require Bit 4 of the Control Port, Enable IRQ Via ACK Line to be

set at Base + 2. The Serial Port will require the appropriate setting of Bits 0 to 4 of the Interrupt

Enable Register (IER) located at Base + 1.

Your body of the program normally consists of a few housekeeping tasks depending upon your

application. Here you look for new keys pressed, menus being selected, updating clocks, checking for

incoming data in buffers etc, knowing that any data from your Ports will be automatically read and

processed, by the ISR.

If you like implementing ISRs so much, you can attach your own ISR to the keyboard interrupt, so that any keys pressed are handled automatically by another ISR, and even one to the clock. The clock ticks about 18.2 times per second, so after roughly every 18 ticks you can update the seconds on your display! The possibilities of ISRs are endless.

Before you exit your program always restore the old interrupt vector, so that your computer

doesn't become unstable. This is done using setvect(INTNO, oldhandler); , where oldhandler

points to the old interrupt service routine, which we stored using oldhandler = getvect(INTNO);

The Programmable Interrupt Controller

As we have already discussed, the Interrupt ReQuests (IRQs) of a PC are handled by two 8259 Programmable Interrupt Controllers. On the old XTs/ATs these were 28-pin DIP ICs, but as you can imagine, things have changed dramatically since then. While the operational principle is still the same, the PIC is now integrated somewhere into your chipset, along with many other devices.

The basic block diagram of the PIC is shown above. The 8 individual interrupt request lines are

first passed through the Interrupt Mask Register (IMR) to see if they have been masked or not. If

they are masked, then the request isn't processed any further. However if they are not masked,

they will register their request with the Interrupt Request Register (IRR).

The Interrupt Request Register will hold all the requested IRQ's until they have been dealt with

appropriately. If required, this register can be read by setting certain bits of the Operation Control

Word 3. The Priority Resolver simply selects the IRQ of highest priority. The higher priority

interrupts are the lower numbered ones. For Example IRQ 0 has the highest priority followed by

IRQ 1 etc.
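The job of the Priority Resolver can be modelled in a few lines. The function below is a sketch with a hypothetical name, highest_priority_request, and it only models the default fixed-priority scheme described above for a single PIC, where the lowest-numbered pending request wins.

```c
#include <stdint.h>

/* Returns the IR line (0 - 7) with the highest priority among the
   bits set in the Interrupt Request Register, or -1 if none is set. */
int highest_priority_request(uint8_t irr)
{
    int line;
    for (line = 0; line < 8; line++) {
        if (irr & (1u << line))
            return line;      /* lowest number = highest priority */
    }
    return -1;
}
```

For example, if IR3 and IR5 are both pending (IRR = 0x28), the resolver picks IR3. In the cascaded PC arrangement the requests from PIC2 arrive on IR2 of PIC1, so this sketch would have to be applied to each PIC separately.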

Now that the PIC has determined which IRQ to process, it is time to tell the processor, so that it can call your ISR for you. This is done by sending an INT to the processor, i.e. the INT line on the processor is asserted. The processor then finishes the instruction it is currently processing and acknowledges the INT request with an INTA (Interrupt Acknowledge) pulse.

Upon receiving the processor's INTA, the IRQ which the PIC is processing at the time is stored in

the In Service Register (ISR) which as the name suggests, shows which IRQ is currently in

service. The IRQ's bit is also reset in the Interrupt Request Register, as it is no longer requesting

service but actually getting service.

Another INTA pulse will be sent by the processor, to tell the PIC to place a 8 bit pointer on the

data bus, corresponding to the IRQ number. If an IRQ serviced by PIC2 is requesting the service,

then PIC2 will send the pointer to the processor. The Master (PIC1) at this stage, will select PIC2

to send the pointer, by placing PIC2's Slave ID on the Cascade lines, which is a 3 wire bus

between all the PIC's in a system.

The 5 most significant bits of this pointer are set using Initialization Command Word 2 (ICW2). This will be 00001 for PIC1 and 01110 for PIC2. The three least significant bits depend on which IRQ is being serviced. For example, if IRQ3 is requesting service, then the 8 bit pointer is made up of 00001 for the 5 most significant bits and 011 (IR3) for the least significant bits. Put these together and you get 00001011, or 0x0B, which is IRQ3's interrupt vector.

For PIC2, the same principle applies. If IRQ10 is requesting service, then 01110010 will be sent, which represents Interrupt 72h. IRQ10 is connected to IR2 on the second PIC, so 010 is used as the least significant bits.
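The way the 8 bit pointer is assembled can be checked with a short function. This is a sketch only: irq_to_vector and the two base constants are hypothetical names, and the base values 0x08 and 0x70 are the standard PC settings quoted in the text, not something the code reads back from the PIC.

```c
#include <assert.h>
#include <stdint.h>

#define PIC1_VECTOR_BASE 0x08   /* ICW2 of PIC1: 00001xxx */
#define PIC2_VECTOR_BASE 0x70   /* ICW2 of PIC2: 01110xxx */

/* Combine the 5 bits from ICW2 with the 3-bit IR number. */
uint8_t irq_to_vector(unsigned irq)
{
    uint8_t base;
    assert(irq < 16);
    base = (irq < 8) ? PIC1_VECTOR_BASE : PIC2_VECTOR_BASE;
    return (uint8_t)((base & 0xF8u) | (irq & 7u));
}
```

Calling irq_to_vector(3) gives 0x0B and irq_to_vector(10) gives 0x72, matching the two worked examples above.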

Once your ISR has done everything it needs, it sends an End of Interrupt (EOI) to the PIC, which

resets the In-Service Register. If the request came from PIC2, then EOI's are required to be sent to

both PICs. The PIC will then determine the next highest priority interrupt and repeat the same

process. If no Interrupt Requests are present, then the PIC waits for the next request before

interrupting the processor.

IRQ2/IRQ9 Redirection

The redirection of IRQ2 causes quite some confusion, and thus is discussed here. In the original XTs there was only one PIC, and thus only eight IRQs. However, users soon outgrew these resources, so an additional 7 IRQs were added to the PC. This involved attaching another PIC to the one already in the XT. Compatibility always causes problems, as the new configuration still had to be compatible with old hardware and software. The "new" configuration is shown below.

The CPU has only one interrupt line, so the second controller had to be connected to the first controller in a master/slave configuration. IRQ2 was selected for this. With IRQ2 taken by the second controller, no other device could use it, so what happened to all the devices already using IRQ2? Nothing: the interrupt request line found on the bus was simply diverted into the IRQ 9 input. As no devices yet used the second PIC or IRQ9, this could be done.

The next problem was that a hardware device using IRQ2 would install its ISR at INT 0x0A. Therefore an ISR was installed at INT 71h which sent an EOI to PIC2 and then called the ISR at INT 0x0A. If you disassemble the ISR for IRQ9, it will look a little like this:

MOV AL,20
OUT A0,AL ; Send EOI to PIC2
INT 0A ; Call ISR for IRQ2
IRET

The routine only has to send an EOI to PIC2, as it is expected that an ISR written for IRQ2 will send an EOI to PIC1. This example destroys the contents of register AL, so AL must be saved on the stack first (not shown in the example). As PIC1 is initialized with a slave on IRQ2, any request using PIC2 will not call the ISR for IRQ2. The 8 bit pointer will come from PIC2.

Programmable Interrupt Controller's Addresses

The two PICs found in an IBM compatible system are initialized by the BIOS, so you don't have to worry about all of their registers. However, for those with inquisitive minds, the following information may be of some use, or maybe you want to (re)program a BIOS? Below is a table of the command words of the 8259 and compatible Programmable Interrupt Controllers. The top table shows the addresses for PIC1, while the bottom table shows the addresses for PIC2.

Address Read/Write Function

0x20 Write Initialization Command Word 1 (ICW1)

0x20 Write Operation Control Word 2 (OCW2)

0x20 Read Interrupt Request Register (IRR)

0x20 Read In-Service Register (ISR)

0x21 Read/Write Interrupt Mask Register (IMR)

Table 4 : Addresses/Registers for PIC1

PIC2 Addresses . . .

Address Read/Write Function

0xA0 Read Interrupt Request Register (IRR)

0xA0 Read In-Service Register (ISR)

0xA1 Read/Write Interrupt Mask Register (IMR)

Table 5 : Addresses/Registers for PIC2

Initialization Command Word 1 (ICW1)

If the PIC has been reset, it must be initialized with 2 to 4 Initialization Command Words (ICW) before it will accept and process Interrupt Requests. The following section outlines the four possible Initialization Command Words.

Bit(s) Value Function

7:5 Interrupt Vector Addresses for MCS-80/85 Mode.

4 Must be set to 1 for ICW1

3 1 Level Triggered Interrupts

0 Edge Triggered Interrupts

2 1 Call Address Interval of 4

0 Call Address Interval of 8

1 1 Single PIC

0 Cascaded PICs

0 1 Will be Sending ICW4

0 Don't need ICW4

Table 6 : Initialization Command Word 1 (ICW1)

The 8259 Programmable Interrupt Controller offers many other features which are not used in the PC. It also offers support for MCS-80/85 microprocessors. All we have to be aware of, as PC users, is whether the system is running in single mode (one PIC) or in cascaded mode (more than one PIC), and whether Initialization Command Word 4 is needed. If no ICW4 is used, then all of its bits will be set to 0. As we are using the PIC in 8086 mode, we must send an ICW4.

Bit 8086/8080 Mode MCS 80/85 Mode

7 I7 A15

6 I6 A14

5 I5 A13

4 I4 A12

3 I3 A11

2 - A10

1 - A9

0 - A8

Initialization Command Word 2 (ICW2) selects which vector information is released onto the bus,

during the 2nd INTA Pulse. Using the 8086 mode, only bits 7:3 need to be used. This will be

00001000 (0x08) for PIC1 and 01110000 (0x70) for PIC2. If you wish to relocate the IRQ Vector

Table, then you can use this register.

There are two different Initialization Command Word 3's. One is used, if the PIC is a master, while

the other is used for slaves. The top table shows the ICW3 for the master.

Bit Function

7 IR7 is connected to a Slave

Table 8 : Initialization Command Word 3 for Master PIC (ICW3)

And for the slave device, the ICW3 below is used.

Bit(s) Function

7:3 Reserved. Set to 0

2:0 Slave ID

000 Slave 0

001 Slave 1

010 Slave 2

011 Slave 3

100 Slave 4

101 Slave 5

110 Slave 6

111 Slave 7

Table 9 : Initialization Command Word 3 for Slaves (ICW3)

Initialization Command Word 4 (ICW4)

Bit(s) Value Function

4 1 Special Fully Nested Mode

0 Not Special Fully Nested Mode

3:2 0x Non - Buffered Mode

10 Buffered Mode - Slave

11 Buffered Mode - Master

1 1 Auto EOI

0 Normal EOI

0 1 8086/8080 Mode

0 MCS-80/85

Once again, many of these are special functions not used with the 8259 PIC in a PC. We don't use Special Fully Nested Mode, so this bit is set to 0. Likewise, we use non-buffered mode and normal EOIs, so all the corresponding bits are set to 0. The only thing we must set is 8086/8080 Mode, which is done using Bit 0.
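Putting the four tables together, the initialization words for the standard PC arrangement can be composed as below. This is a sketch, not the BIOS's actual code: the struct pic_init_words and the functions pc_master_init and pc_slave_init are hypothetical, and the values assume edge triggered interrupts, cascaded PICs, vector bases 0x08 and 0x70, and the slave attached to IR2 of the master as described in this article.

```c
#include <stdint.h>

#define ICW1_INIT      0x10   /* bit 4: must be 1 for ICW1 */
#define ICW1_NEED_ICW4 0x01   /* bit 0: ICW4 will be sent  */
#define ICW4_8086      0x01   /* bit 0: 8086/8080 mode     */

struct pic_init_words {
    uint8_t icw1, icw2, icw3, icw4;
};

/* PIC1: master, vectors 0x08 - 0x0F, slave attached to IR2. */
struct pic_init_words pc_master_init(void)
{
    struct pic_init_words w;
    w.icw1 = ICW1_INIT | ICW1_NEED_ICW4; /* edge triggered, cascaded */
    w.icw2 = 0x08;                       /* 00001xxx vector base */
    w.icw3 = (uint8_t)(1u << 2);         /* bit 2: IR2 has a slave */
    w.icw4 = ICW4_8086;                  /* normal EOI, non-buffered */
    return w;
}

/* PIC2: slave with ID 2, vectors 0x70 - 0x77. */
struct pic_init_words pc_slave_init(void)
{
    struct pic_init_words w;
    w.icw1 = ICW1_INIT | ICW1_NEED_ICW4;
    w.icw2 = 0x70;                       /* 01110xxx vector base */
    w.icw3 = 0x02;                       /* slave ID 010 */
    w.icw4 = ICW4_8086;
    return w;
}
```

On the 8259, ICW1 is written to the base address and ICW2 to ICW4 follow at Base + 1, so a real initialization routine would send these four bytes in that order to each PIC.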

Operation Control Word 1 (OCW1)

Once all the required Initialization Command Words have been sent to the PIC, then you

can send Operation Control Words, in any order and at any time during the PIC's

operation. The Operation Control Words are shown in the next sections.

Bit PIC 2 PIC 1

7 Mask IRQ15 Mask IRQ7

Table 11 : Operation Control Word 1 (OCW1)

Operation Control Word 1, shown above is used to mask the inputs of the PIC. This has already

been discussed, earlier in this article.

Bit(s) Function

7:5 000 Rotate in Auto EOI Mode (Clear)

001 Non Specific EOI

010 Reserved

011 Specific EOI

100 Rotate in Auto EOI Mode (Set)

101 Rotate on Non-Specific EOI

110 Set Priority Command (Use Bits 2:0)

111 Rotate on Specific EOI (Use Bits 2:0)

4 Must be set to 0

3 Must be set to 0

2:0 000 Act on IRQ 0 or 8

001 Act on IRQ 1 or 9

Operation Control Word 2 selects how the End of Interrupt (EOI) procedure works. The only thing

of interest to us in this register is the non-specific EOI command, which we must send at the end

of our ISR's.
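The 0x20 byte used for the EOI earlier in this article falls straight out of the OCW2 bit layout. The helpers below are a sketch with hypothetical names (ocw2_non_specific_eoi, ocw2_specific_eoi) that simply place the command code in bits 7:5 and, for the specific form, the IR level in bits 2:0.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 7:5 = 001: Non Specific EOI. */
uint8_t ocw2_non_specific_eoi(void)
{
    return (uint8_t)(0x1u << 5);
}

/* Bits 7:5 = 011: Specific EOI; bits 2:0 select the IR level. */
uint8_t ocw2_specific_eoi(unsigned level)
{
    assert(level < 8);
    return (uint8_t)((0x3u << 5) | level);
}
```

ocw2_non_specific_eoi() evaluates to 0x20, the value written to port 0x20 or 0xA0 at the end of an ISR.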

Bit(s) Function

7 Must be set to 0

6:5 00 Reserved

01 Reserved

10 Reset Special Mask

11 Set Special Mask

4 Must be set to 0

3 Must be set to 1

2 1 Poll Command

0 No Poll Command

1:0 00 Reserved

01 Reserved

10 Next Read Returns Interrupt Request Register

11 Next Read Returns In-Service Register

Bits 0 and 1 are of the most interest to us in Operation Control Word 3. These two bits enable us to read the status of the Interrupt Request Register (IRR) and the In-Service Register (ISR). This is done by setting the bits as shown above and then reading the register at the Base Address.

For example if we wanted to read the In-Service Register (ISR), then we would set both bits 1 and

0 to 1. The next read to the base register, (0x20 for PIC1 or 0xA0 for PIC2) will return the status of

the In-Service Register.
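The OCW3 byte for such a read can be built from the table above. The function name ocw3_select_read is hypothetical, and the sketch assumes no poll command and no change to the special mask, so only bit 3 and bits 1:0 are set.

```c
#include <stdint.h>

/* Build an OCW3 that selects which register the next read returns.
   want_in_service != 0 selects the ISR, otherwise the IRR. */
uint8_t ocw3_select_read(int want_in_service)
{
    uint8_t ocw3 = 0x08;            /* bit 3 must be set to 1 */
    ocw3 |= 0x02;                   /* bit 1: a register read is requested */
    if (want_in_service)
        ocw3 |= 0x01;               /* bits 1:0 = 11: In-Service Register */
    return ocw3;                    /* bits 1:0 = 10: IRR otherwise */
}
```

Writing ocw3_select_read(1), which is 0x0B, to port 0x20 and then reading port 0x20 would return PIC1's In-Service Register; 0x0A selects the Interrupt Request Register instead.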

http://www.absoluteastronomy.com/reference/list_of_intel_microprocessors

List of Intel microprocessors

This list of Intel microprocessors attempts to present all of Intel's processors (µPs) from the

pioneering 4-bit 4004 (1971) to the present high-end offerings, the 64-bit Itanium 2 (2002) and

Pentium 4F with EM64T(2004). Concise technical data are given for each product.

The 4-bit and 8-bit processors

Intel 4004: 1st single-chip µP

Introduced November 15, 1971

Clock speed 740 kHz

0.06 MIPS

Bus Width 4 bits (multiplexed address/data due to limited pins)

Number of Transistors 2,300 at 10 μm

Addressable Memory 640 bytes

Program Memory 4K bytes

World's first microprocessor

Used in Busicom calculator

Trivia: The original goal was to equal the clock speed of the IBM 1620; this was not quite

4040

Introduced 4th Qtr, 1974

Clock speed of 500 kHz to 740 kHz using 4 to 5.185 MHz crystals

0.06 MIPS

Addressable Memory 640 bytes

Program Memory 8K bytes

Interrupts

Enhanced version of 4004

8008

Introduced April 1, 1972

Clock speed 500 kHz (8008-1: 800 kHz)

0.05 MIPS

Addressable memory 16 kilobytes

Typical in dumb terminals, general calculators, bottling machines

Developed in tandem with 4004

Originally intended for use in the Datapoint 2200 terminal

8080

Clock speed 2MHz

0.64 MIPS

Bus Width 8 bits data, 16 bits address

Addressable memory 64 kilobytes

10X the performance of the 8008

Used in the Altair 8800, Traffic light controller, cruise missile

Required six support chips versus 20 for the 8008

8085

Introduced March 1976

Clock speed 5MHz

0.37 MIPS

Used in Toledo scale

High level of integration, operating for the first time on a single 5 volt power supply, from

12 volts previously

The 16-bit processors: Origin of x86

8086

Introduced June 8, 1978

Clock speeds:

5MHz with 0.33 MIPS

8MHz with 0.66MIPS

10MHz with 0.75 MIPS

Addressable memory 1 megabyte

10X the performance of 8080

Used in portable computing

Assembly language compatible with 8080

Used segment registers to access more than 64K of data at once, bane of programmers'

existence for years to come

8088

Clock speeds:

5MHz with 0.33 MIPS

8MHz with 0.75 MIPS

Internal architecture 16 bits

External bus Width 8 bits data, 20 bits address

Addressable memory 1 megabyte

Identical to 8086 except for its 8 bit external bus

Used in IBM PCs and PC clones

iAPX 432 (chronological entry)

Introduced January 1, 1981

Multi-chip CPU; Intel's first 32-bit microprocessor

See main entry

80186

Introduced 1982

Used mostly in embedded applications - controllers, point-of-sale systems, terminals, and

the like

Included two timers, a DMA controller, and an interrupt controller on the chip in addition to

the processor

Later renamed the iAPX 186

80188

A version of the 80186 with an 8-bit external data bus

Later renamed the iAPX 188

80286

Introduced February 1, 1982

Clock speeds:

6MHz with 0.9 MIPS

8MHz, 10MHz with 1.5 MIPS

12.5MHz with 2.66 MIPS

Bus Width 16 bits

Included memory protection hardware to support multitasking operating systems with per-

process address space

Number of Transistors 134,000 at 1.5 μm

Addressable memory 16 megabytes

Added protected-mode features to 8086 with essentially the same instruction set

3-6X the performance of the 8086

Widely used in PC clones at the time

32-bit processors: The non-x86 µPs

iAPX 432

Introduced January 1, 1981 as Intel's first 32-bit microprocessor

Object/capability architecture

Microcoded operating system primitives

One terabyte virtual address space

Hardware support for fault tolerance

Two-chip General Data Processor (GDP), consists of 43201 and 43202

43203 Interface Processor (IP) interfaces to I/O subsystem

43204 Bus Interface Unit (BIU) simplifies building multiprocessor systems

43205 Memory Control Unit (MCU)

Architecture and execution unit internal data paths 32 bit

Clock speeds:

80186, 80188, 80286, 80386(DX) (chronological entries)

Introduced 1981–1988

See main entries

i960 aka 80960

RISC-like 32-bit architecture

predominantly used in embedded systems

Evolved from the capability processor developed for the BiiN joint venture with Siemens

Many variants identified by two-letter suffixes.

80386SX (chronological entry)

See main entry

80376 (chronological entry)

See main entry

i860 aka 80860

Intel's first superscalar processor

RISC 32/64-bit architecture, with pipeline characteristics very visible to programmer

Used in Intel Paragon massively parallel supercomputer

XScale

Introduced August 23, 2000

32-bit RISC microprocessor based on the ARM architecture

Many variants, such as the PXA2xx applications processors, IOP3xx I/O processors and

IXP2xxx and IXP4xx network processors.

32-bit processors: The 80386 range

80386DX

Introduced October 17, 1985

Clock speeds:

16MHz with 5 to 6 MIPS

2/16/1987 20MHz with 6 to 7 MIPS

4/4/1988 25MHz with 8.5 MIPS

4/10/1989 33MHz with 11.4 MIPS (9.4 SPECint92 on Compaq/i 16K L2)

Bus Width 32 bits

Addressable memory 4 gigabytes

Virtual memory 64 terabytes

First x86 chip to handle 32-bit data sets

Reworked and expanded memory protection support including paged virtual memory and virtual-86 mode, features required by Windows 95 and OS/2 Warp

Used in Desktop computing

Can address enough memory to manage an eight-page history of every person on earth

Can scan the Encyclopædia Britannica in 12.5 seconds

80960 (i960) (chronological entry)

See main entry

80386SX

Clock speeds:

16MHz with 2.5 MIPS

1/25/1989 20MHz with 2.5 MIPS, 25MHz with 2.7 MIPS

10/26/1992 33MHz with 2.9 MIPS

External bus width 16 bits

Number of Transistors 275,000 at 1 μm

Virtual memory 256 gigabytes

16-bit external data bus enables low-cost 32-bit processing

Built in multitasking

Used in entry-level desktop and portable computing

80376

Introduced January 16, 1989; Discontinued June 15, 2001

Variant of 386 intended for embedded systems

No "real mode", starts up directly in "protected mode"

Replaced by much more successful 80386EX from 1994

80860 (i860) (chronological entry)

See main entry

80486DX (chronological entry)

See main entry

80386SL

Clock speeds:

20MHz with 4.21 MIPS

9/30/1991 25MHz with 5.3 MIPS

External bus width 16 bits

First chip specifically made for portable computers because of low power consumption of

Highly integrated, includes cache, bus, and memory controllers

80486SX/DX2/SL, Pentium, 80486DX4 (chronological entries)

Introduced 1991–1994

See main entries

Intel386 EX

Introduced August 1994

Variant of 80386SX intended for embedded systems

Static core, i.e. may run as slowly (and thus, power efficiently) as desired, down to full halt

On-chip peripherals:

clock and power mgmt

timers/counters

watchdog timer

serial I/O units (sync and async) and parallel I/O

RAM refresh

JTAG test logic

Significantly more successful than the 80376

Used aboard several orbiting satellites and microsatellites

Used in NASA's FlightLinux project

32-bit processors: The 80486 range

80486DX

Clock speeds:

25MHz with 20 MIPS (16.8 SPECint92, 7.40 SPECfp92)

5/7/1990 33MHz with 27 MIPS (22.4 SPECint92 on Micronics M4P 128k L2)

6/24/1991 50MHz with 41 MIPS (33.4 SPECint92, 14.5 SPECfp92 on Compaq/50L 256K

Bus Width 32 bits

Number of Transistors 1.2 million at 1 μm; the 50MHz was at 0.8 μm

Level 1 cache on chip

50X performance of the 8088

Used in Desktop computing and servers

80386SL (chronological entry)

See main entry

80486SX

Clock speeds:

9/16/1991 16MHz with 13 MIPS, 20MHz with 16.5 MIPS

9/16/1991 25MHz with 20 MIPS (12 SPECint92)

9/21/1992 33MHz with 27 MIPS (15.86 SPECint92)

Bus Width 32 bits

Number of Transistors 1.185 million at 1 μm and 900,000 at 0.8 μm

Identical in design to 486DX but without math coprocessor

Used in low-cost entry to 486 CPU desktop computing

Upgradable with the Intel OverDrive processor

80486DX2

Introduced March 3, 1992

Clock speeds:

50MHz with 41 MIPS (29.9 SPECint92, 14.2 SPECfp92 on Micronics M4P 256K L2)

8/10/1992 66 MHz with 54 MIPS (39.6 SPECint92, 18.8 SPECfp92 on Micronics M4P

256K L2)

Bus Width 32 bits

Number of Transistors 1.2 million at 0.8 μm

Used in high performance, low cost desktops

Uses "speed doubler" technology where the microprocessor core runs at twice the speed

of the bus

80486SL

Clock speeds:

20MHz with 15.4MIPS

25MHz with 19 MIPS

33MHz with 25 MIPS

Bus Width 32 bits

Used in notebook PCs

Pentium (chronological entry)

See main entry

80486DX4

Clock speeds:

75MHz with 53 MIPS (41.3 SPECint92, 20.1 SPECfp92 on Micronics M4P 256K L2)

100MHz with 70.7 MIPS (54.59 SPECint92, 26.91 SPECfp92 on Micronics M4P 256K L2)

Bus width 32 bits

Pin count 168 PGA Package, 208 SQFP Package

Die size 345 Square mm

Used in high performance entry-level desktops and value notebooks

32-bit processors: The Pentium ("I")

Pentium ("Classic")

P5 0.8 μm process technology

Bus width 64 bits

System bus speed 50 or 60 or 66 MHz

Address bus 32 bits

Number of transistors 3.1 million

Addressable Memory 4 gigabytes

Virtual Memory 64 terabytes

Socket 4 273 pin PGA processor package

Package dimensions 2.16" x 2.16"

Superscalar architecture brought 5X the performance of the 33MHz 486DX processor

Runs on 5 volts

Used in desktops

16KB of L1 cache

Variants

60 MHz with 100 MIPS (70.4 SPECint92, 55.1 SPECfp92 on Xpress 256K L2)

66 MHz with 112 MIPS (77.9 SPECint92, 63.6 SPECfp92 on Xpress 256K L2)

P54C 0.6 μm process technology

Socket 7 296/321 pin PGA package

75 MHz Introduced October 10, 1994

90 MHz Introduced March 7, 1994

P54C 0.35 μm process technology

90mm die size

120 MHz Introduced March, 1995

133 MHz Introduced June, 1995

150 MHz Introduced January 4, 1996

200 MHz Introduced June 10, 1996

80486DX4 (chronological entry)

See main entry

80386EX (Intel386 EX) (chronological entry)

Introduced August 1994

See main entry

Pentium Pro (chronological entry)

Introduced November 1995

See main entry

Pentium MMX

P55C 0.35 μm process technology

Intel MMX instructions

Socket 7 296/321 pin PGA (pin grid array) package

32KB L1 cache

System bus speed 66 MHz

Variants

166 MHz (Mobile) Introduced January 12, 1998

200 MHz (Mobile) Introduced September 8, 1997

32-bit processors: Pentium Pro, II, Celeron, III, M

Pentium Pro

0.6 μm process technology

Precursor to Pentium II and III

Socket 8 processor package (387 pins) (Dual SPGA)

Number of transistors 22 million

16KB L1 cache

256KB integrated L2 cache

60 MHz system bus speed

Variants

150 MHz Introduced November 1, 1995

0.35 μm process technology, or 0.35 μm CPU with 0.6 μm L2 cache

Number of transistors 36.5 million or 22 million

512KB or 256KB integrated L2 cache

60 or 66 MHz system bus speed

Variants

166 MHz (66 MHz bus speed, 512KB 0.35 μm cache) Introduced November 1, 1995

200 MHz (66 MHz bus speed, 1MB 0.35 μm cache) Introduced August 18, 1997

Pentium II

Introduced May 7, 1997

Klamath 0.35 μm process technology (233, 266, 300MHz)

Pentium Pro with MMX and improved 16-bit performance

242-pin Slot 1 (SEC) processor package

66MHz system bus speed

32KB L1 cache

512KB 1/2 speed external L2 cache

Variants

233 MHz Introduced May 7, 1997

Deschutes 0.25 μm process technology (333, 350, 400, 450MHz)

66MHz system bus speed (333MHz variant), 100MHz system bus speed for all models after

Variants

350 MHz Introduced April 15, 1998

450 MHz Introduced August 24, 1998

233 MHz (Mobile) Introduced April 2, 1998

333 MHz (Mobile)

Celeron (Pentium II-based)

Covington - 0.25 μm process technology

242-pin Slot 1 SEPP (Single Edge Processor Package), Socket 370 PPGA package

32KB L1 cache

No L2 cache

Variants

Mendocino - 0.25 μm process technology

242-pin Slot 1 SEPP (Single Edge Processor Package), Socket 370 PPGA package

32KB L1 cache

128KB integrated cache

Variants

300A MHz Introduced August 24, 1998

466 MHz

266 MHz (Mobile)

300 MHz (Mobile)

366 MHz (Mobile)

400 MHz (Mobile)

433 MHz (Mobile)

450 MHz (Mobile) Introduced February 14, 2000

466 MHz (Mobile)

500 MHz (Mobile) Introduced February 14, 2000

Pentium II Xeon (chronological entry)

See main entry

Pentium III

Katmai - 0.25 μm process technology

Improved PII, i.e. P6-based core, now including Streaming SIMD Extensions (SSE)

512KB 1/2 speed L2 External cache

242-pin Slot-1 SECC2 (Single Edge Contact cartridge 2) processor package

System Bus Speed 100 MHz

Variants

450 MHz Introduced February 26, 1999

500 MHz Introduced February 26, 1999

533 MHz Introduced (133MHz bus speed) September 27, 1999

600 MHz Introduced (133MHz bus speed) September 27, 1999

Coppermine - 0.18 μm process technology

256KB Advanced Transfer L2 Cache (Integrated)

242-pin Slot-1 SECC2 (Single Edge Contact cartridge 2) processor package, 370-pin FC-

PGA (Flip-chip pin grid array) package

System Bus Speed 100 MHz, 133 MHz (Those with 133 MHz bus carried a 'B' suffix in their

Variants

500 MHz (100MHz bus speed)

533 MHz

600 MHz

650 MHz (100MHz bus speed) Introduced October 25, 1999

700 MHz (100MHz bus speed) Introduced October 25, 1999

750 MHz (100MHz bus speed) Introduced December 20, 1999

800 MHz (100MHz bus speed) Introduced December 20, 1999

800 MHz Introduced December 20, 1999

850 MHz (100MHz bus speed) Introduced March 20, 2000

1000 MHz Introduced March 8, 2000 (Not widely available at time of release)

400 MHz (Mobile) Introduced October 25, 1999

750 MHz (Mobile) Introduced June 19, 2000

900 MHz (Mobile) Introduced March 19, 2001

Tualatin - 0.13 μm process technology

Introduced July 2001

32KB L1 cache

256KB or 512KB Advanced Transfer L2 cache (Integrated)

370-pin FC-PGA (Flip-chip pin grid array) package

Variants

1133 MHz (512KB L2)

1200 MHz

1266 MHz (512KB L2)

1333 MHz

1400 MHz (512KB L2)

Pentium II and III Xeon

PII Xeon

Variants

450 MHz (512 KB L2 Cache) Introduced October 6, 1998

450 MHz (1 MB and 2 MB L2 Cache) Introduced January 5, 1999

PIII Xeon

Number of transistors: 9.5 million at 0.25 μm or 28 million at 0.18 μm

L2 cache is 256KB, 1MB, or 2MB Advanced Transfer Cache (Integrated)

Processor Package Style is Single Edge Contact Cartridge (S.E.C.C.2) or SC330

System Bus Speed 133 MHz (256KB L2 cache) or 100 MHz (1-2MB L2 cache)

System Bus Width 64 bit

Used in two-way servers and workstations (256KB L2) or 4- and 8-way servers (1-2MB L2)

Variants

500 MHz (0.25 μm process) Introduced March 17, 1999

550 MHz (0.25 μm process) Introduced August 23, 1999

600 MHz (0.18 μm process, 256KB L2 cache) Introduced October 25, 1999

800 MHz (0.18 μm process, 256KB L2 cache) Introduced January 12, 2000

866 MHz (0.18 μm process, 256KB L2 cache) Introduced April 10, 2000

933 MHz (0.18 μm process, 256KB L2 cache)

1000 MHz (0.18 μm process, 256KB L2 cache) Introduced August 22, 2000

700 MHz (0.18 μm process, 1-2MB L2 cache) Introduced May 22, 2000

Celeron (Pentium III Coppermine-based)

Introduced March 2000

Coppermine-128 - 0.18 μm process technology

Streaming SIMD Extensions (SSE)

Socket 370

PPGA processor package

66 MHz system bus speed; 100 MHz system bus speed from January 3, 2001

32KB L1 cache

128KB Advanced Transfer L2 cache

Variants

533 MHz

566 MHz

800 MHz

850 MHz Introduced April 9, 2001

900 MHz Introduced July 2, 2001

550 MHz (Mobile)

800 MHz (Mobile)

850 MHz (Mobile) Introduced July 2, 2001

600 MHz (LV Mobile)

500 MHz (ULV Mobile) Introduced January 30, 2001

600 MHz (ULV Mobile)

XScale (chronological entry)

See main entry

Pentium 4 (not 4EE, 4E, 4F), Itanium, P4-based Xeon, Itanium 2 (chronological entries)

Introduced April 2000 – July 2002

See main entries

Celeron (Pentium III Tualatin-based)

Tualatin Celeron - 0.13 μm process technology

32KB L1 cache

256KB Advanced Transfer L2 cache

Variants

1.0 GHz

1.1 GHz

1.2 GHz

1.3 GHz

1.4 GHz

Pentium M

Banias 0.13 μm process technology

64KB L1 cache

1MB L2 cache (integrated)

Based on Pentium III core, with SIMD SSE2 instructions and deeper pipeline

Micro-FCPGA, Micro-FCBGA processor package

Heart of the Intel mobile "Centrino" system

400 MHz Netburst-style system bus.

Variants

900MHz (Ultra low voltage)

1.0 GHz (Ultra low voltage)

1.1 GHz (Low voltage)

1.3 GHz

1.4 GHz

1.5 GHz

1.6 GHz

1.7 GHz

Dothan 0.09 μm (90nm) process technology

Introduced May 2004

2MB L2 cache

Revised data prefetch unit

Variants

1.5 GHz

1.6 GHz

1.7 GHz

1.8 GHz

1.9 GHz

2.0 GHz

2.1 GHz

2.2 GHz (To arrive in Q3 2005)

Yonah 0.065 μm (65nm) process technology

To be introduced 2006

Dual Core variants with 2MB Shared L2 cache

Variants

None yet announced

Celeron M

Banias-512 0.13 μm process technology

64KB L1 cache

512KB L2 cache (integrated)

No SpeedStep technology, is not part of the 'Centrino' package

32-bit processors: Pentium 4 range

Pentium 4

0.18 μm process technology (1.40 and 1.50 GHz)

L2 cache was 256KB Advanced Transfer Cache (Integrated)

Processor Package Style was PGA423, PGA478

System Bus Speed 400 MHz

SSE2 SIMD Extensions

Number of Transistors 42 million

Used in desktops and entry-level workstations

0.18 μm process technology (1.7 GHz)

See the 1.4 and 1.5 chips for details

0.18 μm process technology (1.6 and 1.8 GHz)

Introduced July 2, 2001

See 1.4 and 1.5 chips for details

Core Voltage is 1.15 volts in Maximum Performance Mode; 1.05 volts in Battery Optimized Mode

Power <1 watt in Battery Optimized Mode

Used in full-size and then light mobile PCs

0.18 μm process technology "Willamette" (1.9 and 2.0 GHz)

See 1.4 and 1.5 chips for details

Pentium 4 (2 GHz, 2.20 GHz)

Pentium 4 (2.4 GHz)

0.13 μm process technology "Northwood A" (1.7, 1.8, 1.9, 2, 2.2, 2.4, 2.5, 2.6 GHz)

Improved branch prediction and other microcode tweaks

512KB integrated L2 cache

400 MHz system bus.

0.13 μm process technology "Northwood B" (2.26, 2.4, 2.53, 2.66, 2.8, 3.06 GHz)

533 MHz system bus (the 3.06 GHz model includes Intel's Hyper-Threading Technology)

0.13 μm process technology "Northwood C" (2.4, 2.6, 2.8, 3.0, 3.2, 3.4 GHz)

800MHz system bus (all versions include Hyper-Threading)

6500 to 10000 MIPS
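The 400, 533 and 800 MHz bus speeds quoted for the Pentium 4 above are effective transfer rates. As a rough sketch, assuming the bus carries four transfers per base clock (quad-pumped, a detail the list itself does not state), the base clock and the core clock multiplier can be estimated as follows. The function names are illustrative only.

```python
def base_clock_mhz(rated_bus_mhz: float, transfers_per_clock: int = 4) -> float:
    """Base bus clock, assuming a quad-pumped bus (assumption, see lead-in)."""
    return rated_bus_mhz / transfers_per_clock


def core_multiplier(core_mhz: float, rated_bus_mhz: float) -> int:
    """Nearest whole multiplier relating the core clock to the base bus clock."""
    return round(core_mhz / base_clock_mhz(rated_bus_mhz))


# (core clock in MHz, rated bus speed in MHz), taken from the Northwood entries.
for core, bus in ((2400, 400), (3060, 533), (3200, 800)):
    print(f"{core} MHz core on {bus} MHz bus: "
          f"base {base_clock_mhz(bus):.2f} MHz, multiplier x{core_multiplier(core, bus)}")
```

The marketing figures are rounded (533 MHz is nominally 4 x 133.33 MHz), so the multiplier is rounded to the nearest integer rather than computed exactly.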

Itanium (chronological entry)

Introduced 2001

See main entry

Official designation now Xeon, i.e. not "Pentium 4 Xeon"

Xeon 1.4, 1.5, 1.7 GHz

Introduced May 21, 2001

L2 cache was 256KB Advanced Transfer Cache (Integrated)

Processor Package Style was Organic Land Grid Array 603 (OLGA 603)

System Bus Speed 400MHz

SSE2 SIMD Extensions

Used in high-performance and mid-range dual processor enabled workstations

Xeon 2.0 GHz

Introduced September 25, 2001

Itanium 2 (chronological entry)

Introduced July 2002

See main entry

Pentium 4EE

Introduced September 2003

EE = "Extreme Edition"

same as Pentium 4 Processor, but with 2MB onboard L3 Cache

Pentium 4E

Introduced February 2004

built on 0.09 μm (90 nm) process technology "Prescott" (2.4A, 2.8, 2.8A, 3.0, 3.2, 3.4 GHz)

1MB L2 cache

533MHz system bus (2.4A and 2.8A only)

800MHz system bus (all other models)

Hyper-Threading support is only available on CPUs using the 800MHz system bus.

The processor's integer instruction pipeline has been increased from 20 stages to 31 stages, which theoretically allows for even greater clock speeds.

7500 to 11000 MIPS

Pentium 4F

Introduced Spring 2004

same core as 4E, "Prescott"

3.2–3.6 GHz

starting with the D0 stepping of this processor, EM64T 64-bit extensions have also been incorporated

Pentium D

Introduced Q2 2005

"Smithfield" dual-core version

2.8–3.2 GHz

1MB+1MB L2 cache (non-shared, 2MB total)

800MHz system-bus

No Hyper-Threading; performance increase of 60% over a similarly clocked Prescott

Cache-coherency between cores requires communication over the 800MHz FSB

The 64-bit processors: Itanium & ...

Itanium

Released May 29, 2001

733 MHz and 800 MHz

Itanium 2

Released July 2002

900 MHz and 1 GHz

Pentium M (chronological entry)

See main entry

Pentium 4EE, 4E (chronological entries)

Introduced September 2003, February 2004, respectively

See main entries

Intel® Extended Memory 64 Technology

Introduced Spring 2004, with the Pentium 4F (D0 and later P4 steppings)

64-bit architecture extension for the x86 range; near clone of AMD64
