intel 3000 and am 2900 microprocessors — a comparison

A microprocessor's architecture is a major element in determining the difficulty of programming. Professor Andrew Colin compares two widely used bit-slice systems.

Writing microprograms is notoriously hard. In practice, a major element in determining the difficulty of microprogramming is the architecture of the underlying processor. This paper compares the properties of two widely used bit-slice microprocessor systems - the Intel 3000 and AM 2900 series - as vehicles for microprograms. It begins with a general introduction to microprogrammable microprocessors and goes on to discuss in some detail the ways in which the various facilities needed for microprogramming are implemented on each of the two systems. Being impartial and based on experience, it brings out certain aspects of the systems which cannot easily be deduced from the manufacturers' literature alone.

The Intel 3000 and AM 2900 are bipolar bit-slice microprocessor systems. Each consists of a 'kit' of related parts which can be used to assemble a computer or other digital machine quickly and flexibly, using only a few components.

In terms of cost and cycle time, the two systems are so similar that designers may have some difficulty in choosing between them. This paper gives an overall description of the two systems and of the context in which they might be used, and goes on to compare some of their less obvious characteristics such as ease of programming and documentation. In practice, these factors are likely to have more influence than the simple hardware parameters on the cost and performance of the final product. Unfortunately the paper has no comments to offer on the microcomputer development systems - software and in-circuit emulators - which are offered by the manufacturers. While it is accepted that these systems are important and useful, it was not possible for me to gain direct experience of using them. The work which led to this paper was done with 'home-grown' assemblers and simulators.

Department of Computer Science, University of Strathclyde, Livingstone Tower, 26 Richmond Street, Glasgow G1 1XH, UK. The views expressed in this paper are those of the author alone.

From the hardware engineer's point of view both systems are close to being ideal. They require only one 5 V power supply and can be directly interfaced to standard transistor-transistor logic (TTL) circuitry with modest loading and a generous fan-out allowance.

Both the 3000 and the 2900 are designed for use in microprogrammed machines. The reader meeting bit-slice microprocessors for the first time may have considerable trouble - as I did - in understanding the exact context for which they are intended. The extended example which follows is meant to help with this initial difficulty.

SMALL MICROPROGRAMMED COMPUTER

A typical layout for a small microprogrammed general-purpose computer is shown in Figure 1. The running of this machine is best thought of in terms of 'major' and 'minor' cycles.

Figure 1. Overall structure of a simple microprogrammable computer (legible labels: next microaddress; control lines to memory, peripherals etc.; address buffer; random access memory; address bus; bidirectional data bus)

Figure 2. Microinstruction format (about 40-50 bits; legible fields: sequence function, condition selection, arithmetic function, data bus source, data bus destination, main memory control, peripheral control)

vol 1 no 5 june 77 287

Each minor cycle corresponds to a single clock pulse and would normally take some 100-200 ns. The microprogram address stored in the microsequence controller is used to select a microinstruction from the microprogram store. This instruction is read out and placed in the microinstruction register, where it serves to control the action of the system for the rest of the minor cycle.

The fields in the microinstruction are divided into two groups - those which deal with data transfers during the current cycle and those which control the microsequencing. One possible layout for a microinstruction is shown in Figure 2.

The data-transfer fields in this example look after a number of different components including

• a bidirectional data bus, specifying its source and destination(s) for the current cycle

• an arithmetic unit with registers and adder/subtractor circuits, specifying the required function, the source and destination of the operands, and details such as the initial carry

• a random-access memory (RAM) which forms the main memory of the computer system

• several peripheral devices, such as asynchronous lines, A/D convertors, lights, switches and so on

• various special-purpose registers, such as the 'address buffer' and 'macrocontrol register'

The data paths in all these components are typically 16 bits.

The microsequence controller is responsible for selecting and setting up the address of the next microinstruction to be obeyed. Its primary control comes from the microinstruction register in the form of a 'sequence function'. Some of these functions determine the address entirely on their own; others, which are essentially 'conditional jumps', use the output of a 'sequence logic' unit. This output can be either a single binary condition or a complete new microaddress to jump to. The sequence logic unit is controlled by the appropriate fields from the microinstruction register and takes its raw input from various parts of the system. These include condition codes in the arithmetic unit, the bits in the macrocontrol register, a ready/busy line from the main memory or interrupt lines from the peripherals.

Sometimes the fetching of one microinstruction is overlapped with the execution of the one before it. This speeds up the entire cycle and causes no special problems except that any condition which controls the microsequence must be ready one minor cycle earlier than otherwise. The arrangement is sometimes called 'pipelining'.

The aim of the entire system is to emulate, or behave like, a specified 'target' computer. Every major cycle corresponds to the fetching and execution of a single instruction in the target computer. This will always take several minor cycles.

It is assumed that the word size of the target computer is the same as the width of the data paths in these various components. The accumulator, stack pointer, program counter and other 'central registers' of the target computer are mapped on the registers within the arithmetic unit. The sequence of minor cycles within a major cycle might be as follows

Fetch next instruction

• Send the contents of the register which maps the program counter to the address buffer

• Carry out a memory-read cycle; send the output of the memory to the macrocontrol register

• Increment the register which maps the program counter

• Discriminate on the function bits held in the macrocontrol register (which specify an operation like 'add', 'store' or 'jump' for the target computer) and transfer control to the appropriate sequence of microinstructions

From here, a different sequence would be followed for each different function in the instruction repertoire of the target machine. Two examples are

Add

• Move the address part of the current instruction to the address buffer

• Do a memory-read cycle. Add the memory output to the register which maps the accumulator. Jump to first microinstruction for the next major cycle

Jump if accumulator = 0

• Add constant 0 to the register used to map the accumulator (This changes nothing but sets the '=0' condition code in the arithmetic unit to a suitable value.)

• If the '≠0' indicator is set, return to first microinstruction for the next major cycle. Otherwise

• Move the address part of the current instruction to the register which maps the program counter. Return to first microinstruction for the next major cycle
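The fetch, 'add' and 'jump if accumulator = 0' sequences above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the 16-bit word with a 4-bit function code and a 12-bit address, the function codes themselves and all the names are invented here for illustration and come from neither manufacturer. Each statement that would occupy one microinstruction bumps a minor-cycle counter.

```python
# Toy model of the major/minor cycle structure described in the text.
# ASSUMPTIONS (not from the article): 16-bit words, top 4 bits = function
# code, low 12 bits = address; the codes HALT, ADD, JUMP_IF_ZERO are made up.
HALT, ADD, JUMP_IF_ZERO = 0, 1, 2
WORD_MASK = 0xFFFF
ADDRESS_MASK = 0x0FFF


def run(memory, max_major_cycles=100):
    """Emulate the target computer; return (accumulator, minor cycles used)."""
    pc = acc = 0
    minor = 0
    for _ in range(max_major_cycles):
        # Fetch next instruction
        address_buffer = pc                      # program counter -> address buffer
        minor += 1
        macrocontrol = memory[address_buffer]    # memory read -> macrocontrol register
        minor += 1
        pc = (pc + 1) & ADDRESS_MASK             # increment program counter
        minor += 1
        function = macrocontrol >> 12            # discriminate on the function bits
        address = macrocontrol & ADDRESS_MASK
        minor += 1

        if function == ADD:
            address_buffer = address             # address part -> address buffer
            minor += 1
            acc = (acc + memory[address_buffer]) & WORD_MASK
            minor += 1
        elif function == JUMP_IF_ZERO:
            is_zero = ((acc + 0) & WORD_MASK) == 0   # add 0 to set the condition code
            minor += 1
            minor += 1                           # test the condition
            if is_zero:
                pc = address                     # address part -> program counter
                minor += 1
        else:                                    # HALT (hypothetical) ends the run
            break
    return acc, minor
```

For example, a program consisting of 'add the word at address 3' followed by a halt leaves 5 in the accumulator when address 3 holds 5, having used four minor cycles for each fetch and two more for the add.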

In practice, the speed of the machine is bounded by the cycle time of the main store, so that an 'emulated' system which uses a microprogram is just as fast as one where the functions of the target computer are decoded and executed by hardware. Microprogramming has three important advantages

• The hardware is simple. All the difficult parts of the design are to be found in the microinstructions

288 microprocessors

Figure 3. 2909 sequencer (legible labels: push/pop, file enable, register enable, address register, stack pointer, direct inputs D, S0, S1, OR0-OR3, multiplexer, output control, outputs Y0-Y3, microprogram counter register, clock, Cn+4)

• By changing the microinstructions, the same hardware can easily be switched to emulate any desired architecture over a wide range

• Different sets of hardware (having, for example, differing numbers of registers in the arithmetic unit or data paths of varying widths) can be microprogrammed to emulate the same target computer. Thus a manufacturer can offer a whole line of machines entirely compatible with one another as far as software is concerned, and with a wide range of different speeds and prices

The 3000 and the 2900 both contain components which can be used to construct systems like that in Figure 1. Some of the components (like registers or read-only memories (ROMs)) are not specialized and can easily be substituted by slightly different types. The most complex and individual parts are shown in Table 1.

Table 1. 3000 and 2900 components

Intel   AM      Purpose
3001    2909    Microsequence controller
3002    2901    Arithmetic unit (with registers)
3003    2902    Fast carry propagation unit

These components will now be described in some detail.

MICROSEQUENCE CONTROLLERS

The 3001 and the 2909 are both chips which set out to do everything necessary for sequence control. Each contains a microinstruction address register which can be connected directly to the address lines in the microprogram ROM. In both cases the sequencing is determined by inputs from the current microinstruction and, where appropriate, from the sequence control logic as well. However, the methods of sequencing are entirely different.

The 2909 is a 4-bit slice. This means that all the main registers are 4 bits long. A 2909 on its own could therefore generate only 2^4 or 16 different microaddresses, but the units can be cascaded so that two can control a 256-word microprogram store, three a ROM with 4096 words and so on.
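The arithmetic behind these figures is simply that n cascaded slices give 4n address bits. A minimal sketch (the function name is my own):

```python
def microaddress_space(num_slices, bits_per_slice=4):
    """Number of distinct microaddresses from cascaded sequencer slices."""
    return 2 ** (bits_per_slice * num_slices)


for n in (1, 2, 3):
    print(n, "slice(s):", microaddress_space(n), "words")
```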

The main microprogram counter register is backed up by several auxiliary registers, as shown in Figure 3. There is a separate 'address register' and a 4-element stack with its own pointer.

For any cycle, the next address can be determined in four different ways

• It can be the current address, incremented by unity. This mode is convenient for obeying simple sequences of microinstructions without jumps or branches, and in practice it is by far the most frequently used

• The next address can be taken from a set of direct inputs driven by the sequence control logic. This is the main way of organizing microprogram jumps to predetermined addresses. Whenever such a jump is made it is possible to put the current address, incremented by unity, on to the stack. This corresponds to a subroutine entry

• The next address can be taken from the top of the stack, the current address being discarded. This is useful for jumping out of subroutines

• The next address is taken from the 'address register'. This mode could be used to jump back to the beginning of a loop. Experience suggests that this mode is not as useful as the other three
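The four ways of choosing the next address can be summarized in a small behavioural model. This is a sketch of the description above and not a pin-accurate model of the 2909: the class, the method names, the string-valued 'source' argument and the decision to raise an error on stack overflow are all my own assumptions.

```python
class Sequencer:
    """Behavioural sketch of a 2909-style sequencer (not pin-accurate)."""

    def __init__(self, width_bits=12, stack_depth=4):
        self.mask = (1 << width_bits) - 1
        self.current = 0              # address of the microinstruction being obeyed
        self.address_register = 0
        self.stack = []
        self.stack_depth = stack_depth

    def _incremented(self):
        return (self.current + 1) & self.mask

    def step(self, source, direct=0, push=False):
        """Choose the next address.

        source is 'next', 'direct', 'stack' or 'register'; push=True saves
        the incremented current address first (subroutine entry).
        """
        if push:
            if len(self.stack) == self.stack_depth:
                # Hypothetical behaviour: the real device's response may differ.
                raise OverflowError("stack holds only %d entries" % self.stack_depth)
            self.stack.append(self._incremented())
        if source == "next":
            nxt = self._incremented()
        elif source == "direct":
            nxt = direct & self.mask
        elif source == "stack":
            nxt = self.stack.pop()    # subroutine exit
        elif source == "register":
            nxt = self.address_register
        else:
            raise ValueError("unknown source: %r" % (source,))
        self.current = nxt
        return nxt
```

A subroutine call and return then look like this: `step("direct", direct=0x40, push=True)` enters the subroutine at 0x40 and remembers the return point, and a later `step("stack")` resumes at the saved address.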

Ancillary control lines to the 2909 offer many other facilities

• The stack can be pushed or popped at any cycle

• The address register can be loaded from an external source

• A 'zero' address can be forced to start the system up

• The same microinstruction can be repeated over and over again until stopped by an external condition. This greatly speeds up such data operations as repeated shifting or binary multiplication

• The current address can be 'or'ed with an external input. If this input is made to vary according to some condition, the mechanism provides a way of implementing conditional skips and jumps. In practice it is quite enough to control only the least significant bit of the microaddress
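The 'or' mechanism in the last item amounts to a two-way branch between a pair of adjacent addresses. The sketch below assumes, as one natural way of using it, that the two possible successors are placed at an even address and the odd address that follows; the function name is my own.

```python
def branch_target(base_address, condition):
    """OR a one-bit condition into the least significant bit of the address.

    ASSUMPTION: the two possible successor microinstructions sit at an
    even address and at the odd address immediately after it.
    """
    if base_address & 1:
        raise ValueError("base address must be even so that bit 0 is free")
    return base_address | (1 if condition else 0)
```

With the pair placed at 0x20 and 0x21, a false condition continues at 0x20 and a true one at 0x21.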

The 3001 is illustrated in Figure 4. This device is broader than the 2909 (9 bits which can address microprograms of up to 512 words) but cannot be cascaded without special arrangements. A key concept in the 3001 is the arrangement of the 512-word address space into 16 columns of 32 rows. Each 9-bit address is subdivided into a column number and a row number, as shown in Figure 5.
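The row and column split can be written down directly. In this sketch I assume that the row number occupies the upper five bits of the 9-bit address and the column number the lower four; the function names are my own.

```python
ROWS, COLUMNS = 32, 16        # 32 * 16 = 512 words


def split(address):
    """Return (row, column), ASSUMING the row is the upper 5 bits."""
    if not 0 <= address < ROWS * COLUMNS:
        raise ValueError("address out of range")
    return address >> 4, address & 0xF


def join(row, column):
    """Rebuild the 9-bit address from a row and a column number."""
    return (row << 4) | column
```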

The address-sequencing mechanism provides a number of different functions. One of them allows you to jump to any address in the same row as the current instruction; another lets you go to any address in the same column and a third, to any address in row 0. Other functions are used for conditional jumps, but, as we shall see, they all have similar restrictions upon them. There is no incrementer as in the 2909, and no sequence function which simply says, 'Take the next instruction'.
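To see how few locations are within reach, the three unconditional functions just described can be taken at face value and the reachable addresses counted. This is a sketch of the description in the text only, with names of my own choosing; the device itself imposes further conditions which are not modelled.

```python
ROWS, COLUMNS = 32, 16


def unconditional_targets(row, column):
    """Addresses reachable in one step by the three functions in the text."""
    same_row = {(row, c) for c in range(COLUMNS)}
    same_column = {(r, column) for r in range(ROWS)}
    row_zero = {(0, c) for c in range(COLUMNS)}
    return same_row | same_column | row_zero


print(len(unconditional_targets(5, 7)), "of", ROWS * COLUMNS)
```

From a location outside row 0 only 62 of the 512 addresses can be reached directly, and from a location in row 0 only 47.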

The 3001 incorporates a small number of additional registers and latches. Three single-bit registers called C, Z and F are used to store external conditions and a 4-bit register known as the 'PR latch' serves to store quantities (such as the current function code of the target machine) which might be needed for controlling microprogram sequencing.

Figure 4. 3001 block diagram (legible labels: interrupt strobe enable ISE, address control function AC0-AC6, load, enable, CLK, GND, Vcc, next address logic, microprogram memory row address and column address outputs MA8-MA0, output buffers, flag logic input, flag logic control FC0-FC3, flag output, primary instruction bus PX4-PX7, secondary instruction bus SX0-SX3, PR latch outputs, MCU output enable)

These registers are brought into the sequence control in various ways. For example the bits in the PR latch and the condition registers can be used as part of the next microaddress. When these bits are used, other bits in the address are forced to particular values. For example, there is a function which takes the lower two bits of the PR latch as digits A7 and A8 of the address; but this function always forces bits A5 and A6 to be ones. The result is that the destination address must lie in one of columns 12 to 15.
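The column numbers quoted follow from the forced bits. The sketch below assumes that A5 to A8 are the four column bits with A5 the most significant, which is what the figures 12 to 15 imply; the function name is my own.

```python
def column_from_pr(pr_latch):
    """Column selected when the top two column bits are forced to 1
    and the bottom two are copied from the low two bits of the PR latch."""
    return 0b1100 | (pr_latch & 0b11)


columns = sorted({column_from_pr(pr) for pr in range(16)})
print(columns)   # [12, 13, 14, 15]
```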

The restrictions which surround the condition codes are even more stringent; all successful jumps must end in columns 3 or 11 and all unsuccessful ones in columns 2 or 10.

These rules have the effect of making programming difficult. An instruction taken from any given location can only transfer control to a small subset of the other locations. If the transfer is conditional the selection available is even smaller.

All this means that allocating microinstructions to ROM addresses is no longer a simple clerical task which can be left to a microassembler. It becomes a major design problem of about the same difficulty as finding a good layout of components on a PC board.

One method tried relied on marking out a large (8 ft x 4 in; 2438.4 mm x 101.6 mm) fibre board. Drawing pins were used to place the instructions and move them about until the arrangement seemed satisfactory. It was found that the use of the board was very uneven; row 0 (which can be reached from any part of the store) filled up almost immediately and columns 2, 3, 10 and 11, being the only possible destinations for conditional jumps, were used up far more rapidly than the others. In general, it would probably be difficult to place more than about 300 instructions into the ROM without running out of space in the 'special' rows and columns.

Figure 5. Microprogram address space on 3001 (a grid of 32 rows by 16 columns)

Figure 6. Diagram of the 3002 (legible labels: main memory address out, data out, look-ahead carry outputs, arithmetic/logic section, carry in CI, carry out CO, left in LI, right out, CLK, registers, memory data in M0-M1, external device in I0-I1, mask in K0-K1)

There are reported attempts to automate the placing of instructions in the ROM, but apparently they only work well when the store is less than half full. This confirms my subjective impression.

The device has several unusual points in its design. Thus it is possible to supply an external address, but only of 8 bits: the ninth is always forced to zero. Even more strangely three of the bits of the PR latch are brought to outputs, but the fourth is not.

The comparison between the 3001 and 2909 is interesting. The 2909 design takes into account many of the basic tenets of good computer architecture, such as sequential access to instructions, uniform treatment of the address space, extensibility and the need for subroutines at several levels. These factors make it easy and straightforward to use.

On the other hand, the 3001 pays scant attention to these ideas. It is difficult to arrange for subroutines of even one level, and allocating instructions to the store is a major problem due solely to the inherent design of the device.

ARITHMETIC UNIT COMPONENTS

Next we move to the arithmetic unit components. The 3002 is a 2-bit slice and contains an adder, 13 2-bit registers, and various multiplexers and buffers. The units can be cascaded to build a complete arithmetic unit with any even number of bits per word.

For systems where speed is not critical, the 'carry out' of one unit is connected to the 'carry in' of the next, so that the total propagation time is determined by the sum of the individual carry propagation times. For a faster system the 3003 chip can be used with up to eight 3002s as a carry lookahead unit.
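The slow arrangement, in which each carry out feeds the next carry in, can be sketched functionally as follows. The model shows only the data flow through eight 2-bit slices; it says nothing about timing, and the function names are my own.

```python
def slice_add(a, b, carry_in, width=2):
    """One slice: add two `width`-bit fields, return (sum_bits, carry_out)."""
    total = a + b + carry_in
    return total & ((1 << width) - 1), total >> width


def ripple_add(x, y, slices=8, width=2, carry_in=0):
    """Add two words by chaining slices, least significant slice first."""
    mask = (1 << width) - 1
    result, carry = 0, carry_in
    for i in range(slices):
        shift = i * width
        s, carry = slice_add((x >> shift) & mask, (y >> shift) & mask, carry, width)
        result |= s << shift
    return result, carry
```

Because each slice must wait for the carry from the one below it, the delay grows with the number of slices; a lookahead unit removes that dependence by computing the carries in parallel.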

A diagram of the 3002 is shown in Figure 6. There are three interesting points

• It has two distinct output buffers. One of them is specifically intended to hold a memory address, so that on a machine which uses this chip a separate address buffer is not needed

• It has three distinct data input buses - M, K and I. In principle two of these inputs and both outputs can be used in the same cycle, so that the system can apparently sustain a high data rate

• The unit has about 40 different functions to control the arithmetic circuits and the multiplexers which determine the inputs to them

These features are superficially admirable, and it is not until the details of the design are examined that points of difficulty begin to emerge. Nearly all the instructions do very complicated things. A representative example is

CI ∨ (Rn ∧ AC ∧ K) → CO,   Rn ∧ (AC ∧ K) → Rn

where CI is the carry in, Rn is a scratchpad register, AC is the accumulator, K is the K-input bus and CO is the carry out.
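To make the example instruction concrete, here is one reading of it for a single 2-bit slice. I assume that the parenthesised term counts as 1 whenever any bit of the ANDed result is set; that interpretation, like the function name, is mine and should be checked against the data sheet.

```python
def and_into_register(rn, ac, k, ci, width=2):
    """Return (new Rn, CO) for one slice.

    new Rn = Rn AND (AC AND K)
    CO     = CI OR (any bit of the new Rn set)   <- ASSUMED reading
    """
    mask = (1 << width) - 1
    result = rn & ac & k & mask
    co = 1 if (ci or result != 0) else 0
    return result, co
```

Even in this small model the instruction does two things at once: it masks the register and it reports, through the carry line, whether anything survived the mask.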

Some of the instructions reduce to simple operations as special cases, but there are several important elementary functions missing. Thus the unit contains a number of 'scratchpad' registers and an 'accumulator'. Every instruction can select a register and every instruction can specify a constant to be applied to the K-inputs of the unit.

Figure 7. Diagram of the 2901 (legible labels: microinstruction decode with destination control, ALU function and ALU source fields; clock; 16 addressable registers (RAM) with 'A' and 'B' read addresses and data outputs; 'Q' register; ALU data source selector; 8-function ALU; carry in; output data selection; output enable; Cn+4; F3 (sign); overflow; F = 0000)

There is, however, no instruction which adds such a constant to a given register: it must be done through the accumulator and takes two microcycles. Subtraction cannot be done except by complementing and adding. Furthermore it is impossible to add the contents of a register to the accumulator without destroying the contents of the register as well. These common and important operations have to be programmed in awkward roundabout ways.
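Subtraction by complementing and adding relies on the usual two's-complement identity a - b = a + (NOT b) + 1, taken modulo the word size. The sketch below shows the identity only; it is not the 3002's microcode, and the idea that the extra 1 would be supplied as an initial carry is my assumption.

```python
def subtract_via_complement(a, b, width=16):
    """Compute (a - b) mod 2**width using only complement and add."""
    mask = (1 << width) - 1
    complemented = ~b & mask              # first step: complement the operand
    return (a + complemented + 1) & mask  # second step: add, with a carry in of 1
```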

The 2901 is also a bit-slice, but it encompasses 4 bits rather than two, and provides 17 internal registers. The carry lookahead function, if needed, is provided by the 2902.

A diagram of the 2901 is given in Figure 7. Superficially it is much less ambitious than the 3002. There is only one data path for input and one for output. The arithmetic unit has only eight functions, each of which involves two operands and one result. Some additional flexibility is given by the possibility of selecting the operands from amongst the internal registers or the data input, and of disposing of the result to (possibly) a different internal register or the data output, or both.
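The 'two operands, one result' pattern can be sketched for a single 4-bit slice as below. The eight operations listed, their names and the calling convention are my own illustrative choices and are not taken from the data sheet; carry handling is deliberately left out.

```python
MASK = 0xF  # one 4-bit slice

# ASSUMED set of eight two-operand functions, for illustration only.
FUNCTIONS = {
    "add":   lambda r, s: r + s,
    "subr":  lambda r, s: s - r,
    "subs":  lambda r, s: r - s,
    "or":    lambda r, s: r | s,
    "and":   lambda r, s: r & s,
    "notrs": lambda r, s: ~r & s,
    "xor":   lambda r, s: r ^ s,
    "xnor":  lambda r, s: ~(r ^ s),
}


def alu_step(registers, func, r_src, s_src, dest=None, data_in=0):
    """Apply one function to two operands and return the data output.

    r_src and s_src are register numbers, or the string 'D' for the data
    input. If dest is given, the result is also written to that register.
    """
    def fetch(src):
        return data_in if src == "D" else registers[src]

    result = FUNCTIONS[func](fetch(r_src), fetch(s_src)) & MASK
    if dest is not None:
        registers[dest] = result
    return result
```

In this model, adding a constant from the data input to a register is a single step, `alu_step(regs, "add", "D", 3, dest=3, data_in=2)`, and the other operand is left undisturbed unless it is also named as the destination.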

The most immediate result of this simple design is that the device is easy to understand. The act of programming soon shows that the functions are extremely well chosen, turning out to be precisely the ones needed to write compact, effective code.

The same emulator was microprogrammed for both the 3000 and 2900 systems, writing the best program possible in both cases. With fair consistency, most functions needed almost twice as many microinstructions on the 3000. This was a direct result of the dearth of suitable functions on the 3002 arithmetic unit.

DOCUMENTATION

Lastly it is worth commenting on the documentation available for the two systems. The view must necessarily be biased because the 3000 description was read first.

The greatest difficulty in studying the Intel 3000 was the absence of a background paper to introduce the idea of a bit-slice and to explain its context of use. The chip specifications had to be read about six times before it could be understood what was being said, but once a general picture had been formed it was not hard to extract the details necessary for an actual design.

By contrast the 2900 literature was found very easy to absorb. The AM Corporation clearly attach a high value to good documentation. The bald component specifications are backed up with clear and well written background papers and a 'training kit' with an extensive manual is available. The only curious feature of all this writing is a tendency to confuse left and right: a shift towards the least significant bit of a word is called a 'left shift'.
