ASIC & FPGA
Semi Custom Integrated Circuit Design
Introduction :
The semiconductor industry has evolved from the first ICs of the early 1970s and has grown
rapidly to its present state. The first small-scale integration (SSI) ICs contained a few (1 to 10)
logic gates (NAND gates, NOR gates, and so on), amounting to a few tens of transistors. The era
of medium-scale integration (MSI) increased the range of integrated logic available to counters
and similar, larger-scale logic functions. The era of large-scale integration (LSI) packed even
larger logic functions, such as the first microprocessors, into a single chip. The era of very
large-scale integration (VLSI) now offers 64-bit microprocessors, complete with cache memory and
floating-point arithmetic units, with well over a million transistors on a single piece of silicon.
With the rapid developments in CMOS process technology, transistors continue to get smaller and
ICs hold more and more transistors. Some people (especially in Japan) use the term ultra-large-scale
integration (ULSI), but most people stop at the term VLSI.
The earliest ICs used bipolar technology, and the majority of logic ICs used either
transistor-transistor logic (TTL) or emitter-coupled logic (ECL). Although invented before the
bipolar transistor, the metal-oxide-silicon (MOS) transistor was initially difficult to manufacture
because of problems with the oxide interface. As these problems were gradually solved, metal-gate
n-channel MOS (n-MOS or NMOS) technology developed in the 1970s. At that time MOS
technology required fewer masking steps, was denser, and consumed less power than equivalent
bipolar ICs. This meant that, for a given performance, a MOS IC was cheaper than a bipolar IC,
which led to investment in and growth of the MOS IC market. The introduction of polysilicon as a
gate material was a major improvement in CMOS technology, making it easier to make two
types of transistors, n-channel MOS and p-channel MOS transistors, on the same IC: a
complementary MOS (CMOS) technology. The principal advantage of CMOS over NMOS is
lower power consumption. Another advantage of a polysilicon gate was a simplification of the
fabrication process, allowing devices to be scaled down in size.
With the advent of VLSI in the 1980s engineers began to realize the advantages of
designing an IC that was customized or tailored to a particular system or application rather than
using standard ICs alone. Microelectronic system design then becomes a matter of defining the
functions that you can implement using standard ICs and then implementing the remaining logic
functions (sometimes called glue logic) with one or more custom ICs. As VLSI became
possible you could build a system from a smaller number of components by combining many
standard ICs into a few custom ICs. Building a microelectronic system with fewer ICs allows
you to reduce cost and improve reliability.
In the early 1990s the IC industry recognized the importance of custom integrated circuits. As
different types of custom ICs began to evolve for different types of applications, these new ICs
gave rise to a new term: application-specific IC, or ASIC. Examples of ICs that are not ASICs
include standard parts such as memory chips (ROMs, DRAM, and SRAM); microprocessors; and TTL or
TTL-equivalent ICs at SSI, MSI, and LSI levels. Examples of ICs that are ASICs include: a chip
for a toy bear that talks; a chip for a satellite; a chip designed to handle the interface between
memory and a microprocessor for a workstation CPU; and a chip containing a microprocessor as
a cell together with other logic.
Types of ASICs : ASIC stands for Application-Specific Integrated Circuit (pronounced "a-sick").
Based on the design technology, ASICs are broadly classified into three types: (i) full-custom,
(ii) semicustom, and (iii) programmable ASICs. The further classification of ASICs is
shown below.
Full Custom IC
In a full-custom IC all the mask layers are customized, and some (possibly all) of the logic
cells are customized as well. That is, in a full-custom ASIC some or all of the logic cells,
circuits, or layout are designed specifically for that ASIC. This means the designer does not use
pre-tested and pre-characterized cells for all or part of the design. This approach is therefore
considered only when no suitable existing cell library is available for the entire design. This
might be because the existing cell libraries are not fast enough, or because the logic cells are
not small enough or consume too much power.
A microprocessor is an example of a full-custom IC: designers spend many hours squeezing the
most out of every last square micron of microprocessor chip space by hand. Customizing all of
the IC features in this way allows designers to include analog circuits, optimized memory cells,
or mechanical structures on an IC, for example. Full-custom ICs are the most expensive to
manufacture and to design. The manufacturing lead time (the time it takes just to make an IC, not
including design time) is typically eight weeks for a full-custom IC. These specialized
full-custom ICs are often intended for a specific application, so we might call some of them
full-custom ASICs.
A cell-based ASIC (cell-based IC, or CBIC) uses predesigned logic cells (AND gates, OR gates,
multiplexers, and flip-flops, for example) known as standard cells. We could apply the term
CBIC to any IC that uses cells, but it is generally accepted that a cell-based ASIC or CBIC
means a standard-cell-based ASIC. The standard-cell areas (also called flexible blocks) in a CBIC
are built of rows of standard cells, like a wall built of bricks. The standard-cell areas may be
used in combination with larger predesigned cells, perhaps microcontrollers or even
microprocessors, known as megacells. Megacells are also called megafunctions, full-custom blocks,
system-level macros (SLMs), fixed blocks, cores, or Functional Standard Blocks (FSBs).
The important features of this type of ASIC are as follows:
All mask layers are customized: transistors and interconnect.
Custom blocks can be embedded.
Manufacturing lead time is about eight weeks.
Fig. Cell based ASIC
The diagram shows a cell-based ASIC (CBIC) die with a single standard-cell area together with
four fixed blocks. The flexible block contains rows of standard cells. The small squares around
the edge of the die are bonding pads that are connected to the pins of the ASIC package. Each
standard cell in the library is constructed using full-custom design methods. This design style
provides the same performance and flexibility advantages of a full-custom ASIC but reduces
design time and reduces risk.
Gate-Array-Based ASICs :
In a gate-array-based ASIC the transistors are predefined on the silicon wafer. The predefined
pattern of transistors on a gate array is the base array, and the smallest element that is
replicated to make the base array is the base cell (sometimes called a primitive cell). Only the
top few layers of metal, which define the interconnect between transistors, are defined by the
designer using custom masks. To distinguish this type of gate array from other types of gate
array, it is often called a masked gate array (MGA). The designer chooses from a gate-array
library of predesigned and pre-characterized logic cells. The logic cells in a gate-array library
are often called macros. The reason for this is that the base-cell layout is the same for each
logic cell, and only the interconnect (inside cells and between cells) is customized, so that
there is a similarity between gate-array macros and a software macro.
There are three different types of MGA or gate-array-based ASICs. They are:
Channeled gate arrays.
Channelless gate arrays.
Structured gate arrays.
The names themselves explain the construction: in a channeled gate-array architecture, for
example, the gate array is channeled. There are two common ways of arranging (or arraying) the
transistors on an MGA: in a channeled gate array we leave space between the rows of transistors
for wiring; the routing on a channelless gate array uses rows of unused transistors. The channeled
gate array was the first to be developed, but the channelless gate-array architecture is now more
widely used. A structured (or embedded) gate array can be either channeled or channelless, but it
includes (or embeds) a custom block.
Channeled Gate Array : The important features of this type of MGA are:
(i) Only the interconnect is customized.
(ii) The interconnect uses predefined spaces between rows of base cells.
(iii) Manufacturing lead time is between two days and two weeks.
A channeled gate array is similar to a CBIC: both use rows of cells separated by channels used
for interconnect. One difference is that the space for interconnect between rows of cells is fixed
in height in a channeled gate array, whereas the space between rows of cells may be adjusted in a
CBIC.
Channelless Gate Array :
The channelless gate array is also known as a channel-free gate array, sea-of-gates array, or
SOG array. The important features of this type of MGA are as follows:
Only some (the top few) mask layers are customized: the interconnect.
Manufacturing lead time is between two days and two weeks.
The diagram below shows a channelless gate-array or sea-of-gates (SOG) array die. The core
area of the die is completely filled with an array of base cells (the base array).
The key difference between a channelless gate array and a channeled gate array is that there are
no predefined areas set aside for routing between cells on a channelless gate array. Instead we
route over the top of the gate-array devices. We can do this because we customize the contact
layer that defines the connections between metal1, the first layer of metal, and the transistors.
When we use an area of transistors for routing in a channelless array, we do not make any
contacts to the devices lying below; we simply leave the transistors unused.
Structured Gate Array :
An embedded gate array or structured gate array (also known as master slice or master image)
combines some of the features of CBICs and MGAs. One of the disadvantages of the MGA is the
fixed gate-array base cell. This makes the implementation of memory, for example, difficult and
inefficient. In an embedded gate array we set aside some of the IC area and dedicate it to a
specific function. This embedded area can either contain a different base cell that is more
suitable for building memory cells, or it can contain a complete circuit block, such as a
microcontroller.
The Fig. below shows an embedded gate array. The important features of this type of MGA are
the following:
Only the interconnect is customized.
Custom blocks (the same for each design) can be embedded.
Manufacturing lead time is between two days and two weeks.
Programmable Logic Devices (PLDs):
Programmable logic devices (PLDs) are standard ICs that are available in standard configurations.
However, PLDs may also be configured or programmed to create a part customized to a
specific application, and so they are also considered a family of ASICs. PLDs use different
technologies to allow programming of the device. The important features of PLDs are
outlined below.
No customized mask layers or logic cells
Fast design turnaround
A single large block of programmable interconnect
A matrix of logic macro cells that usually consist of programmable array logic followed
by a flip-flop or latch.
The simplest type of programmable IC is a read-only memory (ROM). The most common types
of ROM use a metal fuse that can be blown permanently (programmable ROM or PROM). An
electrically programmable ROM, or EPROM, uses programmable MOS transistors whose
characteristics are altered by applying a high voltage. One can erase an EPROM either by using
another high voltage (an electrically erasable PROM, or EEPROM) or by exposing the device
to ultraviolet light (UV-erasable PROM, or UVPROM). There is another type of ROM that can
be placed on any ASIC: a mask-programmable ROM (mask-programmed ROM or masked
ROM). A masked ROM is a regular array of transistors permanently programmed using custom
mask patterns. An embedded masked ROM is thus a large, specialized logic cell.
Field-Programmable Gate Arrays :
FPGAs are the newest member of the ASIC family and are rapidly growing in importance,
replacing TTL in microelectronic systems. Even though an FPGA is a type of gate array, we do
not consider the term gate-array-based ASICs to include FPGAs.
The important characteristics of an FPGA are
None of the mask layers are customized.
A method for programming the basic logic cells and the interconnect.
The core is a regular array of programmable basic logic cells that can implement
combinational as well as sequential logic (flip-flops).
A matrix of programmable interconnect surrounds the basic logic cells.
Programmable I/O cells surround the core.
Design turnaround is a few hours.
As shown in the figure above, all FPGAs contain a regular structure of programmable basic
logic cells surrounded by programmable interconnect.
The ASIC Design process :
The sequence of steps used to design an ASIC is known as the design flow and is shown in the
design flow diagram. The flow is divided into two stages: the first is logical design and the
second is physical design. The steps involved in the ASIC design process are explained below.
1. Design entry. Enter the design into an ASIC design system, either using a hardware description language (HDL) or schematic entry.
2. Logic synthesis. Use an HDL (VHDL or Verilog) and a logic synthesis tool to produce a netlist, a description of the logic cells and their connections.
3. System partitioning. Divide a large system into ASIC-sized pieces.
4. Prelayout simulation. Check to see if the design functions correctly.
5. Floorplanning. Arrange the blocks of the netlist on the chip.
6. Placement. Decide the locations of cells in a block.
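As a rough illustration of steps 2 and 4, a netlist can be pictured as a list of logic cells together with the nets that connect them, and prelayout simulation as evaluating that list for chosen input values. The sketch below is a toy Python model, not the file format of any real synthesis tool; the cell names, net names, and the `simulate` helper are all hypothetical.

```python
# Toy gate-level netlist for a half adder: each entry is
# (cell type, output net, input nets). All names are made up for illustration.
NETLIST = [
    ("XOR2", "sum", ("a", "b")),
    ("AND2", "carry", ("a", "b")),
]

# Behaviour of each (hypothetical) library cell.
CELLS = {
    "AND2": lambda x, y: x & y,
    "XOR2": lambda x, y: x ^ y,
}

def simulate(netlist, inputs):
    """Evaluate the cells in list order (assumes the list is topologically sorted)."""
    nets = dict(inputs)
    for cell, out, ins in netlist:
        nets[out] = CELLS[cell](*(nets[n] for n in ins))
    return nets

# Functional check: the two outputs together must encode a + b.
for a in (0, 1):
    for b in (0, 1):
        result = simulate(NETLIST, {"a": a, "b": b})
        assert result["sum"] + 2 * result["carry"] == a + b
```

A real flow would run this kind of check with an HDL simulator; the point here is only that the netlist is data describing cells and connections, which later steps (floorplanning, placement) then arrange physically.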
Complex Programmable Logic Device (CPLD) :
The CPLD is a step up in complexity from the simple PLD (SPLD). It builds on the SPLD
architecture to accommodate much larger designs, so a CPLD can be used to integrate the
functions of a number of SPLDs into a single device. The CPLD architecture is based on a small
number of logic blocks and a global programmable interconnect.
Generic CPLD architecture :
The CPLD consists of a number of logic blocks (also called functional blocks), each of which
contains a macro cell and either a PLA or PAL circuit arrangement. In the diagram below, eight
logic blocks are shown. The macro cell provides additional circuitry to accommodate registered
or nonregistered outputs, along with signal polarity control. Polarity control provides an output
that is either the true signal or its complement. The actual number of logic blocks within
a CPLD varies; the more logic blocks available, the larger the design that can be configured. In
the center of the design is a global programmable interconnect. This interconnect allows
connections to the logic block macro cells and the I/O cell arrays.
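The role of the macro cell can be sketched in a few lines of Python. This is a simplified behavioural model under stated assumptions, not the circuit of any particular CPLD: the function name `macrocell`, its parameters, and the choice to model polarity control as an XOR are all illustrative.

```python
def macrocell(product_terms, inputs, invert=False, registered=False, state=0):
    """Return (output, next_state) for one clock edge of a simplified macro cell.

    product_terms: list of AND terms; each term maps an input name to the
                   value it must have for the term to be true.
    invert:        polarity control; True yields the complement of the true signal.
    registered:    True routes the result through a flip-flop, so the output
                   is the stored bit and the new value appears one clock later.
    """
    # Sum of products, as produced by a PAL/PLA: OR together the AND terms.
    sop = int(any(all(inputs[name] == val for name, val in term.items())
                  for term in product_terms))
    value = sop ^ int(invert)   # polarity control modelled as an XOR
    if registered:
        return state, value     # output the stored bit, capture the new value
    return value, state         # nonregistered: output follows the logic directly

# Example function f = (a AND b) OR (NOT c)
terms = [{"a": 1, "b": 1}, {"c": 0}]
out, _ = macrocell(terms, {"a": 1, "b": 1, "c": 1})
assert out == 1
```

Calling the same function with `invert=True` gives the complemented output, and with `registered=True` the result only appears as `next_state`, which is the difference between the registered and nonregistered output options described above.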
Fig. Generic CPLD architecture
The programmable interconnect is usually based on either array-based interconnect or
multiplexer-based interconnect:
• Array-based interconnect allows any signal within the programmable interconnect to connect to
any logic block within the CPLD. This is achieved by allowing horizontal and vertical routing
within the programmable interconnect and allowing the crossover points to be connected or
unconnected (the same idea as with the PLA and PAL), depending on the CPLD configuration.
• Multiplexer-based interconnect uses digital multiplexers connected to each of the macro cell
inputs within the logic blocks. Specific signals within the programmable interconnect are
connected to specific inputs of the multiplexers. It would not be practical to connect all internal
signals within the programmable interconnect to the inputs of all multiplexers due to size
and speed of operation considerations.
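The trade-off between the two interconnect styles can be made concrete with a small Python sketch. It is only an illustration: the signal and input names, the mux width, and the rule for which signals feed which multiplexer are assumptions chosen for the example, not taken from any real device.

```python
# Array-based: a full crosspoint matrix, so any signal can reach any block input.
def array_based(signals, block_inputs):
    return {(s, i): False for s in signals for i in block_inputs}  # False = unconnected

# Multiplexer-based: each block input has a mux fed by a fixed subset of signals.
def mux_based(signals, block_inputs, mux_width):
    muxes = {}
    for k, i in enumerate(block_inputs):
        # Hypothetical wiring rule: input k sees mux_width consecutive signals.
        muxes[i] = [signals[(k + j) % len(signals)] for j in range(mux_width)]
    return muxes

signals = [f"s{n}" for n in range(8)]
inputs = [f"in{n}" for n in range(4)]

xbar = array_based(signals, inputs)
xbar[("s5", "in2")] = True                 # program one crosspoint

muxes = mux_based(signals, inputs, mux_width=4)

crosspoints = len(xbar)                                # 8 x 4 = 32 programmable points
mux_inputs = sum(len(m) for m in muxes.values())       # 4 x 4 = 16 mux inputs
assert (crosspoints, mux_inputs) == (32, 16)
assert "s5" in muxes["in2"]                # in2 sees s2..s5
assert "s7" not in muxes["in0"]            # in0 sees only s0..s3
```

The array-based structure needs a programmable point for every signal and input pair, while the multiplexer-based structure is smaller but cannot make every connection, which is the size and speed compromise noted above.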
FPGA Architecture:
FPGA stands for Field Programmable Gate Array. FPGAs are popular with microsystems
designers because they fill the gap between TTL and PLD design on one side and expensive ASICs
on the other. FPGAs are ideal for prototyping systems or for low-volume production. An FPGA
normally comprises:
• Programmable logic blocks which implement logic functions.
• Programmable routing that connects these logic functions.
• I/O blocks that are connected to logic blocks through routing interconnect and that make off-
chip connections.
FPGAs have a regular array of basic logic cells that are configured using a programming
technology. The chip inputs and outputs use special I/O logic cells that are different from the
basic logic cells. A programmable interconnect scheme forms the wiring between the two types
of logic cells. Finally, the designer uses custom software, tailored to each programming
technology and FPGA architecture, to design and implement the programmable connections. The
programming technology in an FPGA determines the type of basic logic cell and the interconnect
scheme. The logic cells and interconnection scheme, in turn, determine the design of the input
and output circuits as well as the programming scheme.
Configurable Logic Block (CLB):
A configurable logic block (CLB) is a basic component of an FPGA that provides the basic logic
and storage functionality for a target application design. Exact numbers and features vary from
device to device, but every CLB consists of a configurable switch matrix with 4 or 6 inputs,
some selection circuitry (MUX, etc.), and flip-flops. The switch matrix is highly flexible and can
be configured to handle combinatorial logic, shift registers, or RAM. The CLB acts as the main
logic resource for implementing logic circuits. Generally the CLBs contain RAM-based
LUTs (look-up tables) to implement logic, and storage elements that can be used as flip-flops or
latches. CLBs can be programmed to perform various logical functions as well as to store data.
The diagram below shows the Xilinx XC3000 CLB, which has five logic inputs (A-E), a
common clock input (K), an asynchronous direct-reset input (RD), and an enable clock
(EC). Using programmable MUXes connected to the SRAM programming cells, one can
independently connect each of the two CLB outputs (X and Y) to the output of the flip-flops (QX
and QY) or to the output of the combinational logic (F and G).
Normally the CLB is designed such that the CLB propagation delay is fixed, and is equal to the
LUT access time, and independent of the logic function being implemented.
There are seven inputs for the combinational logic in the XC3000 CLB: the five CLB inputs
(A to E) and the two flip-flop outputs (QX and QY). There are two outputs from the LUT (F and G).
Since a 32-bit LUT requires only five variables to form a unique address (32 = 2^5), there are
several ways to use the LUT:
• One can use five of the seven possible inputs (A to E, QX, QY) with the entire 32-bit LUT. The
CLB outputs (F and G) are then identical.
• The 32-bit LUT can be split in half to implement two functions of four variables each. One can
choose the four input variables from the seven inputs (A to E, QX, QY), but two of the inputs must
come from the five CLB inputs (A to E); then one function output connects to F and the other
output connects to G.
• It is also possible to split the 32-bit LUT in half, using one of the seven input variables as a
select input to a 2:1 MUX that switches between F and G. This allows some functions of six and
seven variables to be implemented.
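The three ways of using the 32-bit LUT can be sketched in Python. This is a simplified model, assuming the LUT is just a list of 32 stored bits; the helper names (`program`, `lut_read`, `split_outputs`, `muxed_output`) are hypothetical, and the model does not enforce the XC3000 rules about which of the seven inputs may drive each half.

```python
def program(func, n):
    """Fill a 2**n-bit table with the truth table of func (a function of n bits)."""
    return [func(*[(addr >> i) & 1 for i in range(n)]) for addr in range(2 ** n)]

def lut_read(bits, address_bits):
    """Read one stored bit; address_bits[0] is the least significant address line."""
    addr = sum(bit << i for i, bit in enumerate(address_bits))
    return bits[addr]

# Option 1: the whole 32-bit LUT implements one function of five variables.
parity5 = program(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e, 5)
assert len(parity5) == 32

# Option 2: two 16-bit halves, each a function of four variables.
f_half = program(lambda a, b, c, d: a & b & c & d, 4)   # drives output F
g_half = program(lambda a, b, c, d: a | b | c | d, 4)   # drives output G
lut32 = f_half + g_half

def split_outputs(lut, f_inputs, g_inputs):
    return lut_read(lut[:16], f_inputs), lut_read(lut[16:], g_inputs)

# Option 3: a 2:1 MUX, steered by one more input, selects between F and G.
def muxed_output(lut, f_inputs, g_inputs, select):
    f, g = split_outputs(lut, f_inputs, g_inputs)
    return g if select else f

assert lut_read(parity5, [1, 0, 1, 1, 0]) == 1           # three ones: odd parity
assert split_outputs(lut32, [1, 1, 1, 1], [0, 0, 0, 0]) == (1, 0)
```

In every case the stored bits are simply read out by the address, which is also why the CLB delay does not depend on the logic function being implemented.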
Interconnect:
While the CLB provides the logic capability, flexible interconnect routes the signals between
CLBs and to and from I/Os. Routing comes in several flavors, from that designed to interconnect
between CLBs to fast horizontal and vertical long lines spanning the device to global low-skew
routing for clocking and other global signals.
Modern CMOS ASICs use two, three, or more levels (or layers) of metal for interconnect. This
allows wires to cross over different layers in the same way that we use copper traces on different
layers on a printed-circuit board. In a two-level metal CMOS technology, connections to the
standard-cell inputs and outputs are usually made using the second level of metal. In a
three-level metal technology, connections may be internal to the logic cell, which allows
more sophisticated routing programs to take advantage of the extra metal layer to route
interconnect over the top of the logic cells. A connection that needs to cross over a row of
standard cells uses a feedthrough.
All FPGAs contain some type of programmable interconnect. The structure and complexity of
the interconnect is largely determined by the programming technology and the architecture of the
basic logic cell. The raw material used in building the interconnect is aluminum-based
metallization, which has a sheet resistance of approximately 50 mΩ/square and a line
capacitance of 0.2 pF/cm. The first programmable ASICs were constructed using two layers of
metal; newer programmable ASICs use three or more layers of metal interconnect.
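To get a feel for these two numbers, the short Python calculation below estimates the resistance, capacitance, and RC product of a single metal line. The sheet resistance and line capacitance come from the text; the wire length (1 cm) and width (1 µm) are assumed purely for illustration, and the RC product is only a rough time-constant figure that ignores the programmable switches along the path.

```python
SHEET_RESISTANCE = 50e-3      # ohms per square (value quoted in the text)
LINE_CAPACITANCE = 0.2e-12    # farads per cm (value quoted in the text)

def wire_rc(length_cm, width_um):
    """Return (R in ohms, C in farads, RC in seconds) for a straight metal line."""
    squares = (length_cm * 1e4) / width_um    # length / width, both in micrometres
    resistance = SHEET_RESISTANCE * squares
    capacitance = LINE_CAPACITANCE * length_cm
    return resistance, capacitance, resistance * capacitance

# Assumed dimensions: a 1 cm long, 1 um wide line (10,000 squares).
r, c, rc = wire_rc(length_cm=1.0, width_um=1.0)
print(f"R = {r:.0f} ohm, C = {c * 1e12:.1f} pF, RC = {rc * 1e12:.0f} ps")
```

With these assumed dimensions the line has 10,000 squares, so R is about 500 ohms, C is 0.2 pF, and the RC product is about 100 ps. In an FPGA the resistance and capacitance of the programmable switches add to these figures.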
In the diagram below the XILINX interconnect architecture is shown.
In this architecture:
• The vertical lines and horizontal lines run between CLBs.
• The general-purpose interconnect joins switch boxes (also known as magic boxes or
switching matrices).
• The long lines run across the entire chip. It is possible to form internal buses using long lines and the three-state buffers that are next to each CLB.
• The direct connections (not used on the XC4000) bypass the switch matrices and directly connect adjacent CLBs.
• The Programmable Interconnection Points (PIPs) are programmable pass transistors that
connect the CLB inputs and outputs to the routing network.
• The bidirectional (BIDI) interconnect buffers restore the logic level and logic strength
on long interconnect paths.
Switching Technology:
Switching is central to routing between the logic blocks (or logic clusters) in FPGA
architectures. The basic Switch Block is shown in the diagram below.
Each Switch Block programmably connects each incoming track from a channel to a number of
outgoing tracks in other channels. The transistors in the Switch Block add capacitive and
resistive loading to each track in a channel, and hence the Switch Block has a significant
effect on the speed of each routable connection and thus a major impact on the speed of the
FPGA as a whole.
In addition, since such a large portion of an FPGA is devoted to routing, the chip area
required by each Switch Block has a significant effect on the achievable logic density of
the device. Thus, the design of a good Switch Block is of the utmost importance.
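A Switch Block can be modelled as a set of programmable switches between track ends. The Python sketch below assumes one particular, simple topology in which track i on one side can only reach track i on the other three sides; that topology, the class name `SwitchBlock`, and its methods are assumptions made for illustration and are not taken from the notes or from any specific device.

```python
from itertools import combinations

SIDES = ("north", "east", "south", "west")

class SwitchBlock:
    def __init__(self, tracks_per_channel):
        self.w = tracks_per_channel
        # Assumed topology: track i on one side can only reach track i on the
        # other three sides, with one programmable switch per pair of sides.
        self.possible = {frozenset([(a, i), (b, i)])
                         for i in range(self.w)
                         for a, b in combinations(SIDES, 2)}
        self.enabled = set()

    def connect(self, end_a, end_b):
        """Turn on the switch between two track ends, if such a switch exists."""
        switch = frozenset([end_a, end_b])
        if switch not in self.possible:
            raise ValueError("no programmable switch between these track ends")
        self.enabled.add(switch)

    def fanout(self, end):
        """Track ends directly connected to `end` through enabled switches."""
        return sorted(other for sw in self.enabled if end in sw
                      for other in sw if other != end)

sb = SwitchBlock(tracks_per_channel=4)
sb.connect(("west", 2), ("east", 2))
sb.connect(("west", 2), ("north", 2))
assert len(sb.possible) == 24      # 6 side pairs x 4 tracks
assert sb.fanout(("west", 2)) == [("east", 2), ("north", 2)]
```

Even in this small model the cost is visible: every possible connection is a transistor that takes area and loads the track whether or not it is used, so allowing more connections per track improves routability at the expense of speed and density.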
Xilinx Field Programmable Gate Arrays :
The basic structure of Xilinx FPGAs is array-based, meaning that each chip comprises a
two-dimensional array of logic blocks that can be interconnected via horizontal and vertical
routing channels. An illustration of this type of architecture is shown in the figure below.
Xilinx introduced the first FPGA family, called the XC2000 series, around 1985 and now offers
three more generations: XC3000, XC4000, and XC5000. Although the XC3000 devices are still
widely used, the more recent and more popular family is the XC4000. The XC5000 is similar to the
XC4000, but has been designed to offer similar features at a lower cost, with slightly lower
speed. Xilinx has recently introduced an FPGA family based on anti-fuses, called the XC8100.
The XC8100 has many interesting features, but it is not yet in widespread use. The Xilinx 4000
family devices range in capacity from about 2000 to more than 15,000 equivalent gates.
The XC4000 features a logic block (called a Configurable Logic Block (CLB) by Xilinx) that is
based on look-up tables (LUTs). A LUT is a small one-bit-wide memory array, where the address
lines for the memory are inputs of the logic block and the one-bit output from the memory is the
LUT output. A LUT with K inputs would then correspond to a 2^K x 1 bit memory, and can
realize any logic function of its K inputs by programming the logic function’s truth table directly
into the memory. The XC4000 CLB contains three separate LUTs, in the configuration shown in
Figure 18. There are two 4-input LUTs that are fed by CLB inputs, and the third LUT can be
used in combination with the other two. This arrangement allows the CLB to implement a wide
range of logic functions of up to nine inputs, two separate functions of four inputs, or other
possibilities. Each CLB also contains two flip-flops.
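The idea that a K-input LUT is a 2^K x 1 bit memory, and that two 4-input LUTs plus a third LUT can reach nine inputs, can be sketched in Python. The class `LUT` and the names `F`, `G`, `H`, and `and9` are illustrative; the third LUT is assumed here to take the two LUT outputs and one extra CLB input, which is one way to arrive at 4 + 4 + 1 = 9 inputs.

```python
class LUT:
    """A K-input look-up table modelled as a 2**K x 1 bit memory."""

    def __init__(self, k, func):
        self.k = k
        # Program the truth table: one stored bit per input combination.
        self.memory = [int(func(*[(addr >> i) & 1 for i in range(k)]))
                       for addr in range(2 ** k)]

    def __call__(self, *inputs):
        assert len(inputs) == self.k
        # The logic inputs act as the address lines of the memory.
        addr = sum(bit << i for i, bit in enumerate(inputs))
        return self.memory[addr]

# Two 4-input LUTs (F and G) feed a third LUT (H) together with one extra input.
# Example function: a 9-input AND.
F = LUT(4, lambda a, b, c, d: a & b & c & d)
G = LUT(4, lambda a, b, c, d: a & b & c & d)
H = LUT(3, lambda f, g, h1: f & g & h1)

def and9(bits):
    return H(F(*bits[0:4]), G(*bits[4:8]), bits[8])

assert len(F.memory) == 16          # 2**4 stored bits for a 4-input LUT
assert and9([1] * 9) == 1
assert and9([1] * 8 + [0]) == 0
```

Only some nine-input functions decompose this way, which is why the text speaks of a wide range of functions of up to nine inputs rather than all of them.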
The XC4000 chips have “system oriented” features. For example each CLB contains circuitry
that allows it to efficiently perform arithmetic (i.e., a circuit that can implement a fast carry
operation for adder-like circuits) and also the LUTs in a CLB can be configured as read/write
RAM cells. Also, each XC4000 chip includes very wide AND-planes around the periphery of the
logic block array to facilitate implementing circuit blocks such as wide decoders.
The other key feature that characterizes an FPGA is its interconnect structure. The XC4000
interconnect is arranged in horizontal and vertical channels. Each channel contains some number
of short wire segments that span a single CLB (the number of segments in each channel depends
on the specific part number), longer segments that span two CLBs, and very long segments that
span the entire length or width of the chip. Programmable switches are available to connect the
inputs and outputs of the CLBs to the wire segments, or to connect one wire segment to another.
Acknowledgment: The author is thankful to Prof. Michael Smith, without whose book these
class notes would not have been prepared.
These class notes are based on the book “Application-Specific Integrated Circuits” by Michael
Smith and on other internet resources. They are meant purely for academic purposes, not for any
commercial use.