Unit # 5
DYNAMIC CMOS AND CLOCKING
CONTENTS
5.1 Advantages of CMOS Over nMOS
5.2 CMOS Technologies
5.2.1 CMOS/SOI Technology
5.2.1.1 The CMOS/SOS Technology
5.2.2 CMOS/bulk Technology
5.2.2.1 p-well CMOS/Bulk process
5.2.2.2 n-well CMOS/Bulk process
5.2.2.3 Twin-tub CMOS/Bulk process
5.2.3 Latch-up in Bulk CMOS
5.2.3.1 Parasitic SCR structure
5.3 Static CMOS Design
5.4 Domino CMOS Structures
5.4.1 Domino CMOS logic examples
5.4.2 Cascaded domino CMOS logic gates
5.5 Charge Sharing
5.5.1 Solutions for charge sharing
5.6 Clocking
5.6.1 Clock generation
5.6.2 Clock distribution
5.6.3 Clocked storage elements
5.1 ADVANTAGES OF CMOS OVER nMOS
The advantages of CMOS over nMOS are as follows:
• The most important advantage of CMOS is its very low static power dissipation
compared with nMOS technology.
• Reduced power requirements lead to reduced cost and improved reliability of the final
circuit.
• Low power allows smaller, lower-cost power supplies and simplified power
distribution.
• Desirable speed-power product.
Today, digital circuits are mainly CMOS circuits. nMOS is used
only when a fast, low-cost, simple circuit is to be fabricated. CMOS is preferred because
of its desirable speed-power product.
The disadvantages of CMOS compared with nMOS are as follows:
• A larger number of process steps is required to fabricate CMOS circuits.
• Larger die size.
• CMOS has lower gate density.
• Cost tends to increase more with die size than with the number of process steps.
The larger area is due to:
• The spacing and contacts needed to prevent or minimize latch-up.
• CMOS logic structures use twice the number of transistors.
• CMOS has more layout rules.
The speed/power performance of the available technologies is shown in figure 5.1.
Figure 5.1: Speed/Power Performance
5.2 CMOS TECHNOLOGIES
The categories of CMOS technologies are:
a. CMOS/SOI structures
b. CMOS/bulk (CMOS) on silicon substrate
The benefits of CMOS/SOI structures are the reduced load due to the absence of well-to-substrate
capacitance and the very small interconnect-to-substrate capacitance. CMOS/SOI design
rules are simpler than CMOS/bulk rules. CMOS/SOI offers the nMOS designer an easy
transition to CMOS technology. CMOS/bulk requires a well (tub or island) for at least one
type of FET to provide electrical isolation.
5.2.1 CMOS Silicon-on-Insulator (SOI) Technology
The SOI CMOS technology uses an insulating substrate to improve process characteristics
such as speed and latch-up susceptibility. The SOI CMOS technology allows the creation of
independent, completely isolated nMOS and pMOS transistors virtually side-by-side on an
insulating substrate.
The main advantages of this technology are:
• The higher integration density (because of the absence of well regions).
• Complete avoidance of the latch-up problem.
• Lower parasitic capacitances compared to the conventional p-well, n-well or twin-tub
CMOS processes.
A cross-section of nMOS and pMOS devices using SOI process is shown in figure 5.2.
Figure 5.2: SOI process
The SOI CMOS process is considerably more costly than the standard p-well and n-well
CMOS processes. Yet the improved device performance and the absence of latch-up
problems can justify its use, especially for deep-sub-micron devices.
5.2.1.1 The CMOS/SOS Technology
Silicon-on-sapphire (SOS) is the highest-performance SOI technology today. In this
approach, silicon is grown on a sapphire substrate, and islands are formed by implant or
diffusion. The n-channel and p-channel transistors are built on the islands as shown in figure
5.3. High performance is achieved due to a significant reduction in parasitic capacitance, and
high gate density is achieved.
Figure 5.3: SOS process
Sapphire (Al2O3) is a good insulator, and the lattice constants of silicon and sapphire
match well. When sapphire is used as the substrate, epitaxial growth of silicon yields
monocrystalline material. Sapphire is not affected by radiation in the way bulk silicon is, which
makes it a preferred material for military applications that require radiation-hardened
devices.
Disadvantages are:
• Manufacturing difficulty.
• High cost of sapphire wafers.
• Not competitive in high-volume, low-cost markets.
5.2.2 CMOS/bulk Technology
The CMOS/Bulk technologies are classified as follows:
a. p-well CMOS/Bulk process
b. n-well CMOS/Bulk process
c. twin-tub CMOS/Bulk process
5.2.2.1 p-well CMOS/Bulk process
The p-well CMOS/bulk uses p-type diffusion into an n-type bulk silicon substrate to form a
p-well for n-channel transistors. The p-channel transistors are directly built into n-substrate
as shown in figure 5.4.
Figure 5.4: p-well process
5.2.2.2 n-well CMOS/Bulk process
The n-well CMOS/bulk uses n-type diffusion into a p-type bulk silicon substrate to form an
n-well for p-channel transistors. The n-channel devices are built directly into the bulk p-substrate,
as shown in figure 5.5; hence the nMOS devices perform better than the pMOS devices. This
process provides faster circuits than the p-well CMOS process.
Figure 5.5: n-well process
Both p-well and n-well processes need well contacts and require a minimum spacing (dead space) between the
edges of their wells.
5.2.2.3 Twin-tub CMOS/Bulk process
The starting material is an n+ or p+ substrate, with a lightly doped epitaxial layer on top. This
epitaxial layer provides the actual substrate on which the n-well and the p-well are formed.
Two independent doping steps are performed for the creation of the well regions; the
dopant concentrations can be carefully optimized to produce the desired device
characteristics. In the p-well and n-well CMOS processes, the doping density of the well region is
higher than that of the substrate, which, among other effects, results in unbalanced drain parasitics.
The twin-tub process avoids this problem, but it is costlier and more complex.
The twin-tub process combines n-well and p-well technologies as shown in figure 5.6.
Figure 5.6: twin-well process
The twin-tub process has the highest overall performance compared to the n-well and p-well
processes; it gives the designer full freedom to optimize the performance of both the n-channel
and p-channel devices. This technology provides the basis for separate optimization of
the nMOS and pMOS transistors, making it possible to tune the threshold voltage, body effect
and channel transconductance of both types of transistors independently.
5.2.3 Latch-up in Bulk CMOS
CMOS devices contain parasitic bipolar transistors, which can cause latch-up. Latch-up is a
condition in which a high current flows between VDD and GND. In latch-up, the collector of each
parasitic BJT feeds the base of the other parasitic BJT in a positive-feedback configuration,
forming a parasitic silicon-controlled rectifier (SCR). When the chip is powered
up, the SCR can turn on, creating a low-resistance path from power to ground. Latch-up can
cause malfunction and can even destroy the device. Latch-up is terminated when power to the
SCR is interrupted.
Latch-up can occur in both p-well and n-well CMOS processes. Its causes are
internal transient currents or voltages during power-up, external glitches on I/O pads,
and external radiation.
The triggering mechanisms for latch-up are current injected into the npn emitter, current
injected into the pnp emitter, and drastic current or voltage changes on any node.
5.2.3.1 Parasitic SCR structure
Parasitic bipolar transistors (npn and pnp) exist in a CMOS structure, as shown in figure 5.7.
The well and the substrate have resistances Rw and Rs respectively.
Figure 5.7: Parasitic SCR structure
Latch-up Prevention
• Two basic concepts (for reducing loop gain):
– Reduce Rwell and Rsubstrate.
– Weaken the parasitic npn and pnp transistors (i.e., reduce Ic1 and Ic2) by
decreasing their current gains.
• Two basic ways:
– Latch-up resistant CMOS process
– Layout techniques
• Internal latch-up prevention techniques:
– Every well must have a substrate contact of the appropriate type.
– Every substrate contact should be connected by metal directly to a supply pad
(i.e., no diffusion or polysilicon underpasses in the supply rails).
– Use guard rings around the p- and/or n-wells, and make frequent contacts to the
rings.
– Place substrate contacts as close as possible to the source connection of transistors
connected to the supply rails (i.e., Vss in n-devices, Vdd in p-devices).
This reduces the value of Rsubstrate and Rwell.
– A very conservative rule is to place one substrate contact for every supply (Vss
or Vdd) connection.
– A less conservative rule is to place a substrate contact for every 5-10
transistors or every 25-100 µm.
5.3 STATIC CMOS DESIGN
Static CMOS Design is discussed in unit 3.
5.4 DOMINO CMOS STRUCTURES
Domino CMOS is a special form of precharge-and-evaluate CMOS with an inverting buffer
at the output. The problem of faulty discharge of precharged nodes in cascaded CMOS dynamic logic
circuits can be solved by placing an inverter in series with the output of each gate. All inputs
to the nMOS logic blocks are therefore at zero volts during precharge and remain at zero until,
during evaluation, the driving stage discharges its precharged node. As a consequence, domino circuits
provide only non-inverted outputs. The generalized circuit diagram of a domino CMOS gate
is shown in figure 5.8.
Figure 5.8: Domino CMOS gate
During precharge phase (when Φ = 0) the output node of the dynamic CMOS stage is
precharged to a high level, and the output of the CMOS inverter becomes low. During
evaluation phase (when Φ = 1) there are two possibilities:
– The output node is either discharged to a low level through the nMOS circuitry, or
– It remains high
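The precharge and evaluate behaviour described above can be sketched as a small behavioural model. This is only an illustration, not a circuit-level simulation: the class name DominoGate and the pull_down callback are hypothetical names introduced here, and the nMOS network is reduced to a Boolean function that says whether it conducts.

```python
from typing import Callable, Dict


class DominoGate:
    """Behavioural sketch of one domino stage: dynamic nMOS stage + static inverter.

    `pull_down` is a hypothetical helper: it returns True when the nMOS
    logic block provides a conducting path to ground for the given inputs.
    """

    def __init__(self, pull_down: Callable[[Dict[str, int]], bool]):
        self.pull_down = pull_down
        self.node = 1  # dynamic (precharged) node

    def step(self, phi: int, inputs: Dict[str, int]) -> int:
        if phi == 0:
            # Precharge phase: the pMOS device charges the dynamic node high.
            self.node = 1
        elif self.pull_down(inputs):
            # Evaluation phase: the node discharges and cannot recover
            # until the next precharge phase.
            self.node = 0
        # The static inverter produces the (non-inverted) gate output.
        return 1 - self.node


# Example: a two-input domino AND gate (series nMOS transistors a and b).
and_gate = DominoGate(lambda x: bool(x["a"] and x["b"]))
out_precharge = and_gate.step(0, {"a": 1, "b": 1})  # output held low
out_evaluate = and_gate.step(1, {"a": 1, "b": 1})   # node discharged
print(out_precharge, out_evaluate)
```

Because the dynamic node can only fall during evaluation, the inverter output can only make a 0-to-1 transition in that phase, which is the property that cascaded domino stages rely on.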
5.4.1 Domino CMOS logic examples
Domino CMOS logic examples are given in figure 5.9. In each, a dynamic CMOS logic gate stage is
cascaded with a static CMOS inverter stage.
Figure 5.9: Domino CMOS logic example
5.4.2 Cascaded domino CMOS logic gates
Cascaded domino CMOS logic stages are shown in figure 5.10.
Figure 5.10: Cascaded domino CMOS logic gates
Cascading domino CMOS logic gates with static CMOS logic gates is shown in figure 5.11.
Figure 5.11: Cascading domino CMOS logic gates with static CMOS logic gates
Dynamic domino circuits are fast, draw no quiescent power, and produce no glitches on the output,
but they require a reasonable minimum clock rate.
Their limitations are that the number of inverting static logic stages in cascade must be even, so
that the inputs of the next domino CMOS stage experience only 0-to-1 transitions during
evaluation; that only non-inverting structures can be implemented; and that they have potential
charge sharing problems.
5.5 CHARGE SHARING
Charge sharing problems occur when two capacitive nodes charged to different voltages are
connected through a pass transistor. When the pass transistor is turned on, it connects the two
nodes, resulting in a redistribution of the charge on both nodes. Charge sharing is a serious
problem in precharge circuits and must be carefully guarded against. One solution is to make
any charge holding capacitor much larger than any capacitors it shares charge with. Charge
sharing between the dynamic stage output node and the intermediate nodes of the nMOS
logic block during evaluation phase may cause erroneous outputs.
Charge sharing between the output capacitance C1 and an intermediate node
capacitance C2 during the evaluation cycle may reduce the output voltage level as shown in
figure 5.12.
During the precharge phase, the output node capacitance C1 is charged up to its logic-
high level of VDD through the pMOS transistor. In the next phase, the clock signal goes high and
evaluation begins. If the input signal of the uppermost nMOS transistor switches from
low to high during this evaluation phase, as shown in figure 5.12, the charge initially stored in
the output capacitance C1 is shared with C2; this is the charge sharing
phenomenon. If C2 is initially discharged and C1 = C2, the output node voltage falls to VDD/2.
Thus it is important to have C2 much smaller than C1.
Figure 5.12: Charge sharing
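The VDD/2 result follows from charge conservation between the two capacitors. The short sketch below works this out numerically; the function name shared_voltage, the 5 V supply and the capacitance values are assumptions chosen only for illustration, and C2 is taken to be fully discharged before the nodes are connected.

```python
def shared_voltage(c1: float, c2: float, v1: float, v2: float = 0.0) -> float:
    """Final common voltage after C1 (at v1) and C2 (at v2) are connected.

    Total charge is conserved: (c1 + c2) * v_final = c1 * v1 + c2 * v2.
    """
    return (c1 * v1 + c2 * v2) / (c1 + c2)


VDD = 5.0          # assumed supply voltage, volts
C2 = 10e-15        # assumed intermediate node capacitance, farads

# Output level after charge sharing for several C1/C2 ratios.
for ratio in (1, 4, 10):
    c1 = ratio * C2
    v_out = shared_voltage(c1, C2, VDD)
    print(f"C1/C2 = {ratio:2d}: output node = {v_out:.2f} V")
```

With C1 = C2 the output drops to half of VDD, while a ratio of 10 keeps it above 90% of VDD, which illustrates why C2 should be much smaller than C1.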
5.5.1 Solutions for charge sharing
A weak pMOS device (with a small W/L ratio) added to the output of the dynamic CMOS stage
compensates for charge lost through charge sharing and through leakage at low clock
frequencies, as shown in figure 5.13 (a). Because this weak pMOS device is always on, the static power
dissipation increases. Alternatively, a weak pMOS pull-up device can be placed in a
feedback loop to prevent the loss of output voltage level due to charge sharing, as
shown in figure 5.13 (b). Here the weak pMOS device conducts only when the output of the static gate is low,
i.e., when the precharged node voltage is to be kept high.
Figure 5.13: A weak p device compensates for charge sharing.
Another possible solution for charge sharing is to use separate pMOS transistors to
precharge-high all intermediate nodes in the nMOS transistors, as shown in figure 5.14.
Figure 5.14: Precharge-high all intermediate nodes of nMOS transistors
Another solution is graded sizing of the nMOS transistors in series structures,
where the nMOS transistor closest to the output node has the smallest (W/L) ratio and the nMOS
transistor closest to ground has the largest (W/L) ratio.
5.6 CLOCKING
Synchronous systems use a clock to keep operations in sequence: the clock distinguishes the current
operation from the previous or next one and determines the speed at which the machine operates. The clock must be distributed
to all the sequencing elements, such as flip-flops and latches, and also to other
elements such as domino circuits and memories.
There are three requirements of the clocking system:
• Signals must occur at the correct time
• Clock must be able to drive the fan-out
• Rise & fall times of the clock pulses must be as short as possible
Long transition times not only slow the circuit but also increase power consumption.
Clocks must be laid out such that the delays from the source of each clock to the clocked bistable
elements are identical. Clock signals switch between VDD and GND. Two-phase, non-overlapping
clocking avoids timing errors due to races or hazards.
Clock Skew
– Absolute clock skew: difference in arrival of the edge of a clock phase at a destination
in the circuit, with respect to the clock edge at the source of the clock signal.
– Relative clock skew: difference in local clock lag.
Clock skew for rising and falling clock edges need not be the same. Careful layout
design is required to avoid skew problems. Set-up time, hold time and minimum pulse
width are very important for the clocks. Clock delays can be treated like any bus delay
problem; the fastest clocking is obtained by using suitable super-buffers (clock drivers) to
drive the clock bus, or by scaling successive clock-driver stages by a factor of about 2.7. The bus must be kept
as short as possible, and in metal as much as possible.
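The factor of 2.7 corresponds to the well-known result that a chain of progressively larger drivers is fastest when each stage is about e (2.718) times larger than the previous one. The sketch below checks this numerically under a simplified model: each stage delay is taken as stage_ratio times a unit delay tau, self-loading is ignored, and the number of stages is allowed to be fractional. The function name chain_delay and the load ratio of 1000 are assumptions for illustration.

```python
import math


def chain_delay(load_ratio: float, stage_ratio: float, tau: float = 1.0) -> float:
    """Delay of a tapered driver chain, in units of tau (simplified model).

    load_ratio  : C_load / C_in of the first driver
    stage_ratio : size ratio between successive drivers
    Each stage is assumed to contribute stage_ratio * tau of delay, and
    stage_ratio ** n_stages = load_ratio fixes the number of stages.
    """
    n_stages = math.log(load_ratio) / math.log(stage_ratio)
    return n_stages * stage_ratio * tau


LOAD_RATIO = 1000.0  # assumed: the clock bus is 1000x the input capacitance

for f in (2.0, math.e, 4.0, 8.0):
    print(f"stage ratio {f:5.3f}: delay = {chain_delay(LOAD_RATIO, f):6.2f} tau")
```

In this model the minimum lies at a stage ratio of e; in practice ratios somewhat larger than e are often chosen because the delay curve is shallow near the minimum and fewer stages cost less area and power.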
5.6.1 Clock Generation
All clock signals can be derived from a system clock signal, which is a square-wave.
Multiphase clocks can be generated from a single square-wave input with two toggle flip-
flops and two AND gates as shown in figure 5.15.
Figure 5.15: Generation of two-phase clocking from a primary clock
Another way of generating the clocks is shown in figure 5.16.
Figure 5.16: Non-overlapping clocks
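As an illustration of how two non-overlapping phases can be derived from a single square wave, the sketch below models a simplified variant that uses one toggle flip-flop and two AND gates. It is not a reproduction of the circuits in figures 5.15 or 5.16; the function name two_phase and the sampled-waveform representation are assumptions made for this example.

```python
from typing import List, Tuple


def two_phase(clk_samples: List[int]) -> Tuple[List[int], List[int]]:
    """Derive two non-overlapping phases from a sampled square-wave clock.

    A toggle flip-flop (state q) changes on each falling clock edge, so q is
    stable while the clock is high. Two AND gates then steer alternate clock
    pulses to phi1 and phi2.
    """
    q = 0
    prev = 0
    phi1: List[int] = []
    phi2: List[int] = []
    for clk in clk_samples:
        if prev == 1 and clk == 0:
            q ^= 1                    # toggle on the falling edge
        phi1.append(clk & (1 - q))    # AND gate 1: clk AND (NOT q)
        phi2.append(clk & q)          # AND gate 2: clk AND q
        prev = clk
    return phi1, phi2


clk = [0, 1, 0, 1, 0, 1, 0, 1]
phi1, phi2 = two_phase(clk)
print("clk :", clk)
print("phi1:", phi1)
print("phi2:", phi2)
```

Each phase runs at half the input frequency, and because the two AND gates are enabled by complementary flip-flop outputs, the phases are never high at the same time in this idealized model. Real circuits must also account for gate delays when guaranteeing the non-overlap interval.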
5.6.2 Clock distribution
On a small chip, the clock distribution network is just a wire and possibly an inverter for
clkb. On practical chips, the RC delay due to the wire resistance and the gate load can be very long.
Variations in this delay cause the clock to reach different elements at different times; this is called
clock skew. Clock skew can be minimized by placing all gates of a tree on the same chip.
Most chips use repeaters to buffer the clock and equalize the delay, thereby reducing skew, as shown in
figure 5.17.
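The benefit of repeaters can be seen from a first-order Elmore delay estimate, in which the delay of an unbuffered wire grows with the square of its length. The sketch below compares one long wire with the same wire split into repeated segments. All numbers (wire resistance and capacitance per micron, repeater input capacitance and repeater delay) are assumed values for illustration, and the driver output resistance is ignored.

```python
# Assumed, illustrative technology numbers (not taken from the text).
R_PER_UM = 0.1       # wire resistance, ohms per micron
C_PER_UM = 0.2e-15   # wire capacitance, farads per micron
C_IN = 10e-15        # input capacitance of a repeater or clocked gate, farads
T_BUF = 20e-12       # intrinsic delay of one repeater, seconds


def wire_delay(length_um: float, c_load: float = C_IN) -> float:
    """Elmore delay of a distributed RC wire driving a lumped load."""
    r_wire = R_PER_UM * length_um
    c_wire = C_PER_UM * length_um
    return 0.5 * r_wire * c_wire + r_wire * c_load


def repeated_delay(length_um: float, segments: int) -> float:
    """Delay of the same wire split into equal segments, one repeater each."""
    return segments * (wire_delay(length_um / segments) + T_BUF)


LENGTH_UM = 10_000.0  # assumed 10 mm clock wire
print(f"unbuffered : {wire_delay(LENGTH_UM) * 1e12:7.1f} ps")
for k in (2, 4, 8):
    print(f"{k} segments : {repeated_delay(LENGTH_UM, k) * 1e12:7.1f} ps")
```

Splitting the wire makes the total delay grow roughly linearly with length instead of quadratically, at the cost of the added repeater delays.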
The physical layout of the clock network must conform to design rules that ensure the
integrity of the clock signal by minimizing electrical coupling, switching currents, and
impedance mismatches. Equalizing path delays also helps to reduce the skew.
Figure 5.17: H-tree.
The clock signals can cross under power lines using diffusion as shown in figure 5.18.
Figure 5.18: Clock-line crossing under a power line using diffusion
Reducing clock skew requires a dedicated clock distribution network, which in turn requires
plenty of metal wiring resources. Local clock gaters receive the global clock and produce
the physical clocks required by the clocked elements.
Clock gaters are often used to stop or gate the clock to unused blocks of logic to save power.
Different clock gaters are:
– Enabled or Gated clock
– Stretched clocks
– Nonoverlapping clocks
– Complementary clocks
– Delayed, Pulsed clocks
– Clock Doubler
– Clock Buffer
Some of the clock gaters are as shown in figure 5.19, with output waveforms.
Figure 5.19: Examples of Clock gaters with output waveforms.
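As a sketch of the enabled (gated) clock case, the model below uses a common latch-based arrangement: the enable signal passes through a latch that is transparent while the clock is low, and the latched value is ANDed with the clock. This is an assumed implementation chosen for illustration, not necessarily the circuit of figure 5.19, and the function name gate_clock is hypothetical.

```python
from typing import List


def gate_clock(clk: List[int], enable: List[int]) -> List[int]:
    """Behavioural sketch of an enabled clock gater on sampled waveforms.

    The enable is captured by a latch that is transparent only while the
    clock is low, so changes of `enable` during the high phase cannot
    create glitches or truncated pulses on the gated clock.
    """
    latched = 0
    gated: List[int] = []
    for c, en in zip(clk, enable):
        if c == 0:
            latched = en          # latch follows enable while the clock is low
        gated.append(c & latched)  # AND gate produces the gated clock
    return gated


clk    = [0, 1, 0, 1, 0, 1, 0, 1]
enable = [1, 1, 0, 1, 0, 0, 1, 0]
gated = gate_clock(clk, enable)
print("clk   :", clk)
print("enable:", enable)
print("gated :", gated)
```

In the example, the enable rises during the second clock pulse and falls during the last one, but the gated clock follows only the value that was captured while the clock was low, so each output pulse is either complete or absent.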
5.6.3 Clocked storage elements
A two-phase clocking scheme with combinational logic inserted between every pair of
registers yields a simple pipelined structure. A feedback path added around a cascade of two
combinational-logic blocks is shown in figure 5.20.
Figure 5.20: Feedback path around a cascade of two combinational-logic blocks