prs caller

Upload: rppvch

Post on 09-Oct-2015


Abstract

In this paper, a wideband 2/3 prescaler is verified within the design of the proposed wideband multimodulus 32/33/47/48 prescaler. A dynamic-logic multiband flexible integer-N divider is designed that uses the wideband 2/3 prescaler and the multimodulus 32/33/47/48 prescaler. Since the multimodulus 32/33/47/48 prescaler has a maximum operating frequency of 6.2 GHz, the values of the P and S counters can be programmed to divide over the whole range of frequencies; in practice, the P and S counters are programmed according to the band of interest. The proposed multiband flexible divider also uses an improved loadable bit-cell for the swallow (S) counter, consumes 0.96 mW and 2.2 mW in the 2.4-GHz and 5-GHz bands, respectively, and provides a solution for low-power PLL synthesizers for Bluetooth, Zigbee, IEEE 802.15.4, and IEEE 802.11a/b/g WLAN applications with variable channel spacing.

CHAPTER 1 INTRODUCTION

1.1. GENERAL

Wireless LAN (WLAN) standards in the multigigahertz bands, such as HiperLAN II and IEEE 802.11a/b/g, are recognized as leading standards for high-rate data transmission, while standards like IEEE 802.15.4 are recognized for low-rate data transmission. The demand for lower-cost, lower-power, multiband RF circuits has increased along with the need for a higher level of integration. The frequency synthesizer, usually implemented as a phase-locked loop (PLL), is one of the most power-hungry blocks in the RF front end, and the first-stage frequency divider consumes a large portion of the synthesizer's power.

Integrated synthesizers reported for WLAN applications at 5 GHz consume up to 25 mW in CMOS realizations, where the first-stage divider is an injection-locked divider that occupies a large chip area and has a narrow locking range. The best published frequency synthesizer at 5 GHz consumes 9.7 mW at a 1-V supply; its complete divider consumes around 6 mW, and its first-stage divider is implemented in source-coupled logic (SCL), which allows higher operating frequencies but uses more power. Dynamic latches are faster and consume less power than static dividers. Another reported frequency synthesizer uses a prescaler as the first-stage divider, but its divider still consumes appreciable power. Most IEEE 802.11a/b/g frequency synthesizers employ SCL dividers as their first stage, while dynamic latches have not yet been adopted for multiband synthesizers. In this paper, a dynamic-logic multiband flexible integer-N divider based on the pulse-swallow topology is proposed; it uses a low-power wideband 2/3 prescaler and a wideband multimodulus 32/33/47/48 prescaler. The divider also uses an improved low-power loadable bit-cell for the swallow (S) counter.

The frequency synthesizer is one of the basic building blocks in modern communication systems.
The operating frequency of the frequency synthesizer is limited by the frequency divider and the voltage-controlled oscillator (VCO). Channel selection in the frequency synthesizer demands programmable division ratios from the frequency divider. The integer-N frequency synthesizer is more practical and less costly than the fractional-N frequency synthesizer, and has lower spurious sidebands. Its divider is usually formed by a prescaler, a program counter (P counter), and a swallow counter (S counter). Such a topology provides a programmable division ratio of N × P + S, where N, P, and S are the division ratios of the three blocks, respectively. The prescaler provides a dual modulus of N/(N + 1). The P counter provides a fixed division ratio chosen according to the required overall division ratio, while continuous division ratios from 3 to 2^n are achieved through the S counter by periodically reloading the divide-by-2 stages, where n is the number of stages of the S counter. The continuous division ratio is used to select the desired channel.

Much research has focused on raising the operating frequency of the prescaler. In modern communication systems, however, there is an increasing demand for multistandard applications, and wide-band, high-resolution operation remains a problem. To satisfy these requirements, different reference frequencies and different arrangements of the N, P, and S counters are selected for different applications; some reported designs, for example, cover only the UNII bands. In this paper, a new wide-band, high-resolution programmable frequency divider is proposed. The wide band and high resolution are obtained by using an all-stage programmable topology in both counters.

The high-speed frequency divider is a key block in frequency synthesis. The prescaler is the most challenging part of the high-speed frequency-divider design because it operates at the highest input frequency.
A dual-modulus prescaler usually consists of a divide-by-2/3 (or 4/5) unit followed by several asynchronous divide-by-2 units. Operating the divide-by-2/3 unit at the highest input frequency makes it the bottleneck of the prescaler design. To achieve the two division ratios, the unit uses D flip-flops (DFFs) and additional logic gates, which lower the operating frequency by introducing extra propagation delay. The power consumption of the divide-by-2/3 unit, which is the largest portion of the prescaler's total power, increases significantly because of these additional components.
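
The division ratio N × P + S quoted above follows from counting prescaler cycles. The sketch below is a behavioural illustration only, not the circuit described in this report. It assumes the usual pulse-swallow operation, in which the prescaler divides by N + 1 for the first S of its output cycles and by N for the remaining P − S cycles. The function names and the example values (N = 32, target ratio 2450) are hypothetical and chosen only for illustration.

```python
def pulse_swallow_ratio(n: int, p: int, s: int) -> int:
    """Count the input cycles consumed in one output period of a pulse-swallow divider."""
    if not 0 <= s <= p:
        raise ValueError("the swallow count S must satisfy 0 <= S <= P")
    input_cycles = 0
    for output_cycle in range(p):
        # Assumed behaviour: divide by N + 1 while the S counter is still counting,
        # then divide by N for the rest of the P-counter period.
        input_cycles += n + 1 if output_cycle < s else n
    return input_cycles


def solve_p_s(target_ratio: int, n: int) -> tuple[int, int]:
    """Split a target division ratio into (P, S) so that N * P + S equals the target."""
    return divmod(target_ratio, n)


# Hypothetical example: overall ratio 2450 with a 32/33 prescaler.
p, s = solve_p_s(2450, 32)
print(p, s, pulse_swallow_ratio(32, p, s))
```

Counting cycle by cycle gives (N + 1) × S + N × (P − S), which simplifies to N × P + S, so the simulated count and the closed-form expression agree whenever S does not exceed P.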

1.2. OBJECTIVE

1. A dynamic-logic multiband flexible integer-N divider based on the pulse-swallow topology is proposed, which uses a low-power wideband 2/3 prescaler and a wideband multimodulus 32/33/47/48 prescaler.
2. To achieve high-rate data transmissions.
3. To achieve low-rate data transmissions.

1.3 EXISTING SYSTEM

1. The demand for lower-cost, lower-power, multiband RF circuits has increased along with the need for a higher level of integration. The frequency synthesizer, usually implemented as a phase-locked loop (PLL), is one of the most power-hungry blocks in the RF front end, and the first-stage frequency divider consumes a large portion of the synthesizer's power.
2. Integrated synthesizers reported for WLAN applications at 5 GHz consume up to 25 mW in CMOS realizations, where the first-stage divider is an injection-locked divider that occupies a large chip area and has a narrow locking range.
3. A frequency synthesizer at 5 GHz consumes 9.7 mW at a 1-V supply; its complete divider consumes around 6 mW, and its first-stage divider is implemented in source-coupled logic (SCL), which allows higher operating frequencies but uses more power.

1.3.1 LITERATURE SURVEY

1.3.1.1 A 13.5-mW 5-GHz frequency synthesizer with dynamic-logic frequency divider- S.; Levantino, S.;Samori, C.;Lacaita, A.L.-Feb. 2004.

DESCRIPTION: Adopting dynamic dividers in CMOS phase-locked loops (PLLs) for multigigahertz applications substantially reduces power consumption without impairing the phase noise or the power-supply sensitivity of the PLL. A 5-GHz frequency synthesizer integrated in a 0.25-µm CMOS technology demonstrates a total power consumption of 13.5 mW. The frequency divider combines conventional and extended true-single-phase-clock (TSPC) logic. The oscillator employs a rail-to-rail topology to ensure proper divider operation. This PLL, intended for wireless LAN applications, can synthesize frequencies between 5.14 and 5.70 GHz in steps of 20 MHz. The low power consumption is achieved thanks to the dynamic TSPC divider, and this class of dividers is shown to be suitable for multigigahertz synthesizers, since it does not impair the power-supply rejection or the phase-noise performance.

Wireless LAN systems in the 5 to 6-GHz band, such as HiperLAN II and IEEE 802.11a, are recognized as the leading standards for high-rate data transmission. Because these systems are intended for mobile operation, the radio transceiver has a limited power budget. The frequency synthesizer, usually implemented as a PLL, is one of the most critical blocks in terms of average current dissipation, since it operates extensively during both reception and transmission. The best published integrated synthesizers around 5 GHz suitable for wireless LAN receivers consume up to 25 mW in both CMOS and bipolar realizations. Other synthesizers embedded in 802.11a-compliant transceivers can consume up to 200 mW.

DISADVANTAGES OF EXISTING: This high power consumption is mainly due to the first stages of the frequency divider, which often dissipate half of the total power. Because of the high input frequency, the first stage of the divider cannot be implemented in conventional static CMOS logic. Instead, it is commonly realized in source-coupled logic (SCL), which allows a higher operating frequency but burns more power.

ADVANTAGES OF PROPOSED: Dynamic latches are known to be faster and more compact than static ones. The true-single-phase-clock (TSPC) design allows the dynamic latch to be driven with a single clock phase, thus avoiding the skew problem. The use of dynamic logic is not only possible up to 6 GHz, but also extremely effective in reducing the synthesizer power dissipation.

1.3.1.2 Design and Optimization of the Extended True Single-Phase Clock-Based Prescaler - Xiao Peng Yu; Manh Anh Do; Wei Meng Lim; Kiat Seng Yeo; Jian-Guo Ma - Nov. 2006.

DESCRIPTION: The power consumption and operating frequency of the extended true single-phase clock (E-TSPC)-based frequency divider are investigated. The short-circuit power and the switching power in the E-TSPC-based divider are calculated and simulated. A low-power divide-by-2/3 unit of a prescaler is proposed and implemented in a CMOS technology; compared with the existing design, it achieves a 25% reduction in power consumption. A divide-by-8/9 dual-modulus prescaler built with this divide-by-2/3 unit in a 0.18-µm CMOS process is capable of operating up to 4 GHz with low power consumption. The prescaler is used in low-power high-resolution frequency dividers for wireless local area network applications.

The design and optimization of the high-speed E-TSPC-based prescaler are carried out by investigating the operating frequency and power consumption of the E-TSPC circuit. The new low-power divide-by-2/3 unit is suitable for high-speed CMOS prescaler design, and the divide-by-8/9 dual-modulus prescaler built with it achieves ultra-low power consumption. Dual-modulus operation above 4 GHz in a TSPC-based prescaler is achieved for the first time. The prescaler has been implemented in high-resolution frequency dividers and is suitable for wireless communication systems below 4 GHz. The operation of the proposed prescaler and frequency divider has also been silicon verified.

The high-speed frequency divider is a key block in frequency synthesis. The prescaler is the most challenging part of the high-speed frequency-divider design because it operates at the highest input frequency. A dual-modulus prescaler usually consists of a divide-by-2/3 (or 4/5) unit followed by several asynchronous divide-by-2 units. Operating the divide-by-2/3 unit at the highest input frequency makes it the bottleneck of the prescaler design.

DISADVANTAGES OF EXISTING:

The extended true single-phase clock (E-TSPC) logic is proposed to increase the operating frequency. However, this causes additional power consumption. In modern wireless communication systems, power consumption is a key consideration for longer battery life. The MOS current-mode logic (MCML) circuit, which has high power consumption, is commonly used to achieve a high operating frequency, while a true single-phase clock (TSPC) dynamic circuit, which consumes power only during switching, has a lower operating frequency.

ADVANTAGES OF PROPOSED:

In this paper, the power consumption and operating frequency of the E-TSPC logic style are evaluated. The two major sources of power consumption in the E-TSPC divide-by-2 unit, namely the short-circuit power and the switching power, are calculated and simulated. Based on this analysis, a new divide-by-2/3 unit is proposed that achieves low power consumption by reducing the switching activity and the short-circuit current in the DFFs of the unit, and a dual-modulus prescaler implemented with the unit is proposed.

1.3.1.3 A Dynamic-Logic Frequency Divider for 5-GHz WLAN Frequency Synthesizer- Yue-Fang Kuo and Ro-Min Weng- National Dong Hwa University Hualien, Taiwan, Republic of China.

DESCRIPTION: A dynamic-logic frequency divider for a fully integrated CMOS frequency synthesizer is presented in this paper. The divider, based on a dual-modulus prescaler and dynamic logic circuits, is designed to reduce power consumption, transistor count, and chip area. Simulation results show that the proposed circuit covers the operating frequency band from 5.15 GHz to 5.825 GHz for wireless local area network applications. A simple architecture for the dynamic-logic frequency divider is demonstrated in a standard 0.18-µm CMOS technology. The frequency divider is designed without counters, and the simulation results show its advantages in power consumption and chip area. The proposed frequency divider operates from 5.15 GHz to 5.825 GHz in steps of 20 MHz, which covers 15 channels in WLAN applications. IEEE 802.11a and HiperLAN are wireless data network standards operating in the 5 to 6-GHz band, which covers fifteen channels with a channel spacing of 20 MHz. Frequency synthesizers are widely used to generate local oscillator (LO) signals in modern communication systems. To cover the required carriers and operate from an input frequency of 5 GHz, the division ratio of the divider has to be programmable from 257 to 294. The operating frequency of a frequency synthesizer is limited by the frequency divider as well as the voltage-controlled oscillator (VCO).
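
The quoted division range can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that the reference frequency equals the 20 MHz channel step; the surveyed paper's actual reference frequency is not stated here, and the names used are hypothetical.

```python
F_REF_MHZ = 20  # assumption: reference frequency equal to the 20 MHz channel step


def output_frequency_mhz(division_ratio: int, f_ref_mhz: int = F_REF_MHZ) -> int:
    """Output of a locked integer-N PLL: f_out = division ratio * f_ref."""
    return division_ratio * f_ref_mhz


lowest = output_frequency_mhz(257)
highest = output_frequency_mhz(294)
print(lowest, highest)  # 5140 5880
```

Under that assumption, ratios 257 to 294 span 5.14 GHz to 5.88 GHz, which encloses the 5.15 GHz to 5.825 GHz band mentioned in the description.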

DISADVANTAGES IN EXISTING: For the WLAN standards, most common high-speed frequency dividers are based on a pulse-swallow architecture. This architecture requires two additional counters to generate the desired division ratio, which increases the gate count, occupies a large chip area, and consumes extra power.

The high power consumption is mainly due to the first stages of the frequency divider, which often consume half of the total power. The first stage of the divider cannot be implemented in a dynamic TSPC circuit.

ADVANTAGES OF PROPOSED:

This paper proposes a new frequency divider that keeps the same function as a conventional one without employing a swallow counter, which would consume extra power and unnecessary chip area. The proposed topology is based on a counter-less, dual-modulus counter detector.

1.3.1.4 Design of a low power wideband high resolution programmable frequency divider- X. P. Yu et al.- Sep. -2005

DESCRIPTION: The design of a high-speed, wide-band, high-resolution programmable frequency divider is investigated. A new reloadable D flip-flop for the high-speed programmable frequency divider is proposed. It is optimized in terms of propagation delay and power consumption compared with existing designs. Measurement results show that an all-stage programmable counter implemented with this D flip-flop in the Chartered 0.18-µm CMOS process is capable of operating up to 1.8 GHz at a 1.8-V supply voltage with a power consumption of 5.8 mW. Using this counter, an ultra-wide-range, high-resolution frequency divider is achieved with low power consumption for 5 to 6-GHz wireless LAN applications. The design difficulties of the wide-band, high-resolution programmable frequency divider for multistandard applications are investigated, and a high-speed, low-power counter is successfully implemented for multistandard operation. Measurement results show that the first GHz all-stage programmable divider with low power consumption is achievable with the proposed bit-cell.

DRAWBACKS OF EXISTING: The frequency synthesizer is one of the basic building blocks in modern communication systems. The operating frequency of the frequency synthesizer is limited by the frequency divider and the voltage-controlled oscillator. The function of channel selection in the frequency synthesizer demands programmable division ratios for the frequency divider. Much research has been focused on the prescaler design for its highest operating frequency. However, in the modern communication system, there is an increasing demand for multi-standards applications.

ADVANTAGES OF PROPOSED:

A new wide-band high resolution programmable frequency divider is proposed. The wide band and high resolution are obtained by using the all-stage programmable topology in both counters.

1.3.1.5 4.2 mW frequency synthesizer for 2.4 GHz ZigBee application with fast settling time performance- S. Shin et al.,- Jun. 2006 .

DESCRIPTION: A new frequency synthesizer with low power and short settling time is introduced. With two-point channel control for an integer-N PLL, a near-zero settling time is achieved for any frequency change in the 2.4-GHz ZigBee band. By using a vertical-NPN parasitic transistor for the VCO biasing, the close-in phase noise is improved by 5 dB compared with MOS biasing. A modified-TSPC topology is proposed for low-voltage frequency divider circuits. With a 1.2-V supply voltage in 0.18-µm CMOS, the power consumption is only 4.2 mW. In the two-point channel control scheme of the proposed frequency synthesizer, a DAC with tunable gain and a linearized VCO are used to effectively compensate the gain mismatch between the two control paths. Despite the use of an integer-N architecture with a narrow 20-kHz bandwidth, near-zero frequency settling time, within the accuracy of the measurement equipment, is achieved for a 75-MHz frequency jump from 2.4 GHz. The battery life for mobile applications is inversely proportional to the energy consumption of the mobile device. It is therefore important to minimize the energy consumption by minimizing both the active duty cycle and the active power consumption of a wireless terminal concurrently.

DISADVANTAGES OF EXISTING:

The frequency settling time of a PLL decreases as the loop bandwidth increases. However, a fractional-N PLL, which is favored for its wider loop bandwidth, requires a high-frequency fractional controller and thus increases the hardware complexity and active power consumption.

ADVANTAGES OF PROPOSED:

In this paper, a new frequency synthesizer with very short frequency settling time and low active power consumption is introduced. For short frequency settling time, a two-point channel control scheme composed of a direct-VCO control (compensation-path) and a divider control (main-path) is used.

1.4 PROPOSED SYSTEM

The proposed wideband multimodulus prescaler can divide the input frequency by 32, 33, 47, and 48. It is similar to the 32/33 prescaler, but with an additional inverter and a multiplexer. The proposed prescaler performs the additional divisions (divide-by-47 and divide-by-48) without any extra flip-flop, thus saving a considerable amount of power and also reducing the complexity of the multiband divider, which will be discussed in Section V. The multimodulus prescaler consists of the wideband 2/3 (N1/(N1+1)) prescaler, four asynchronous TSPC divide-by-2 circuits (AD = 16), and combinational logic circuits to achieve multiple division ratios. Besides the usual MOD signal for controlling the N/(N+1) divisions, an additional control signal, sel, is used to switch the prescaler between the 32/33 and 47/48 modes.
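
The four moduli follow from how many of the AD = 16 cycles the 2/3 prescaler spends dividing by 2 and by 3. The sketch below is a behavioural illustration only, not the gate-level circuit. The polarity of the `sel` and `mod` arguments (which logic level selects which ratio) is an assumption made for the example, as is the function name.

```python
def multimodulus_ratio(sel: int, mod: int, ad: int = 16, n1: int = 2) -> int:
    """Overall division ratio of a 2/3 prescaler followed by a divide-by-AD chain.

    Assumed polarity (hypothetical): sel = 0 selects the 32/33 mode and
    sel = 1 the 47/48 mode; mod = 1 keeps the 2/3 prescaler in a single
    mode for all AD cycles, mod = 0 switches it for exactly one cycle.
    """
    switched_cycles = 0 if mod == 1 else 1
    if sel == 0:
        # Mostly divide-by-2; one optional divide-by-3 cycle adds one input pulse.
        return (ad - switched_cycles) * n1 + switched_cycles * (n1 + 1)
    # Mostly divide-by-3; one optional divide-by-2 cycle removes one input pulse.
    return (ad - switched_cycles) * (n1 + 1) + switched_cycles * n1


ratios = sorted(multimodulus_ratio(sel, mod) for sel in (0, 1) for mod in (0, 1))
print(ratios)  # [32, 33, 47, 48]
```

The arithmetic shows why no extra flip-flop is needed: 16 × 2 = 32, 15 × 2 + 3 = 33, 15 × 3 + 2 = 47, and 16 × 3 = 48 all use the same 2/3 prescaler and the same four divide-by-2 stages.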

ADVANTAGES OF PROPOSED SYSTEM

1. Best performance on power consumption.
2. Efficient architecture in silicon verification.

CHAPTER 2 A LOW-POWER SINGLE-PHASE CLOCK MULTIBAND FLEXIBLE DIVIDER

2.1 GENERAL

Wireless LAN (WLAN) standards in the multigigahertz bands, such as HiperLAN II and IEEE 802.11a/b/g, are recognized as leading standards for high-rate data transmission, while standards like IEEE 802.15.4 are recognized for low-rate data transmission. The demand for lower-cost, lower-power, multiband RF circuits has increased along with the need for a higher level of integration. The frequency synthesizer, usually implemented as a phase-locked loop (PLL), is one of the most power-hungry blocks in the RF front end, and the first-stage frequency divider consumes a large portion of the synthesizer's power.

The battery life for mobile applications is inversely proportional to the energy consumption of the mobile device. It is therefore important to minimize the energy consumption by minimizing both the active duty cycle and the active power consumption of a wireless terminal concurrently. The active duty cycle of a ZigBee wireless node depends strongly on the frequency settling time of the PLL, since the settling time is a dominant portion of the total active period. Generally speaking, the frequency settling time of a PLL decreases as the loop bandwidth increases. Since a fractional-N PLL with a higher reference frequency can achieve a wider loop bandwidth, it has been favored for a small active duty cycle. However, it requires a high-frequency fractional controller such as a ΣΔ modulator, and thus increases the hardware complexity and active power consumption. In this paper, a new frequency synthesizer with very short frequency settling time and low active power consumption is introduced. For short frequency settling time, a two-point channel control scheme composed of a direct VCO control (compensation path) and a divider control (main path) is used. A tunable-gain DAC and linearized varactors are used in the compensation path to effectively compensate the gain mismatch between the two control paths.
For low active power consumption, a 1.2-V supply voltage is used instead of the typical 1.8 V for 0.18-µm CMOS technology. This is made possible by adopting a modified-TSPC circuit with two-transistor stacks for the high-frequency divider circuits, together with other low-voltage circuit techniques. A parasitic vertical-NPN (VNPN) transistor, which is available in a conventional triple-well CMOS process, is also used as the VCO current-source transistor to improve the close-in phase-noise performance.

As the feature size of MOSFETs continues to shrink, a proportional downscaling of the supply voltage is mandatory to maintain gate-oxide reliability. However, in consideration of the subthreshold leakage and the noise margin required by digital integrated circuits, the threshold voltage scales relatively slowly compared with the supply voltage. Consequently, the overdrive voltage of the transistors progressively decreases as the technology advances. It has become an inevitable trend to operate MOS devices in moderate or weak inversion for certain mixed-signal and RF integrated circuits, motivating the development of low-voltage design techniques specifically for deep-submicrometer CMOS technologies. In an RF receiver front end, the low-noise amplifier (LNA) and the down-conversion mixer are considered the most important building blocks. Typically, these circuits suffer significant degradation in their RF properties, especially gain, noise figure, and linearity, when the transistors operate in weak inversion. To overcome the limitations on supply voltage and transistor overdrive, a complementary current-reused topology has been proposed for RF front-end circuits. Using a standard 0.18-µm CMOS process, an ultra-low-voltage LNA and mixer suitable for operation with microwatt power consumption are realized in the 5-GHz frequency band.
The behavior of MOSFETs biased at a reduced overdrive voltage and its impact on the circuit performance of RF front ends are also investigated. Although the fabricated circuits are not targeted at a specific wireless standard, the developed techniques and design guidelines can easily be applied to various short-range wireless applications such as ZigBee, Bluetooth, wideband personal area networks (WPAN), and wireless sensor networks.

The design choices considered next were adopted for a remote-sensing microsystem operating in the 433-MHz European ISM (Industrial, Scientific, Medical) band. However, the same tradeoffs are encountered in the design of most short-range, battery-powered transceivers operating in the UHF range, so most of the proposed choices should be applicable to all such systems. Potential applications range broadly, from computer peripherals and home automation to surveillance systems. These high-volume consumer products all share the need for a small, low-cost, ultralow-power transceiver. The first part of the paper discusses the specifications of such a system and briefly explains the architectural choices, together with the related tradeoffs. The second part briefly describes the practical realization of some key blocks. The discussion is accompanied by simulated and measured results, as functional versions of most blocks have already been realized and tested in a standard 0.5-µm CMOS digital process. Finally, the third part summarizes the expected performance of the whole system. Although some specifications may vary from application to application, some generic objectives can still be stated. The total volume of a network node, including battery and antenna, has to be on the order of 10 cm³ (1 in³). This is mostly a limitation on the battery capacity and a challenge for the antenna. The tradeoff between antenna efficiency and power consumption leads to the choice of a carrier frequency in the UHF range.
ISM bands are usually preferred because they are not bound to a particular standard, giving more freedom for implementing power-saving strategies. The frequencies of interest are therefore 433 MHz (Europe) and 917 MHz (USA). To reduce power consumption, avoid DC-DC converters, and be ready for next-generation deep-submicron processes with limited supply voltage, single-battery operation is desirable. Ultimately, therefore, the circuits should be able to operate with a supply as low as 1 V, corresponding to the battery end-of-life voltage. As each node should exhibit some intelligence and be able to interact with its environment, cost and size requirements favor a system-on-a-chip approach, leading to the choice of a standard CMOS process (with on-chip microcontroller and interfaces). To ease process migration and keep costs down, a fully digital technology is preferred. From our experience, a 0.5-µm process is sufficient for 433-MHz operation, whereas a 0.35-µm process would be more appropriate in the 917-MHz band. Cost reduction and reliability also motivate minimizing the number of external components.

The data rate can be relatively low, in the 1 to 100 kbit/s range, half duplex, because each node should mostly receive data requests and transmit measurements of some slowly varying physical quantities. For some applications, such as computer interfaces, the availability of several channels (2 to 4) is desirable. The raw BER (bit error rate, before error correction) that is usually acceptable corresponds to a minimum SNR (signal-to-noise ratio) at the demodulator of 12 dB. The maximum radiated power is limited to the -2 to 10 dBm range by the ISM standards and by the battery internal resistance. As wave propagation in buildings can vary widely according to the configuration, the maximum distance between two network nodes is specified in free range (i.e., without any obstacles) and usually lies between 10 and 100 m.
For most sensing applications, nodes spend far more time receiving (monitoring a potential incoming call) than emitting (answering a request). Therefore, even though the power level required in transmit mode is an order of magnitude higher, the receiver power consumption is the most critical. In our case, the target was no higher than 1.2 mW (1 mA, 1.2-V supply) in order to obtain a sufficient battery lifetime.

In frequency synthesizers, high power consumption is mainly due to the first stages of the frequency divider, which often dissipate half of the total power. Because of the high input frequency, the first stage of the divider cannot be implemented in conventional static CMOS logic. Instead, it is commonly realized in source-coupled logic (SCL), which allows a higher operating frequency but burns more power. A more efficient alternative to the first SCL divider is the injection-locked divider. However, this resonant /5 divider requires a tank whose area is larger than the oscillator's tank, and it suffers from pulling phenomena. Dynamic latches are known to be faster and more compact than static ones. The true-single-phase-clock (TSPC) design allows the dynamic latch to be driven with a single clock phase, thus avoiding the skew problem. However, the adoption of this class of latches in frequency dividers has so far been limited to PLLs up to 900 MHz. In this brief, a TSPC divider is employed within a PLL integrated in a 0.25-µm CMOS technology. The use of dynamic logic is not only possible up to 6 GHz, but also extremely effective in reducing the synthesizer power dissipation. This divider implementation causes no appreciable degradation of the PLL spur performance and no additional phase noise.

We report results on a 0.5-V RF front end operating at 900 MHz, for zero-IF or near-zero-IF receivers. This work is a continuation of earlier work at IF frequencies.
A 0.5-V front end has been presented before using fully depleted CMOS/SIMOX SOI technology; our circuits instead use a standard mixed-mode 0.18-µm CMOS technology. LNA design in standard CMOS, operating at 0.5 V, has been reported recently. The chief limitation in 0.5-V design is the reduced voltage headroom due to the substantial gate voltage overdrive needed to maintain device speed. The threshold voltage, VT, is anticipated to stay at about 200 mV to reduce OFF currents in nanoscale CMOS technologies. We used the low-VT transistors in the 0.18-µm process, which have a VT close to that value.

The growing number of users and the demand for high-speed wireless communications have motivated designers to move from the 1 to 2-GHz range towards higher frequency bands. Recently, new standards in the 5-GHz range for wireless local area network (WLAN) applications have been defined, such as the IEEE 802.11a standard for the FCC unlicensed national information infrastructure (U-NII) band in the US, and the high performance radio LAN (HIPERLAN) standard in Europe. Traditionally, radio frequency integrated circuits (RFICs) were implemented in GaAs or SiGe bipolar technologies because of their relatively high unity-gain cutoff frequencies fT (i.e., >65 GHz) and their superior noise performance. However, as the minimum feature size of CMOS devices decreases, the fT of the transistors continues to improve to the point where it is becoming comparable to that of GaAs and SiGe processes. Deep-submicron CMOS devices with fT values exceeding 100 GHz and minimum noise figures (NF) of less than 0.5 dB at 2 GHz have been demonstrated.
Because of this promising RF performance, together with the advantages of low cost and ease of integration with baseband digital circuitry, CMOS is becoming a viable alternative for RF applications, with continuous efforts towards implementing higher-frequency circuitry operating from lower supply voltages.

IEEE 802.11a and HiperLAN are wireless data network standards operating in the 5 to 6-GHz band, which covers fifteen channels with a channel spacing of 20 MHz. Frequency synthesizers are widely used to generate local oscillator (LO) signals in modern communication systems. To cover the required carriers and operate from an input frequency of 5 GHz, the division ratio of the divider has to be programmable from 257 to 294. The operating frequency of a frequency synthesizer is limited by the frequency divider as well as the voltage-controlled oscillator (VCO). For the WLAN standards, most common high-speed frequency dividers are based on a pulse-swallow architecture, which usually comprises a dual-modulus prescaler (DMP), a 6-bit pulse (P) counter, and a 5-bit swallow (S) counter. The two counters generate the given division ratio of the divider and the divide-by value of the DMP. To keep up with the high input frequency and reduce power consumption, the DMP is a trade-off between speed and divide-by value. On the other hand, this architecture requires two additional counters to generate the desired division ratio, which increases the gate count, occupies a large chip area, and consumes extra power. This paper proposes a new frequency divider that keeps the same function as a conventional one without employing a swallow counter, which would consume extra power and unnecessary chip area.

A phase-locked loop (PLL) is an electronic feedback system that generates a signal whose phase is locked to the phase of an input reference signal. PLLs are widely used in radio, telecommunication, computer, and other electronic systems.
Frequency synthesizers are one of the most critical parts in a wireless transceiver. Most frequency synthesizers are of the phase-locked loop (PLL) type. In the frequency synthesizer, only the VCO and prescaler operate at the highest frequency. It is relatively easy to design a multi-gigahertz voltage controlled oscillator (VCO) in the current advanced CMOS process, so the dual modulus prescaler remains the most critical component with the trend of applying CMOS in higher frequency systems. Operating at the highest frequency, the dual modulus prescaler divides the output frequency of the VCO by one of the two fixed division ratios to a lower frequency. This division ratio is controlled by a modulus control signal generated by the programmable counter. Besides the high operating speed, the dual modulus prescaler also has to be efficient in terms of current consumption, and be able to work at a low supply voltage. In this paper, we present a 16/17 dual-modulus prescaler operating up to 3.1 GHz realized in a 0.35 µm CMOS technology with a 1.8 V supply voltage. By merging NAND logic with the D flip-flop, we eliminated their propagation delay. Also, merging the AND logic that counts the asynchronous output into the flip-flop with the NAND logic allowed accurate pulse swallowing by eliminating the delay that was found in the AND logic.

2.2 METHODOLOGIES

The key parameters of high-speed digital circuits are the propagation delay and power consumption. The maximum operating frequency of a digital circuit is calculated as

$$F_{max} = \frac{1}{t_{pLH} + t_{pHL}} \qquad (1)$$

where tpLH and tpHL are the propagation delays of the low-to-high and high-to-low transitions of the gates, respectively. The total power consumption of CMOS digital circuits is determined by the switching and short-circuit power. The switching power is linearly proportional to the operating frequency and is given by the sum of the switching power at each output node as in

$$P_{switching} = \sum_{i=1}^{n} F_{clk} \, C_{Li} \, V_{dd}^{2} \qquad (2)$$

where n is the number of switching nodes, Fclk is the clock frequency, CLi is the load capacitance at the output node of the ith stage, and Vdd is the supply voltage. Normally, the short-circuit power occurs in dynamic circuits when there exist direct paths from the supply to ground, and is given by

$$P_{sc} = I_{sc} \cdot V_{dd} \qquad (3)$$

where Isc is the short-circuit current. The analysis shows that the short-circuit power is much higher in E-TSPC logic circuits than in TSPC logic circuits. However, TSPC logic circuits exhibit higher switching power compared to that of E-TSPC logic circuits due to high load capacitance. For the E-TSPC logic circuit, the short-circuit power is the major problem. The E-TSPC circuit has the merit of a higher operating frequency than that of the TSPC circuit due to the reduction in load capacitance, but it consumes significantly more power than the TSPC circuit does for a given transistor size. The following analysis is based on the latest design using the popular and low-cost 0.18 µm CMOS process.
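
The relations for maximum frequency, switching power, and short-circuit power can be tried out numerically. The following is only a minimal sketch: the function names and all component values are illustrative assumptions and are not taken from the design, and the switching power is assumed to have the usual form of Fclk times CLi times Vdd squared, summed over the switching nodes.

```python
def max_frequency(tp_lh, tp_hl):
    """Maximum operating frequency from the two propagation delays (seconds)."""
    return 1.0 / (tp_lh + tp_hl)


def switching_power(f_clk, load_caps, vdd):
    """Sum of f_clk * C_Li * Vdd^2 over all switching nodes (assumed form)."""
    return sum(f_clk * c_li * vdd ** 2 for c_li in load_caps)


def short_circuit_power(i_sc, vdd):
    """Short-circuit power: short-circuit current times supply voltage."""
    return i_sc * vdd


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formulas.
    f_max = max_frequency(80e-12, 80e-12)
    p_sw = switching_power(5e9, [10e-15, 12e-15, 8e-15], 1.8)
    p_sc = short_circuit_power(50e-6, 1.8)
    print(f"Fmax = {f_max / 1e9:.2f} GHz")
    print(f"Pswitching = {p_sw * 1e3:.3f} mW, Psc = {p_sc * 1e3:.3f} mW")
```

Because the switching term scales with the load capacitance and the short-circuit term with the short-circuit current, the same functions can be used to compare a high-capacitance, low-Isc stage (TSPC-like) against a low-capacitance, high-Isc stage (E-TSPC-like).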

MAIN MODULES:

WIDEBAND 2/3 PRESCALER
MULTIMODULUS 32/33/47/48 PRESCALER
SWALLOW P-COUNTER
SWALLOW S-COUNTER
WIDEBAND 4/5 PRESCALER
MULTIMODULUS 32/33/47/48 PRESCALER

2.4 MODULE DESCRIPTION:

2.4.1 WIDEBAND 2/3 PRESCALER

A previously reported E-TSPC 2/3 prescaler consumes large short-circuit power, although it has a higher frequency of operation than the TSPC 2/3 prescaler. The wideband single-phase clock 2/3 prescaler used in this design was reported previously and consists of two D flip-flops and two NOR gates embedded in the flip-flops, as in Fig. 2. The first NOR gate is embedded in the last stage of DFF1, and the second NOR gate is embedded in the first stage of DFF2. Here, the transistors M2, M25, M4, and M8 in DFF1 help to eliminate the short-circuit power during the divide-by-2 operation. The switching of division ratios between 2 and 3 is controlled by the logic signal MC.

WIDEBAND SINGLE-PHASE CLOCK 2/3 PRESCALER

When MC switches from 0 to 1, transistors M2, M4 and M8 in DFF1 turn off and nodes S1, S2 and S3 switch to logic 0. Since node S3 is 0 and the other input to the NOR gate embedded in DFF2 is Qb, the wideband prescaler operates in the divide-by-2 mode. During this mode, nodes S1, S2 and S3 remain at logic 0 for the entire divide-by-2 operation, thus removing the switching power contribution of DFF1. Since one of the transistors is always OFF in each stage of DFF1, the short-circuit power in DFF1 and the first stage of DFF2 is negligible. The total power consumption of the prescaler in the divide-by-2 mode is equal to the switching power in DFF2 plus the short-circuit power in the second and third stages of DFF2.

$$P_{divide-by-2} = \sum_{i \in DFF2} F_{clk} \, C_{Li} \, V_{dd}^{2} + P_{sc1} + P_{sc2}$$

where CLi is the load capacitance at the output node of the ith stage of DFF2, and Psc1 and Psc2 are the short-circuit power in the second and third stages of DFF2. When the logic signal MC switches from 1 to 0, the logic value at the input of DFF1 is transferred to the input of DFF2, since one of the inputs of the NOR gate embedded in DFF1 is 0, and the wideband prescaler operates in the divide-by-3 mode. During the divide-by-2 operation, only DFF2 actively participates in the operation and contributes to the total power consumption, since all the switching activities are blocked in DFF1. Thus, the wideband 2/3 prescaler has the benefit of saving more than 50% of the power during the divide-by-2 operation.

2.4.2 MULTIMODULUS 32/33/47/48 PRESCALER

The proposed wideband multimodulus prescaler, which can divide the input frequency by 32, 33, 47, and 48, is shown in Fig. 4. It is similar to a previously reported 32/33 prescaler, but with an additional inverter and a multiplexer. The proposed prescaler performs additional divisions (divide-by-47 and divide-by-48) without any extra flip-flop, thus saving a considerable amount of power and also reducing the complexity of the multiband divider, which will be discussed in Section V. The multimodulus prescaler consists of the wideband 2/3 (N1/(N1+1)) prescaler [10], four asynchronous TSPC divide-by-2 circuits (AD = 16), and combinational logic circuits to achieve multiple division ratios. Besides the usual MOD signal for controlling the N/(N+1) divisions, the additional control signal Sel is used to switch the prescaler between the 32/33 and 47/48 modes.

PROPOSED MULTIMODULUS 32/33/47/48 PRESCALER

Case 1: Sel = 0

When Sel = 0, the output from the NAND2 gate is directly transferred to the input of the 2/3 prescaler and the multimodulus prescaler operates as the normal 32/33 prescaler, where the division ratio is controlled by the logic signal MOD. If MC = 1, the 2/3 prescaler operates in the divide-by-2 mode, and when MC = 0, the 2/3 prescaler operates in the divide-by-3 mode. If MOD = 1, the NAND2 gate output switches to logic 1 (MC = 1) and the wideband prescaler operates in the divide-by-2 mode for the entire operation. The division ratio N performed by the multimodulus prescaler is

$$N = (AD \times N1) + (0 \times (N1 + 1)) = 32 \qquad (4)$$

where N1 = 2 and AD = 16 are fixed for the entire design. If MOD = 0, MC remains at logic 1 for 30 input clock cycles, where the wideband prescaler operates in the divide-by-2 mode, and MC remains at logic 0 for three input clock cycles, where the wideband prescaler operates in the divide-by-3 mode. The division ratio N+1 performed by the multimodulus prescaler is

$$N + 1 = ((AD - 1) \times N1) + (1 \times (N1 + 1)) = 33 \qquad (5)$$

Case 2: Sel = 1

When Sel = 1, the inverted output of the NAND2 gate is directly transferred to the input of the 2/3 prescaler and the multimodulus prescaler operates as a 47/48 prescaler, where the division ratio is controlled by the logic signal MOD. If MC = 1, the 2/3 prescaler operates in the divide-by-3 mode, and when MC = 0, the 2/3 prescaler operates in the divide-by-2 mode, which is the opposite of the operation performed when Sel = 0. If MOD = 1, the division ratio N+1 performed by the multimodulus prescaler is the same as in (4), except that the wideband prescaler operates in the divide-by-3 mode for the entire operation, and is given by

$$N + 1 = (AD \times (N1 + 1)) + (0 \times N1) = 48 \qquad (6)$$

If MOD = 0, the division ratio N performed by the multimodulus prescaler is

$$N = ((AD - 1) \times (N1 + 1)) + (1 \times N1) = 47 \qquad (7)$$
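
The four division ratios can be checked with a few lines of arithmetic. The sketch below is a behavioral calculation only, not a model of the circuit; the function name is hypothetical, and it assumes that MOD = 1 selects the 32 and 48 ratios while MOD = 0 selects the 33 and 47 ratios.

```python
AD = 16  # ratio of the four asynchronous divide-by-2 stages (2**4)
N1 = 2   # lower division ratio of the wideband 2/3 prescaler


def division_ratio(sel, mod):
    """Division ratio of the multimodulus prescaler for given Sel and MOD."""
    if sel == 0:
        if mod == 1:
            # 2/3 prescaler divides by 2 for all AD of its output cycles.
            return AD * N1 + 0 * (N1 + 1)
        # One divide-by-3 cycle replaces one divide-by-2 cycle.
        return (AD - 1) * N1 + 1 * (N1 + 1)
    if mod == 1:
        # 2/3 prescaler divides by 3 for all AD of its output cycles.
        return AD * (N1 + 1) + 0 * N1
    # One divide-by-2 cycle replaces one divide-by-3 cycle.
    return (AD - 1) * (N1 + 1) + 1 * N1


if __name__ == "__main__":
    for sel in (0, 1):
        for mod in (1, 0):
            print(f"Sel={sel} MOD={mod} -> divide by {division_ratio(sel, mod)}")
```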

2.4.3 PROGRAM COUNTER

The program counter is responsible for counting P pulses of SlowCLK before outputting a pulse to the phase/frequency detector and resetting itself and the swallow counter. The implementation used in this project, using a 7-bit ripple counter, a 7-bit comparator, and a zero-detector is shown in Figure 12. The ripple counter is clocked by SlowCLK, and increments its count by one each clock cycle. At each stage, the 7-bit comparator compares each count bit to the corresponding bit in the control signal, and outputs a 0 for each equal bit. When the zero-detector detects equivalence in all of the 7 bits, indicating that the desired count has been reached, Fout is driven high. On the next clock cycle, the program counter is reset to zero and the count is restarted. In addition, the output pulse on Fout is used to reset the count of the swallow counter, indicating the end of one complete cycle of the frequency divider.

BLOCK DIAGRAM OF A 7-BIT PROGRAM COUNTER

The ripple counter is implemented using 7 cascaded D-type flip-flops, each arranged in a toggle configuration. The output of each flip-flop is used to clock the next flip-flop. Since the output of each flip-flop inverts on every clock cycle, each flip-flop essentially divides its clock by two, causing the next stage of the ripple counter to be clocked at half the rate of the previous flip-flop. Each flip-flop was designed to respond to the falling edge of its clock, when the output of the previous stage changes from a 1 to a 0. In this way, an incrementing binary count is achieved, with the outputs of the flip-flops forming the bits of the count. Since the program counter contains 7 bits, any count between 0 and 127 can be set by the control signal. It is important to realize, however, that in order to achieve a division ratio as specified in the equation DIV = NP + S, the control signal must be set to P-1, since the zero-state is included in the count.
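
The reason for programming P-1 rather than P can be seen in a small behavioral model. This is a sketch under simplifying assumptions: it counts clock cycles per output period and ignores the gate-level timing of the ripple counter, and the function name is hypothetical.

```python
def program_counter_period(control, bits=7):
    """Number of SlowCLK cycles in one period of the program counter.

    The counter starts in the zero state, increments once per clock,
    and raises Fout when the count equals the control word; the next
    clock resets it to zero.
    """
    if not 0 <= control < 2 ** bits:
        raise ValueError("control word does not fit in the counter")
    count = 0
    cycles = 0
    while True:
        cycles += 1  # one SlowCLK cycle is spent in the current state
        if count == control:
            return cycles  # comparator match: Fout pulse, then reset
        count = (count + 1) % (2 ** bits)


if __name__ == "__main__":
    p = 10
    # Programming P-1 yields a period of exactly P clock cycles.
    print(program_counter_period(p - 1))
```

With the control word set to P-1, the counter passes through the states 0 to P-1, which is P states in total.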

2.4.3.1 P-COUNTER IMPLEMENTATION

The three major components of the program counter are implemented using MCML logic gates. At the input of the counter, an array of 7 flip-flops is used as the ripple counter. The outputs of the ripple counter, taken from the outputs of each of the flip-flops, are fed into an array of 7 XNOR gates. The XNOR gates compare each bit with the corresponding bit in the control signal, outputting a logical 1 when the bits are equal. Although this logic is inverted compared to the description of the comparator in the previous section, the zero-detector is implemented as a one-detector using a tree of cascaded AND gates. In this way, the overall logic of the circuit is unchanged, and the output pulse can be generated without any additional logic.

Another difference seen in Figure 13 is a separate output, SwallowRST, and some simple circuitry used to generate it. SwallowRST is used internally to reset the flip-flops of the program counter, and externally to reset the flip-flops of the swallow counter. Since the fan-out of the reset signal is high (7 flip-flops in the PC, and 6 in the SC), the reset signal is broken into two paths and driven using separate MCML buffers. In early simulations, these buffers were absent and the reset signal could not provide enough current to drive the input capacitance associated with the flip-flops. SwallowRST was generated using an approach that guarantees predictable timing of the reset signal. Fout is tapped and fed to the input of a flip-flop clocked by Fin. On the clock cycle immediately following Fout going high, the pulse is sampled by the flip-flop, generating SwallowRST and resetting both the program counter and the swallow counter. To ensure that the reset signal is removed before the next clock cycle, the reset signal is fed back to its generating flip-flop through a delay chain comprised of three buffers.

2.4.4 SWALLOW COUNTER

The swallow counter, as indicated in Figure 8, is used to count S pulses of SlowCLK before asserting the modulus control signal and changing the modulus of the DMP to N. A block diagram of the swallow counter is provided in Figure 15. Looking at Figure 15, the similarities between the swallow counter and the program counter are apparent. Once again, the count (6 bits in this case) is maintained using a ripple counter comprised of cascaded flip-flops clocked with SlowCLK. In addition, a comparator compares each count bit with its corresponding bit in the control signal, and a zero-detector asserts modulus control when all bits are equal. However, the swallow counter does not reset when the count is reached, but masks the input clock using an AND gate connected to the inverse of modulus control. As a result, the ripple counter stops counting when the count is reached, and the state of the circuit is maintained until a reset signal (SwallowRST) is received from the program counter. Since the swallow counter contains 6 bits, it is capable of any count from 0 to 63. Once again, the control signal must be set to S-1, since the zero-state is included in the count.

BLOCK DIAGRAM OF A 6-BIT SWALLOW COUNTER

2.4.4.1 S-COUNTER IMPLEMENTATION

The 6-bit ripple counter is implemented as an array of flip-flops and clocked with the gated clock provided by the AND of SlowCLK and the inverse of modulus control. In addition, the comparator is implemented as an array of MCML XNOR gates, while the zero-detector is actually implemented as a one-detector using a tree of cascaded AND gates. Unlike the program counter, however, no additional circuitry is necessary to generate the reset, as the reset is received from the program counter by means of the SwallowRST signal.
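
How the program counter, swallow counter, and dual modulus prescaler combine to give DIV = NP + S can be illustrated with a short behavioral model. It is a sketch only: it counts input clock cycles per output period, ignores reset timing and the P-1 and S-1 control-word offsets, and its function name and example values are assumptions.

```python
def pulse_swallow_input_cycles(n, p, s):
    """Input clock cycles in one output period of a pulse-swallow divider.

    n: lower modulus of the dual modulus prescaler (divides by n or n + 1)
    p: number of SlowCLK pulses counted by the program counter
    s: number of SlowCLK pulses counted by the swallow counter (s <= p)
    """
    if not 0 <= s <= p:
        raise ValueError("the swallow count must not exceed the program count")
    total = 0
    swallowed = 0
    for _ in range(p):  # the program counter spans P SlowCLK pulses
        if swallowed < s:
            # Modulus control not yet asserted: prescaler divides by N + 1.
            total += n + 1
            swallowed += 1
        else:
            # Swallow count reached: prescaler divides by N until the reset.
            total += n
    return total


if __name__ == "__main__":
    n, p, s = 32, 75, 10  # hypothetical example values
    cycles = pulse_swallow_input_cycles(n, p, s)
    print(cycles, cycles == n * p + s)
```

The total is (N + 1)S + N(P - S), which simplifies to NP + S.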

2.4.5 4/5 PRESCALER

A previously reported 4/5 prescaler consumes large short-circuit power, although it has a higher frequency of operation than other 4/5 prescalers. The wideband single-phase clock 4/5 prescaler used in this design consists of three D flip-flops with two embedded NAND gates. The multimodulus 4/5 prescaler consists of four D flip-flops, two NAND gates, two OR gates, and one NOT gate, together with the main 4/5 prescaler circuit. This multimodulus prescaler operates as a 64/65/78/79 prescaler.

MULTIPRESCALER_4BY5:

2.4.6 PROPOSED MULTIBAND FLEXIBLE DIVIDER:

The proposed multiband flexible divider combines the 2/3 prescaler and the 4/5 prescaler within the multimodulus prescaler. A multiplexer selects either the 2/3 prescaler or the 4/5 prescaler, so the divider operates with division ratios of 32/33/47/48 or 64/65/78/79.

CHAPTER 3 HARDWARE REQUIREMENTS

3.1 GENERAL

Integrated circuit (IC) technology is the enabling technology for a whole host of innovative devices and systems that have changed the way we live. Jack Kilby received the 2000 Nobel Prize in Physics for his part in the invention of the integrated circuit; without the integrated circuit, neither transistors nor computers would be as important as they are today. VLSI systems are much smaller and consume less power than the discrete components used to build electronic systems before the 1960s. Integration allows us to build systems with many more transistors, allowing much more computing power to be applied to solving a problem. Integrated circuits are also much easier to design and manufacture and are more reliable than discrete systems; that makes it possible to develop special-purpose systems that are more efficient than general-purpose computers for the task at hand.

APPLICATIONS OF VLSI

Electronic systems now perform a wide variety of tasks in daily life. Electronic systems in some cases have replaced mechanisms that operated mechanically, hydraulically, or by other means; electronics are usually smaller, more flexible, and easier to service. In other cases electronic systems have created totally new applications. Electronic systems perform a variety of tasks, some of them visible, some more hidden:

Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated algorithms with remarkably little energy.

Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.

Digital electronics compress and decompress video, even at high definition data rates, on-the-fly in consumer electronics.

Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated function.

Personal computers and workstations provide word-processing, financial analysis, and games. Computers include both central processing units (CPUs) and special-purpose hardware for disk access, faster screen display, etc.

Medical electronic systems measure bodily functions and perform complex processing algorithms to warn about unusual conditions.

The availability of these complex systems, far from overwhelming consumers, only creates demand for even more complex systems. The growing sophistication of applications continually pushes the design and manufacturing of integrated circuits and electronic systems to new levels of complexity. And perhaps the most amazing characteristic of this collection of systems is its variety: as systems become more complex, we build not a few general-purpose computers but an ever wider range of special-purpose systems. Our ability to do so is a testament to our growing mastery of both integrated circuit manufacturing and design, but the increasing demands of customers continue to test the limits of design and manufacturing.

ADVANTAGES OF VLSI

While we will concentrate on integrated circuits in this book, the properties of integrated circuits (what we can and cannot efficiently put in an integrated circuit) largely determine the architecture of the entire system. Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages over digital circuits built from discrete components:

Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes, compared to the millimeter or centimeter scales of discrete components. Small size leads to advantages in speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and inductances.

Speed. Signals can be switched between logic 0 and logic 1 much quicker within a chip than they can between chips. Communication within a chip can occur hundreds of times faster than communication between chips on a printed circuit board. The high speed of circuits on-chip is due to their small size: smaller components and wires have smaller parasitic capacitances to slow down the signal.

Power consumption. Logic operations within a chip also take much less power. Once again, lower power consumption is largely due to the small size of circuits on the chip: smaller parasitic capacitances and resistances require less power to drive them.

VLSI AND SYSTEMS

These advantages of integrated circuits translate into advantages at the system level:

Smaller physical size. Smallness is often an advantage in itself: consider portable televisions or handheld cellular telephones.

Lower power consumption. Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used; since less power consumption means less heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic shielding may be feasible, too.

Reduced cost. Reducing the number of components, the power supply requirements, cabinet costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that the cost of a system built from custom ICs can be less, even though the individual ICs cost more than the standard parts they replace.

Understanding why integrated circuit technology has such profound influence on the design of digital systems requires understanding both the technology of IC manufacturing and the economics of ICs and digital systems.

INTEGRATED CIRCUIT MANUFACTURING

Integrated circuit technology is based on our ability to manufacture huge numbers of very small devices: today, more transistors are manufactured in California each year than raindrops fall on the state. In this section, we briefly survey VLSI manufacturing.

TECHNOLOGY

Most manufacturing processes are fairly tightly coupled to the item they are manufacturing. An assembly line built to produce Buicks, for example, would have to undergo moderate reorganization to build Chevys: tools like sheet metal molds would have to be replaced, and even some machines would have to be modified. And either assembly line would be far removed from what is required to produce electric drills.

MASK-DRIVEN MANUFACTURING

Integrated circuit manufacturing technology, on the other hand, is remarkably versatile. While there are several manufacturing processes for different circuit types (CMOS, bipolar, etc.), a manufacturing line can make any circuit of that type simply by changing a few basic tools called masks. For example, a single CMOS manufacturing plant can make both microprocessors and microwave oven controllers by changing the masks that form the patterns of wires and transistors on the chips. Silicon wafers are the raw material of IC manufacturing. The fabrication process forms patterns on the wafer that create wires and transistors. A series of identical chips are patterned onto the wafer (with some space reserved for test circuit structures which allow manufacturing to measure the results of the manufacturing process). The IC manufacturing process is efficient because we can produce many identical chips by processing a single wafer. By changing the masks that determine what patterns are laid down on the chip, we determine the digital circuit that will be created. The IC fabrication line is a generic manufacturing line: we can quickly retool the line to make large quantities of a new kind of chip, using the same processing steps used for the line's previous product.

CIRCUITS AND LAYOUTS

We could build a breadboard circuit out of standard parts. To build it on an IC fabrication line, we must go one step further and design the layout, or patterns on the masks. The rectangular shapes in the layout (shown here as a sketch called a stick diagram) form transistors and wires which conform to the circuit in the schematic. Creating layouts is very time-consuming and very important: the size of the layout determines the cost to manufacture the circuit, and the shapes of elements in the layout determine the speed of the circuit as well. During manufacturing, a photolithographic (photographic printing) process is used to transfer the layout patterns from the masks to the wafer. The patterns left by the mask are used to selectively change the wafer: impurities are added at selected locations in the wafer; insulating and conducting materials are added on top of the wafer as well. These fabrication steps require high temperatures, small amounts of highly toxic chemicals, and extremely clean environments. At the end of processing, the wafer is divided into a number of chips.

MANUFACTURING DEFECTS

Because no manufacturing process is perfect, some of the chips on the wafer may not work. Since at least one defect is almost sure to occur on each wafer, wafers are cut into smaller, working chips; the largest chip that can be reasonably manufactured today is 1.5 to 2 cm on a side, while wafers are moving from 30 to 45 cm in diameter. Each chip is individually tested; the ones that pass the test are saved after the wafer is diced into chips. The working chips are placed in the packages familiar to digital designers. In some packages, tiny wires connect the chip to the package's pins while the package body protects the chip from handling and the elements; in others, solder bumps directly connect the chip to the package. Integrated circuit manufacturing is a powerful technology for two reasons: all circuits can be made out of a few types of transistors and wires; and any combination of wires and transistors can be built on a single fabrication line just by changing the masks that determine the pattern of components on the chip. Integrated circuits run very fast because the circuits are very small. Just as important, we are not stuck building a few standard chip types: we can build any function we want. The flexibility given by IC manufacturing lets us build faster, more complex digital systems in ever greater variety.

ECONOMICS

Because integrated circuit manufacturing has so much leverage (a great number of parts can be built with a few standard manufacturing procedures), a great deal of effort has gone into improving IC manufacturing. However, as chips become more complex, the cost of designing a chip goes up and becomes a major part of the overall cost of the chip.

Moore's Law

In the 1960s Gordon Moore predicted that the number of transistors that could be manufactured on a chip would grow exponentially. His prediction, now known as Moore's Law, was remarkably prescient. Moore's ultimate prediction was that transistor count would double every two years, an estimate that has held up remarkably well. Today, an industry group maintains the International Technology Roadmap for Semiconductors (ITRS), which maps out strategies to maintain the pace of Moore's Law. (The ITRS roadmap can be found at http://www.itrs.net.)

Terminology

The most basic parameter associated with a manufacturing process is the minimum channel length of a transistor. (In this book, for example, we will use as an example a technology that can manufacture 180 nm transistors.) A manufacturing technology at a particular channel length is called a technology node. We often refer to a family of technologies at similar feature sizes: micron, submicron, deep submicron, and now nanometer technologies. The term nanometer technology is generally used for technologies below 100 nm.

COST OF MANUFACTURING

IC manufacturing plants are extremely expensive. A single plant costs as much as $4 billion. Given that a new, state-of-the-art manufacturing process is developed every three years, that is a sizeable investment. The investment makes sense because a single plant can manufacture so many chips and can easily be switched to manufacture different types of chips. In the early years of the integrated circuits business, companies focused on building large quantities of a few standard parts. These parts are commodities: one 80 ns, 256 Mb dynamic RAM is more or less the same as any other, regardless of the manufacturer. Companies concentrated on commodity parts in part because manufacturing processes were less well understood and manufacturing variations are easier to keep track of when the same part is being fabricated day after day. Standard parts also made sense because designing integrated circuits was hard: not only the circuit, but the layout had to be designed, and there were few computer programs to help automate the design process.

COST OF DESIGN

One of the less fortunate consequences of Moore's Law is that the time and money required to design a chip goes up steadily. The cost of designing a chip comes from several factors:

Skilled designers are required to specify, architect, and implement the chip. A design team may range from a half-dozen people for a very small chip to 500 people for a large, high-performance microprocessor.

These designers cannot work without access to a wide range of computer-aided design (CAD) tools. These tools synthesize logic, create layouts, simulate, and verify designs. CAD tools are generally licensed and you must pay a yearly fee to maintain the license. A license for a single copy of one tool, such as logic synthesis, may cost as much as $50,000 US.

The CAD tools require a large compute farm on which to run. During the most intensive part of the design process, the design team will keep dozens of computers running continuously for weeks or months.

A large ASIC, which contains millions of transistors but is not fabricated on the state-of-the-art process, can easily cost $20 million US and as much as $100 million. Designing a large microprocessor costs hundreds of millions of dollars.

DESIGN COSTS AND IP

We can spread these design costs over more chips if we can reuse all or part of the design in other chips. The high cost of design is the primary motivation for the rise of IP-based design, which creates modules that can be reused in many different designs.

TYPES OF CHIPS

The preponderance of standard parts pushed the problems of building customized systems back to the board-level designers who used the standard parts. Since a function built from standard parts usually requires more components than if the function were built with custom designed ICs, designers tended to build smaller, simpler systems. The industrial trend, however, is to make available a wider variety of integrated circuits. The greater diversity of chips includes:

More specialized standard parts. In the 1960s, standard parts were logic gates; in the 1970s they were LSI components. Today, standard parts include fairly specialized components: communication network interfaces, graphics accelerators, floating point processors. All these parts are more specialized than microprocessors but are used in enough volume that designing special-purpose chips is worth the effort. In fact, putting a complex, high-performance function on a single chip often makes other applications possible: for example, single-chip floating point processors make high-speed numeric computation available on even inexpensive personal computers.

Application-specific integrated circuits (ASICs). Rather than build a system out of standard parts, designers can now create a single chip for their particular application. Because the chip is specialized, the functions of several standard parts can often be squeezed into a single chip, reducing system size, power, heat, and cost. Application-specific ICs are possible because of computer tools that help humans design chips much more quickly.

Systems-on-chips (SoCs). Fabrication technology has advanced to the point that we can put a complete system on a single chip. For example, a single-chip computer can include a CPU, bus, I/O devices, and memory. SoCs allow systems to be made at much lower cost than the equivalent board-level system. SoCs can also be higher performance and lower power than board-level equivalents because on-chip connections are more efficient than chip-to-chip connections.

A wider variety of chips is now available in part because fabrication methods are better understood and more reliable. More importantly, as the number of transistors per chip grows, it becomes easier and cheaper to design special-purpose ICs. When only a few transistors could be put on a chip, careful design was required to ensure that even modest functions could be put on a single chip. Today's VLSI manufacturing processes, which can put millions of carefully-designed transistors on a chip, can also be used to put tens of thousands of less-carefully designed transistors on a chip. Even though the chip could be made smaller or faster with more design effort, the advantages of having a single-chip implementation of a function that can be quickly designed often outweigh the lost potential performance. The problem and the challenge of the ability to manufacture such large chips is design: the ability to make effective use of the millions of transistors on a chip to perform a useful function.

CMOS TECHNOLOGY

CMOS is the dominant integrated circuit technology. In this section we will introduce some basic concepts of CMOS to understand why it is so widespread and some of the challenges introduced by the inherent characteristics of CMOS.

POWER CONSUMPTION

POWER CONSUMPTION CONSTRAINTS

The huge chips that can be fabricated today are possible only because of the relatively tiny power consumption of CMOS circuits. Power consumption is critical at the chip level because much of the power is dissipated as heat, and chips have limited heat dissipation capacity. Even if the system in which a chip is placed can supply large amounts of power, most chips are packaged to dissipate fewer than 10 to 15 Watts of power before they suffer permanent damage (though some chips dissipate well over 50 Watts thanks to special packaging). The power consumption of a logic circuit can, in the worst case, limit the number of transistors we can effectively put on a single chip. Limiting the number of transistors per chip changes system design in several ways. Most obviously, it increases the physical size of a system. Using high-powered circuits also increases power supply and cooling requirements. A more subtle effect is caused by the fact that the time required to transmit a signal between chips is much larger than the time required to send the same signal between two transistors on the same chip; as a result, some of the advantage of using a higher-speed circuit family is lost. Another subtle effect of decreasing the level of integration is that the electrical design of multi-chip systems is more complex: microscopic wires on-chip exhibit parasitic resistance and capacitance, while macroscopic wires between chips have capacitance and inductance, which can cause a number of ringing effects that are much harder to analyze. The close relationship between power consumption and heat makes low-power design techniques important knowledge for every CMOS designer. Of course, low-energy design is especially important in battery-operated systems like cellular telephones. Energy, in contrast, must be saved by avoiding unnecessary work.
We will see throughout the rest of this book that minimizing power and energy consumption requires careful attention to detail at every level of abstraction, from system architecture down to layout. As CMOS features become smaller, additional power consumption mechanisms come into play. Traditional CMOS consumes power when signals change but consumes only negligible power when idle. In modern CMOS, leakage mechanisms start to drain current even when signals are idle. In the smallest geometry processes, leakage power consumption can be larger than dynamic power consumption. We must introduce new design techniques to combat leakage power.
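The difference between switching power and leakage power can be made concrete with the usual first-order expressions, P_dyn = alpha * C * Vdd^2 * f and P_leak = I_leak * Vdd. The sketch below is illustrative only: these formulas are the standard textbook approximations rather than something derived in this section, and every numeric value is an assumed placeholder, not a measurement of any circuit discussed here.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order switching power: alpha * C * Vdd^2 * f (watts)."""
    return alpha * c_load * vdd ** 2 * freq


def leakage_power(i_leak, vdd):
    """Static power drawn by leakage current while signals are idle (watts)."""
    return i_leak * vdd


# Assumed, illustrative values (not taken from the text):
ALPHA = 0.1      # fraction of the capacitance switched per cycle
C_LOAD = 1e-9    # total switched capacitance, farads
VDD = 1.0        # supply voltage, volts
FREQ = 1e9       # clock frequency, hertz
I_LEAK = 50e-3   # total leakage current, amperes

p_dyn = dynamic_power(ALPHA, C_LOAD, VDD, FREQ)
p_leak = leakage_power(I_LEAK, VDD)
print(f"dynamic: {p_dyn:.3f} W, leakage: {p_leak:.3f} W")

# Halving the supply voltage cuts the switching term by a factor of four,
# which is why supply scaling is such an effective low-power technique.
p_dyn_half = dynamic_power(ALPHA, C_LOAD, VDD / 2, FREQ)
print(f"dynamic at Vdd/2: {p_dyn_half:.3f} W")
```

Note that the switching term vanishes when the clock stops (f = 0), while the leakage term does not; this is the behaviour the paragraph above describes for small-geometry processes.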

DESIGN AND TESTABILITY

DESIGN VERIFICATION

Our ability to build large chips of unlimited variety introduces the problem of checking whether those chips have been manufactured correctly. Designers accept the need to verify or validate their designs to make sure that the circuits perform the specified function. (Some people use the terms verification and validation interchangeably; a finer distinction reserves verification for formal proofs of correctness, leaving validation to mean any technique which increases confidence in correctness, such as simulation.) Chip designs are simulated to ensure that the chip's circuits compute the proper functions in response to a sequence of inputs chosen to exercise the chip.

MANUFACTURING TEST

But each chip that comes off the manufacturing line must also undergo manufacturing test: the chip must be exercised to demonstrate that no manufacturing defects rendered the chip useless. Because IC manufacturing tends to introduce certain types of defects and because we want to minimize the time required to test each chip, we can't just use the input sequences created for design verification to perform manufacturing test. Each chip must be designed to be fully and easily testable. Finding out that a chip is bad only after you have plugged it into a system is annoying at best and dangerous at worst. Customers are unlikely to keep using manufacturers who regularly supply bad chips. Defects introduced during manufacturing range from the catastrophic (contamination that destroys every transistor on the wafer) to the subtle (a single broken wire or a crystalline defect that kills only one transistor). While some bad chips can be found very easily, each chip must be thoroughly tested to find even subtle flaws that produce erroneous results only occasionally. Tests designed to exercise functionality and expose design bugs don't always uncover manufacturing defects.
We use fault models to identify potential manufacturing problems and determine how they affect the chip's operation. The most common fault model is stuck-at-0/1: the defect causes a logic gate's output to be always 0 (or 1), independent of the gate's input values. We can often determine whether a logic gate's output is stuck even if we can't directly observe its outputs or control its inputs. We can generate a good set of manufacturing tests for the chip by assuming each logic gate's output is stuck at 0 (then 1) and finding an input to the chip which causes different outputs when the fault is present or absent.
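The test-generation idea described above can be sketched for a very small circuit. The example below is a minimal illustration, not a production test generator: the circuit y = (a AND b) OR c, the node names "n1" and "y", and the exhaustive search are all assumptions chosen to keep the sketch short.

```python
from itertools import product


def circuit(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c, optionally forcing one gate output.

    fault is None or a (node_name, stuck_value) pair, where node_name is
    "n1" (the AND gate output) or "y" (the OR gate output).
    """
    def apply(node, value):
        # A stuck-at fault overrides whatever the gate would have computed.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    n1 = apply("n1", a & b)
    y = apply("y", n1 | c)
    return y


def find_tests():
    """Map each stuck-at fault to the first input vector that exposes it."""
    faults = [(node, v) for node in ("n1", "y") for v in (0, 1)]
    tests = {}
    for fault in faults:
        for vec in product((0, 1), repeat=3):
            # A vector is a test if good and faulty circuits disagree on it.
            if circuit(*vec) != circuit(*vec, fault=fault):
                tests[fault] = vec
                break
    return tests


tests = find_tests()
for fault, vec in tests.items():
    print(f"{fault[0]} stuck-at-{fault[1]}: detected by (a, b, c) = {vec}")
```

Notice that the internal node n1 is never observed directly: its stuck-at-0 fault is detected only by setting c = 0 so that the fault propagates to the chip output. Exhaustive search is practical only for tiny circuits; it is used here purely to make the definition of a test vector explicit.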

TESTABILITY AS A DESIGN PROCESS

Unfortunately, not all chip designs are equally testable. Some faults may require long input sequences to expose; other faults may not be testable at all, even though they cause chip malfunctions that aren't covered by the fault model. Traditionally, chip designers have ignored testability problems, leaving them to a separate test engineer who must find a set of inputs to adequately test the chip. If the test engineer can't change the chip design to fix testability problems, his or her job becomes both difficult and unpleasant. The result is often poorly tested chips whose manufacturing problems are found only after the customer has plugged them into a system. Companies now recognize that the only way to deliver high-quality chips to customers is to make the chip designer responsible for testing, just as the designer is responsible for making the chip run at the required speed. Testability problems can often be fixed easily early in the design process at relatively little cost in area and performance. But modern designers must understand testability requirements, analysis techniques which identify hard-to-test sections of the design, and design techniques which improve testability.

RELIABILITY

RELIABILITY IS A LIFETIME PROBLEM

Earlier generations of VLSI technology were robust enough that testing chips at manufacturing time was sufficient to identify working parts: a chip either worked or it didn't. In today's nanometer-scale technologies, the problem of determining whether a chip works is more complex. A number of mechanisms can cause transient failures that cause occasional problems but are not repeatable. Some other failure mechanisms, like overheating, cause permanent failures but only after the chip has operated for some time. And more complex manufacturing problems cause failures that are harder to diagnose and may affect performance rather than functionality.

DESIGN-FOR-MANUFACTURABILITY

A number of techniques, referred to as design-for-manufacturability or design-for-yield, are in use today to improve the reliability of chips that come off the manufacturing line. We can make chips more reliable by designing circuits and architectures that reduce design stresses and check for problems. For example, heat is one major cause of chip failure. Proper power management circuitry can reduce the chip's heat dissipation and reduce the damage caused by overheating. We also need to change the way we design chips. Some of the convenient levels of abstraction that served us well in earlier technologies are no longer entirely appropriate in nanometer technologies. We need to check more thoroughly and be willing to solve reliability problems by modifying design decisions made earlier.

INTEGRATED CIRCUIT DESIGN TECHNIQUES

To make use of the flood of transistors given to us by Moore's Law, we must design large, complex chips quickly. The obstacle to making large chips work correctly is complexity: many interesting ideas for chips have died in the swamp of details that must be made correct before the chip actually works. Integrated circuit design is hard because designers must juggle several different problems:

Multiple levels of abstraction. IC design requires refining an idea through many levels of detail. Starting from a specification of what the chip must do, the designer must create an architecture which performs the required function, expand the architecture into a logic design, and further expand the logic design into a layout like the one in Figure 1-2. As you will learn by the end of this book, the specification-to-layout design process is a lot of work.

Multiple and conflicting costs. In addition to drawing a design through many levels of detail, the designer must also take into account costs: not dollar costs, but criteria by which the quality of the design is judged. One critical cost is the speed at which the chip runs. Two architectures that execute the same function (multiplication, for example) may run at very different speeds. We will see that chip area is another critical design cost: the cost of manufacturing a chip is exponentially related to its area, and chips much larger than 1 cm² cannot be manufactured at all. Furthermore, if multiple cost criteria, such as area and speed requirements, must be satisfied, many design decisions will improve one cost metric at the expense of the other. Design is dominated by the process of balancing conflicting constraints.

Short design time. In an ideal world, a designer would have time to contemplate the effect of a design decision. We do not, however, live in an ideal world. Chips which appear too late may make little or no money because competitors have snatched market share. Therefore, designers are under pressure to design chips as quickly as possible. Design time is especially tight in application-specific IC design, where only a few weeks may be available to turn a concept into a working ASIC.

FIELD-PROGRAMMABLE GATE ARRAYS (FPGA)

A field-programmable gate array (FPGA) is a block of programmable logic that can implement multi-level logic functions. FPGAs are most commonly used as separate commodity chips that can be programmed to implement large functions. However, small blocks of FPGA logic can be useful components on-chip to allow the user of the chip to customize part of the chip's logical function. An FPGA block must implement both combinational logic functions and interconnect to be able to construct multi-level logic functions. There are several different technologies for programming FPGAs, but most logic processes are unlikely to implement anti-fuses or similar hard programming technologies, so we will concentrate on SRAM-programmed FPGAs.

LOOKUP TABLES

The basic method used to build a combinational logic block (CLB), also called a logic element, in an SRAM-based FPGA is the lookup table (LUT). As shown in the figure, the lookup table is an SRAM that is used to implement a truth table. Each address in the SRAM represents a combination of inputs to the logic element. The value stored at that address represents the value of the function for that input combination. An n-input function requires an SRAM with 2^n locations.

Fig-5 Lookup Tables

Because a basic SRAM is not clocked, the lookup table logic element operates much like any other logic gate: as its inputs change, its output changes after some delay.
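The lookup-table idea can be sketched in a few lines. The class below is a behavioural model only, under stated assumptions: the name LookupTable, the choice of the first input as the most significant address bit, and the two example functions are illustrative and do not describe any particular FPGA family. Delay is not modeled.

```python
class LookupTable:
    """n-input LUT modeled as a 2**n-entry bit memory."""

    def __init__(self, n_inputs, bits):
        if len(bits) != 2 ** n_inputs:
            raise ValueError("an n-input LUT needs exactly 2**n stored bits")
        self.n_inputs = n_inputs
        self.bits = list(bits)

    def evaluate(self, *inputs):
        """Use the inputs as an address (first input is the MSB) and read the bit."""
        if len(inputs) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.bits[address]


# A 4-input XOR: the stored bit is the parity of the address.
xor4 = LookupTable(4, [bin(addr).count("1") % 2 for addr in range(16)])
# A 4-input NAND: every entry is 1 except the all-ones address.
nand4 = LookupTable(4, [0 if addr == 15 else 1 for addr in range(16)])

print(xor4.evaluate(1, 0, 1, 1))
print(nand4.evaluate(1, 1, 1, 1))
```

Both functions are evaluated by exactly the same memory read; only the stored bits differ.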

PROGRAMMING A LOOKUP TABLE

Unlike a typical logic gate, the function represented by the logic element can be changed by changing the values of the bits stored in the SRAM. As a result, the n-input logic element can represent 2^(2^n) functions (though some of these functions are permutations of each other).
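The count of representable functions follows directly from the size of the SRAM: each of the 2^n locations can independently hold a 0 or a 1. The short sketch below checks this by enumeration for small n; the helper names and the address convention (address = 2*a + b for two inputs) are assumptions made for the illustration.

```python
from itertools import product


def count_functions(n_inputs):
    """Number of distinct truth tables an n-input LUT can hold: 2**(2**n)."""
    return 2 ** (2 ** n_inputs)


def all_truth_tables(n_inputs):
    """Iterate over every possible SRAM content for an n-input LUT."""
    return product((0, 1), repeat=2 ** n_inputs)


for n in range(1, 5):
    print(f"{n}-input LUT: {count_functions(n)} functions")

# For n = 2 the programmings can be listed exhaustively.
tables = list(all_truth_tables(2))
and2 = (0, 0, 0, 1)   # entry at address 2*a + b
or2 = (0, 1, 1, 1)
print(len(tables), and2 in tables, or2 in tables)
```

A typical four-input logic element therefore has 65536 possible programmings, many of which differ only by a reordering of the inputs.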

Fig-6 Programming A Lookup Table

A typical logic element has four inputs. The delay through the lookup table is independent of the bits stored in the SRAM, so the delay through the logic element is the same for all functions. This means that, for example, a lookup table-based logic element will exhibit the same delay for a 4-input XOR and a 4-input NAND. In contrast, a 4-input XOR built with static CMOS logic is considerably slower than a 4-input NAND. Of course, the static logic gate is generally faster than the logic element. Logic elements generally contain registers (flip-flops and latches) as well as combinational logic. A flip-flop or latch is small compared to the combinational logic element (in sharp contrast to the situation in custom VLSI), so it makes sense to add it to the combinational logic element. Using a separate cell for the memory element would simply take up routing resources. The memory element is connected to the output; whether it stores a given value is controlled by its clock and enable inputs.

COMPLEX LOGIC ELEMENT

Many FPGAs also incorporate specialized adder logic in the logic element. The critical component of an adder is the carry chain, which can be implemented much more efficiently in specialized logic than it can using standard lookup table techniques. The wiring channels that connect to the logic element's inputs and outputs also need to be programmable. A wiring channel has a number of programmable connections such that each input or output generally can be connected to any one of several different wires in the channel.
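To see why the carry chain is the critical component of an adder, it helps to write the addition out one bit at a time. The sketch below is a plain ripple-carry model chosen for illustration; it is an assumption about the simplest possible adder structure, not a description of the dedicated carry logic of any specific FPGA.

```python
def full_adder(a, b, carry_in):
    """One bit position: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x, y, width):
    """Add two unsigned integers bit by bit, threading the carry through."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        # The carry produced by stage i is consumed by stage i + 1,
        # so the stages form a chain that limits the adder's speed.
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry


total, carry_out = ripple_add(0b0111, 0b0001, width=4)
print(bin(total), carry_out)
```

In the example the carry generated at bit 0 has to travel through every higher stage before the final sum bit settles, which is the path that specialized carry logic shortens.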

PROGRAMMABLE INTERCONNECTION POINTS

The figure shows a simple version of an interconnection point, often known as a connection box.

Fig-6 Programming A Lookup Table

A programmable connection between two wires is made by a CMOS transistor (a pass transistor). The pass transistor's gate is controlled by a static memory program bit (shown here as a D register). When the pass transistor's gate is high, the transistor conducts and connects the two wires; when the gate is low, the transistor is off and the two wires are not connected.
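The behaviour of such a programmable connection can be captured by a small model in which each pass transistor is represented only by its program bit. The class name ConnectionBox and its methods are hypothetical, introduced for this sketch; electrical effects such as the resistance of the pass transistor are deliberately ignored.

```python
class ConnectionBox:
    """Wires joined by pass transistors whose gates are driven by program bits."""

    def __init__(self):
        self.program_bits = {}   # (wire_a, wire_b) -> 0 or 1

    def program(self, wire_a, wire_b, bit):
        """Store the program bit controlling the transistor between two wires."""
        key = tuple(sorted((wire_a, wire_b)))
        self.program_bits[key] = bit

    def connected(self, wire_a, wire_b):
        """True when the gate is high, so the transistor conducts (direct links only)."""
        key = tuple(sorted((wire_a, wire_b)))
        return self.program_bits.get(key, 0) == 1


box = ConnectionBox()
box.program(0, 2, 1)   # gate high: wires 0 and 2 are joined
box.program(1, 3, 0)   # gate low: wires 1 and 3 stay isolated
print(box.connected(0, 2), box.connected(1, 3))
```

Changing a stored bit is all that is needed to reroute a signal, which is what makes the interconnect programmable in the field.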

CHAPTER 4 SOFTWARE REQUIREMENTS

Verification Tool: ModelSim 6.4c
Synthesis Tool: Xilinx ISE 9.1

INTRODUCTION TO MODELSIM

ModelSim is a useful tool that allows you to stimulate the inputs of your modules and view both outputs and internal signals. It supports both behavioural and timing simulation; however, this document focuses on behavioural simulation. Keep in mind that these simulations are based on models, and thus the results are only as accurate as the constituent models.

Standards Supported

ModelSim VHDL supports both the IEEE 1076-1987 and 1076-1993 VHDL, the 1164-1993 Standard Multivalue Logic System for VHDL Interoperability, and the 1076.2-1996 Standard VHDL Mathematical Packages standards. Any design developed with ModelSim will be compatible with any other VHDL system that is compliant with either IEEE Standard 1076-1987 or 1076-1993. ModelSim Verilog is based on IEEE Std 1364-1995 and a partial implementation of 1364-2001, Standard Hardware Description Language Based on the Verilog Hardware Description Language. The Open Verilog International Verilog LRM version 2.0 is also applicable to a large extent. Both PLI (Programming Language Interface) and VCD (Value Change Dump) are supported for ModelSim PE and SE users.

MODELSIM

Basic Steps For Simulation

This section provides further detail related to each step in the process of simulating your design using ModelSim.

Step 1 - Collecting Files and Mapping Libraries

Files needed to run ModelSim on your design:
- design files (VHDL, Verilog, and/or SystemC), including stimulus for the design
- libraries, both working and resource
- modelsim.ini (automatically created by the library mapping command)

Providing stimulus to the design

You can provide stimulus to your design in several ways:
- Language based testbench
- Tcl-based ModelSim interactive command, force
- VCD files / commands (see "Using extended VCD as stimulus" (UM-458))
- 3rd party test bench generation tools

What is a library in ModelSim?

A library is a location where data to be used for simulation is stored. Libraries are ModelSim's way of managing the creation of data before it is needed for use in simulation. They also serve as a way to streamline simulation invocation. Instead of compiling all design data each and every time you simulate, ModelSim uses binary pre-compiled data from these libraries. So, if you make a change to a single Verilog module, only that module is recompiled, rather than all modules in the design.
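The benefit of keeping pre-compiled data can be illustrated with a toy model of a library that remembers what it has already compiled. This is only a conceptual sketch: the DesignLibrary class, the use of a content hash to detect changes, and the module sources are all assumptions made for the example and say nothing about how ModelSim actually stores or tracks compiled units.

```python
import hashlib


class DesignLibrary:
    """Toy stand-in for a compiled-design library (not ModelSim's real format)."""

    def __init__(self):
        self.compiled = {}   # module name -> digest of the source last compiled

    def compile(self, sources):
        """Recompile only modules whose source text changed; return their names."""
        recompiled = []
        for name, text in sources.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.compiled.get(name) != digest:
                self.compiled[name] = digest
                recompiled.append(name)
        return recompiled


work = DesignLibrary()
sources = {
    "adder": "module adder; endmodule",
    "counter": "module counter; endmodule",
}
first = work.compile(sources)     # everything is new, so both are compiled
sources["counter"] = "module counter; /* edited */ endmodule"
second = work.compile(sources)    # only the edited module is compiled again
print(first, second)
```

The second call touches a single module, mirroring the statement above that a change to one Verilog module does not force recompilation of the whole design.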

Working and resource libraries

Design libraries can be used in two ways: 1) as a local working library that contains the compiled version of your design; 2) as a resource library. The contents of your working library will change as you update your design and recompile. A resource library is typically unchanging, and serves as a parts source for your design. Examples of resource libraries might be: shared information within your group, vendor libraries, packages, or previously compiled elements of your own working design. You can create your own resource libraries, or they may be supplied by another design team or a third party (e.g., a silicon vendor). For more information on resource libraries and working libraries, see "Working library versus resource libraries", "Managing library contents", "Working with design libraries", and "Specifying the resource libraries".

Creating The Logical Library - vlib

Before you can compile your source files, you must create a library in which to store the compilation results. You can create the logical library using the GUI, using File > New > Library (see "Creating a library"), or you can use the vlib command. For example, the command:

    vlib work

creates a library named work. By default, compilation results are stored in the work library.

Mapping The Logical Work To The Physical Work Directory - vmap

VHDL uses logical library names that can be mapped to ModelSim library directories. If libraries are not mapped properly and you invoke your simulation, necessary components will not be loaded and simulation will fail. Similarly, compilation can also depend on proper library mapping. By default, ModelSim can find libraries in your current directory (assuming they have the right name), but for it to find libraries located elsewhere, you need to map a logical library name to the pathname of the library. You can use the GUI ("Library mappings with the GUI"), a command ("Library mappings with the GUI"), or a project ("Getting started with projects") to assign a logical name to a design library.

The format for command line entry is:

    vmap <logical_name> <directory_pathname>

This command sets the mapping between a logical library name and a directory.
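The lookup rule described above (an explicit mapping if one exists, otherwise a same-named library in the current directory) can be written down as a small function. This is a mental model only: resolve_library, its arguments, and the example path are hypothetical and are not part of ModelSim, whose actual search also involves the modelsim.ini file.

```python
def resolve_library(logical_name, mappings, cwd_entries):
    """Return the directory for a logical library name, or None if unresolved.

    mappings    -- dict of logical name -> directory (what a mapping records)
    cwd_entries -- names of library directories present in the current directory
    """
    if logical_name in mappings:
        return mappings[logical_name]
    if logical_name in cwd_entries:
        # Default behaviour: a library with the right name in the
        # current directory is found without any explicit mapping.
        return "./" + logical_name
    return None


mappings = {"vendor_lib": "/tools/vendor/compiled"}   # hypothetical path
cwd_entries = {"work"}

for name in ("work", "vendor_lib", "missing_lib"):
    print(name, "->", resolve_library(name, mappings, cwd_entries))
```

The unresolved case corresponds to the failure mode mentioned above, where components in an unmapped library cannot be loaded.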

Step 2 - Compiling the design with vlog/vcom/sccom

Designs are compiled with one of the three language compilers.

Compiling Verilog - vlog

ModelSim's compiler for the Verilog modules in your design is vlog. Verilog files may be compiled in any order, as they are not order dependent. See "Compiling Verilog files" for details. Verilog portions of the design can be optimized for better simulation performance; see "Optimizing Verilog designs".

Compiling VHDL - vcom

ModelSim's compiler for VHDL design units is vcom. VHDL files must be compiled in the order required by the design. Projects may assist you in determining the compile order: for more information, see "Auto-generating compile order" (UM-46). See "Compiling VHDL files" (UM-73) for details on VHDL compilation.

Compiling SystemC - sccom

ModelSim's compiler for SystemC design units is sccom, and it is used only if you have SystemC components in your design. See "Compiling SystemC files" for details.

Step 3 - Loading the design for simulation with vsim

Your design is ready for simulation after it has been compiled and (optionally) optimized with vopt. For more information on optimization, see "Optimizing Verilog designs". You may then invoke vsim with the names of the top-level modules (many designs contain only one top-level module) or the name you assigned to the optimized version of the design. For example, if your top-level modules are "testbench" and "globals", then invoke the simulator as follows:

    vsim testbench globals

After the simulator loads the top-level modules, it iteratively loads the instantiated modules and UDPs in the design hierarchy, linking the design together by connecting the ports and resolving hierarchical references.

Using SDF

You can incorporate actual delay values into the simulation by applying SDF back-annotation files to the design. For more information on how SDF is used in the design, see "Specifying SDF files for simulation".

Step 4 - Simulating the design

Once the design has been successfully loaded, the simulation time is set to zero, and you must enter a run command to begin simulation. For more information, see "Verilog simulation", "VHDL simulation", and "SystemC simulation".

The basic simulator commands are:
- add wave
- force
- bp
- run
- step
- next
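Steps 1 to 4 can be summarized as an ordered list of commands. The helper below only assembles the command lines as text; it does not run ModelSim. The function name, the source file names, and the run time are assumptions made for the illustration, and all optional arguments of vlib, vmap, vlog, vcom, and vsim are omitted.

```python
def simulation_commands(verilog_files, vhdl_files, top_levels, run_time="100 ns"):
    """Return the ModelSim command lines for Steps 1 to 4, in order."""
    commands = [
        "vlib work",        # Step 1: create the working library
        "vmap work work",   # Step 1: map logical 'work' to the work directory
    ]
    commands += [f"vlog {path}" for path in verilog_files]   # Step 2: Verilog
    commands += [f"vcom {path}" for path in vhdl_files]      # Step 2: VHDL
    commands.append("vsim " + " ".join(top_levels))          # Step 3: load design
    commands.append("add wave *")                            # Step 4: view signals
    commands.append(f"run {run_time}")                       # Step 4: advance time
    return commands


# Hypothetical file names; the top-level names follow the example in the text.
script = simulation_commands(
    verilog_files=["testbench.v", "globals.v"],
    vhdl_files=[],
    top_levels=["testbench", "globals"],
)
print("\n".join(script))
```

The printed lines could be typed at the ModelSim prompt one at a time. Note that for VHDL sources the order of the vcom lines matters, whereas the vlog lines may appear in any order.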

Step 5 - Debugging The Design

Numerous tools and windows useful in debugging your design are available from the ModelSim GUI. For more information, see "Waveform analysis" (UM-237), "PSL Assertions", and "Tracing signals with the Dataflow window". In addition, several basic simulation commands are available from the command line to assist you in debugging your design:
- describe
- drivers
- examine
- force
- log
- checkpoint
- restore
- show

MODELSIM BASICS

On the left side of the interface, under the project tab, is the frame listing the files that pertain to the opened project. The Library frame lists the entities of the project (that have been compiled).

To the right is the ModelSim shell frame. It is an extension of MS-DOS, so both ModelSim and MS-DOS commands can be executed.

A. Creating a New Project

Once ModelSim has been started, create a new project: File > New > Project. Name the project FA, set the project location to F:/VHDL, and click OK. A new window should appear to add new files to the project. Choose Create New File.

Enter F:\VHDL\FA.vhd as the file name and click OK. Then close the add new files window. Additional files can be added later by choosing from the menu: Project > Add File to Project

B. Editing Source Files

Double click the FA.vhd source found under the Workspace window's project tab. This will open up an empty text editor configured to highlig