
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1

Eleven Ways to Boost Your Synchronizer

Salomon Beer, Member, IEEE, and Ran Ginosar, Senior Member, IEEE

Abstract— Synchronizers play an essential role in multiple clock domain systems-on-chip. The most common synchronizer consists of a series of pipelined flip-flops. Several factors influence the performance of synchronizers: circuit design, process technology, and operating conditions. Global factors apply to the entire integrated circuit, while others can be adjusted for each individual synchronizer in the design. The following guidelines are provided to improve synchronizers: avoiding scan and reset, selecting minimum size flip-flop cells, minimizing routing, reducing jitter in coherent clock domain crossings, opting for high-performance process flavor and minimum VTH, over-provisioning to account for variations, maximizing supply voltage, and manipulating clock duty cycle.

Index Terms— Metastability, MTBF, multistage synchronizers, synchronization, synchronizer, tau effective.

I. INTRODUCTION

SYNCHRONIZERS play a key role in modern multiple clock system-on-chip (SoC) designs [1]. Such designs present thousands of clock domain crossings (CDC), where the system is prone to metastability errors. To mitigate those failures and ensure reliable signal transition across CDCs, synchronizers are used to convert domain timings. The type of synchronizer to be used for each CDC is determined by the specific properties of the two clock domains involved. Different classifications of CDC have been studied. In [1]–[4], the classification of CDC is based on their frequency and phase relations, such as mesochronous, plesiochronous, and heterochronous CDC. The latter group may be further subdivided into ratiochronous and nonratiochronous [5], [6] CDCs. When there is no frequency and phase relationship, the clock domains are assumed to be mutually asynchronous. A different classification is based on clock sources [7]. Clocks are classified as noncoherent when they are sourced from different references and coherent when they share a common reference clock. The latter is the case when several phase locked loops are sourced from the same oscillator. For each category, specialized synchronizers have been developed to exploit the CDC relationship and ensure correct operation, improving performance and reliability. In [8]–[12], synchronizers for mesochronous, plesiochronous, and ratiochronous CDCs are proposed. The N-flip-flop synchronizer is usually employed in the asynchronous case [13]. The N-flip-flop synchronizer comprises a concatenated series of flip-flops, as shown in Fig. 1. This concatenated flip-flop structure not only can be used as a stand-alone solution but is also a central part of many other synchronizers, such as first-in-first-out (FIFO) synchronizers [14], and represents a critical part that has been studied intensively.

Manuscript received November 20, 2013; revised March 27, 2014; accepted June 9, 2014. The work of S. Beer was supported by the HPI Institute.

The authors are with the Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVLSI.2014.2331331

The VLSI designer who is to use concatenated flip-flops in her circuit is usually faced with questions about how many stages to use in the N-flip-flop synchronizer. The designer who wishes to use flip-flops from a standard cell library would like to know which parameters influence the mean time between failures (MTBF) of the system before signing off the design. This knowledge is increasingly valuable in nanoscale SoC designs because several factors have emerged that challenge the reliability of synchronizers. In particular, the required number of synchronizers in a design is growing rapidly, and the variability of semiconductor parameters as well as the sensitivity to operational conditions has increased with scaling. Prediction of MTBF in CDC depends on a variety of parameters, categorized as circuit parameters, process technology parameters, and operating conditions parameters, as shown in Fig. 2. The circuit considerations include questions such as how many stages to include in the synchronizer and which flip-flops to use in each stage.

Placement and routing of the flip-flops in the pipeline is also classified as a circuit consideration. Process relates to the choice of technology node as well as the process family and variability of each node. The selection of the threshold voltage of the transistors can be considered a process property, but since modern technologies allow mixing different threshold levels in the same design, we consider it a circuit/process property and present it in the intersection of both areas. Operating conditions are the frequencies of the CDCs, supply voltages, duty cycle, and temperature. Jitter is placed between process and operating conditions because it is affected by both. This classification can be subdivided into factors affecting metastability in a global or local way. Global factors affect all transistors in the design in the same way, while the effects of local factors may vary for different CDCs within the same IC. Local parameters are bolded in Fig. 2, while global ones are italicized. For global parameters, we provide analytical insight into how they affect metastability. For the local parameters, we provide guidelines on how to choose the flip-flops forming the synchronizer and techniques that either improve reliability or prevent errors.

All the parameters described above are essential for determining the settling time constants (τ) and the aperture width TW of the flip-flops within the synchronizer.




Fig. 1. Typical N-flip-flop synchronizer.

Fig. 2. Classification of factors affecting metastability. Items marked with ∗ are discussed in Section II, numbered ones are detailed in Section III, italicized items are global factors, and bolded ones are design guidelines.

Previously, [15] presented guidelines on how not to build a synchronizer. A decade later and after many publications on the topic, we present ground rules on how to improve the performance of N-flip-flop synchronizers. This paper summarizes publications and accumulated industry expertise, which we believe is useful for designers to optimize the benefit from their synchronizers. Sections III-A1–III-A3 and Sections III-B1–III-B3 contain new results. Sections III-A4 and III-A5, and Sections III-C1–III-C3 are based on previous publications and are analyzed here from a design perspective.

This paper is organized as follows. In Section II, we present a framework of the synchronization problem and introduce a baseline circuit. In Section III, we provide 11 rules to improve the performance of synchronizers, and in Section IV, we conclude the work.

II. SYNCHRONIZATION FRAMEWORK

In this section, we describe the synchronization framework of the N-flip-flop synchronizer. The equations and derivations of this section form a common ground for subsequent sections.

As stated above, most synchronizers include an N-flip-flop synchronizer comprising a pipeline of flip-flops. These concatenated flip-flops are designed to reduce the probability of synchronization failure.

Fig. 3. Generalized master–slave circuit.

Generally, to reduce the probability of failures, the number of flip-flops in the pipe is increased. Increasing the number of stages increases the resolution time, which decreases the chance of a metastable state in the subsequent logic. When the number of stages increases, latency through the pipeline increases, thereby reducing performance. Thus, latency is traded off for failure probability. Usually, the probability of failure of the N-flip-flop synchronizer is measured by the MTBF

$$\mathrm{MTBF} = \frac{e^{S/\tau}}{T_W \cdot F_C \cdot F_D} \tag{1}$$

where FC and FD are the clock and data transition frequencies, S is a predetermined time allowed for metastability resolution, τ is the resolution time constant, and TW is a parameter describing a vulnerable time window, which is determined experimentally. TW and τ are intrinsic circuit parameters that depend on the flip-flops used in the synchronizer and on the technology. The resolution time (S) is determined by the number of flip-flops in the synchronizer. The larger the number of flip-flops, the larger the resolution time allowed. Ignoring propagation and setup times, the resolution time is given by [14]

$$S = (N - 1)\,T_C \tag{2}$$

where N is the number of flip-flop stages and TC is the clock period of the receiving clock domain.
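The latency versus reliability trade-off expressed by (1) and (2) can be sketched numerically. The following is a minimal illustration, not a sign-off tool; the function names and the parameter values in the demo are assumptions chosen for illustration and are not taken from the tables in this paper.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def resolution_time(n_stages: int, t_clk: float) -> float:
    """Eq. (2): S = (N - 1) * T_C, ignoring propagation and setup times."""
    if n_stages < 1:
        raise ValueError("need at least one flip-flop stage")
    return (n_stages - 1) * t_clk


def mtbf_seconds(n_stages: int, f_clk: float, f_data: float,
                 tau: float, t_w: float) -> float:
    """Eq. (1): MTBF = exp(S / tau) / (T_W * F_C * F_D), in seconds."""
    s = resolution_time(n_stages, 1.0 / f_clk)
    return math.exp(s / tau) / (t_w * f_clk * f_data)


if __name__ == "__main__":
    # Hypothetical operating point (assumed values).
    f_clk, f_data = 1e9, 100e6   # Hz
    tau, t_w = 50e-12, 30e-12    # s
    for n in (2, 3, 4):
        years = mtbf_seconds(n, f_clk, f_data, tau, t_w) / SECONDS_PER_YEAR
        print(f"N = {n}: MTBF ~ {years:.3g} years")
```

Each added stage multiplies the MTBF by exp(TC/τ) at the price of one more clock cycle of latency, which is why the choice of N is so sensitive to τ.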

For each stage in the N-flip-flop synchronizer, we consider a generalized flip-flop circuit similar to the one shown in Fig. 3. The circuit comprises a master and a slave latch. Each one of these latches is characterized by a resolution time constant τi (i ∈ {M, S}). The scheme in Fig. 3 is an abstract scheme that serves as a framework, and most flip-flop circuits are derivations of a similar form. We consider some of those derivations in the following sections.

Based on the resolution time constant for each latch in a flip-flop, the overall effective resolution time constant for the flip-flop is given by [41]

$$\tau_{\mathrm{eff}} = \left( \frac{\alpha}{\tau_M} + \frac{1 - \alpha}{\tau_S} \right)^{-1} \tag{3}$$

where α represents the duty cycle of the clock. Using this formula, a model for the resolution time constant of each latch can be obtained and then combined in (3). From small signal analysis, τi (i ∈ {M, S}) can be approximated by [14]

$$\tau_i \propto \frac{C_Q}{g_m}, \quad i \in \{M, S\} \tag{4}$$

where CQ includes the gate and diffusion capacitances of the metastable synchronizer nodes (Qi, Q̄i, i ∈ {M, S}) and the coupling capacitance between the gate and the source and drain of the transistors connected to the metastable nodes, and gm is the transconductance of the transistors in the latch.
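Equation (3) is a duty-cycle-weighted harmonic combination of the two latch time constants, which a few lines of code make concrete. This is a minimal sketch; the function name and the normalized values in the demo are hypothetical.

```python
def tau_eff(tau_master: float, tau_slave: float, duty: float) -> float:
    """Eq. (3): tau_eff = (duty / tau_M + (1 - duty) / tau_S) ** -1."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty cycle must lie in [0, 1]")
    if tau_master <= 0 or tau_slave <= 0:
        raise ValueError("time constants must be positive")
    return 1.0 / (duty / tau_master + (1.0 - duty) / tau_slave)


if __name__ == "__main__":
    # Hypothetical normalized values: slave latch 30% slower than master.
    tau_m, tau_s = 1.0, 1.3
    for duty in (0.3, 0.5, 0.7):
        print(f"duty = {duty:.1f}: tau_eff = {tau_eff(tau_m, tau_s, duty):.3f}")
```

The result always lies between τM and τS, and shifting the duty cycle moves the weight toward one latch or the other.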

Near metastability, the transistors operate in the saturation region, and hence the transconductance can be approximated by

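$$g_m = g_{mn} + g_{mp} = \left( \frac{\mu_n C_{ox} W_n}{L}\,\frac{1}{1 + \sqrt{a}} + \frac{\mu_p C_{ox} W_p}{L}\,\frac{\sqrt{a}}{1 + \sqrt{a}} \right) \times \left( V_{DD} - |V_{ThP}| - V_{ThN} \right)^{\alpha} \tag{5}$$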

where a = μnWn/μpWp, VThN and VThP are the transistor threshold voltages for the N and P transistors, respectively, μn and μp are the electron and hole mobilities, and α in (5) is the velocity saturation index [16], not the duty cycle of (3).

III. BOOSTING SYNCHRONIZERS

This section describes 11 methods to improve the performance of synchronizers. The methods are divided into three categories (circuit, process, and operating conditions), with each subsection containing the boosting methods for one category, as described in Fig. 2. Minimum threshold voltage (#7) lies between the circuit and process categories and is described in the process subsection. Jitter (#9) lies between the process and operating conditions categories and is described in the operating conditions subsection. Temperature, included in operating conditions, cannot be directly manipulated by the designer, and hence its impact is included in the process variation subsection (#8). The influence of the factors marked by ∗ in Fig. 2, such as the number of stages, process node, and frequencies, is addressed in Section II above. Since those parameters are included in (1), their influence on MTBF is straightforward.

A. Boosting the Synchronizer Circuit

1) No Scan/BIST in Synchronizer Flip-Flops: Scan is used in design-for-test circuits. The objective is to make testing easier by providing a simple way to set and observe every flip-flop in the integrated circuit. In general, a scan-enable pin is added to each flip-flop. When that signal is asserted, all flip-flops in the design are connected into a long shift register. For this purpose, additional transistors are added to the standard flip-flop circuit. One example of such a circuit is shown in Fig. 4 [17].

The scan path element receives its input either from the D input or from the previous scan element via SDI. It is controlled by the scan-enable signal MODE, and the NOR gate is transparent when scan is disabled (MODE = 0). The scan flip-flop presented is only one example of many derivatives and topologies existing in industry applications and academic publications. Other variants of scan flip-flops are presented in [18] and [19].

Fig. 4. Scan D-flip-flop with NOR gate.

TABLE I

NORMALIZED τ IN SCAN FLIP-FLOPS (SIMULATIONS, 65-nm CMOS)

Most of these configurations have a negative effect on metastability resolution and increase τ. In the circuit of Fig. 4, the capacitance of the metastable node of the slave latch (QS) is increased by the diffusion capacitances of the transmission gate (TG1), generating a higher τ, according to (4). To demonstrate the effect quantitatively, we simulated the circuits of Figs. 3 and 4 and of [18]. The transistors were sized to enable comparison of the different circuits, following sizing in commercial libraries. Table I shows the results of circuit simulations confirming the increase in τ for the scan flip-flops examined. The table includes simulations for τ of the master and slave latches and a calculation of the effective τ of the flip-flop based on a 50% duty cycle, following [41]. All τ values are normalized to τM of the circuit in Fig. 3, and are simulated in functional rather than scan mode. The different flip-flop circuits do not affect the regeneration nature of the master latch, and hence τM is almost the same in all flip-flops. However, the slave latch is affected by the scan TG1 and SDO inverter, significantly increasing both τS and the resulting effective τeff of the flip-flop. The increase in τS for the circuit of Fig. 4 with respect to Fig. 3 is about 24%, which induces a significant decrease in MTBF. For instance, for a CDC with FC = 500 MHz, FD = 100 MHz, using a two flip-flop synchronizer with τeff = 55 ps and TW = 30 ps, MTBF reduces from almost 130 years (for Fig. 3) to one month (for Fig. 4), a reduction of three orders of magnitude.
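The baseline figure in this example can be checked directly from (1) and (2). The sketch below reproduces the calculation for τeff = 55 ps and then inverts (1) to show how large τ would have to be for a one-month MTBF. The inversion is a back-calculation added here for illustration; it is not a value read from Table I, and "one month" is assumed to mean 1/12 of a year.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mtbf_seconds(s, tau, t_w, f_clk, f_data):
    """Eq. (1): MTBF = exp(S / tau) / (T_W * F_C * F_D)."""
    return math.exp(s / tau) / (t_w * f_clk * f_data)


def tau_for_mtbf(s, mtbf, t_w, f_clk, f_data):
    """Eq. (1) solved for tau: tau = S / ln(MTBF * T_W * F_C * F_D)."""
    return s / math.log(mtbf * t_w * f_clk * f_data)


# Values quoted in the text for the two flip-flop synchronizer.
F_CLK, F_DATA = 500e6, 100e6   # Hz
T_W = 30e-12                   # s
S = (2 - 1) / F_CLK            # Eq. (2) with N = 2, so S = T_C = 2 ns

baseline_years = mtbf_seconds(S, 55e-12, T_W, F_CLK, F_DATA) / SECONDS_PER_YEAR

# Assumption: one month = 1/12 year.
tau_one_month = tau_for_mtbf(S, SECONDS_PER_YEAR / 12, T_W, F_CLK, F_DATA)

if __name__ == "__main__":
    print(f"MTBF at tau = 55 ps: {baseline_years:.0f} years")
    print(f"tau for a one-month MTBF: {tau_one_month * 1e12:.1f} ps "
          f"({tau_one_month / 55e-12:.2f}x the baseline)")
```

Because τ sits in the exponent of (1), an increase in τ of roughly a quarter is enough to cost about three orders of magnitude in MTBF at this operating point.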

In summary, the use of flip-flops with scan capabilities has many benefits for IC test and quality control. However, those benefits usually increase the effective capacitance in metastable prone latches, increasing τ and reducing synchronizer MTBF. Thus, we recommend avoiding the use of flip-flops with scan capabilities in synchronizers when possible.



Fig. 5. Flip-flop with asynchronous reset.

TABLE II

NORMALIZED τ IN ASYNCHRONOUS RESET FLIP-FLOP (SIMULATIONS, 65-nm CMOS)

2) No Reset in Synchronizer Flip-Flops: Another feature frequently available in flip-flops is asynchronous reset. The main advantage of this technique is the ability to force the circuit into a known state to initialize hardware. Many different circuits implement reset. One implementation is presented in Fig. 5. When the flip-flop is reset (RST = 0), the N-type transistor discharges node QM, setting QM = 0 and Q̄M = 1, and the node Q̄S is charged through the P-type transistor, setting QS = 0 and Q = 0. The two reset transistors add parasitic capacitances to the master and slave latch nodes and, following (4), increase τ for both the master and the slave latches of the flip-flop. Table II provides simulation results comparing τ for the circuits of Figs. 3 and 5. The results confirm the prediction that the reset transistors induce an increase in both τM and τS. The overall increase is about 6% in the effective τ and should be accounted for by the designer. In other circuit implementations of asynchronous or synchronous reset, this increment may be more pronounced.

In summary, the use of asynchronous reset needs to be considered by the designer. The impact on synchronizer flip-flops needs to be evaluated. To achieve a minimum τ, asynchronous reset should be avoided.

3) Minimum Flip-Flop Cell Size: One of the challenges faced by the design engineer is to determine which flip-flop cell size from the available library should be used in the synchronizer.

According to (4), τ is affected by both the capacitance of the latch and its transconductance. Increasing gate size increases both its capacitance and its gm. To a first-order approximation, the two changes cancel out, and the value of τ remains unchanged. Second-order effects, especially the external load connected to the latch, should be considered to determine appropriate sizing. If the VLSI designer can determine the size of each transistor inside the library flip-flop, then the loads on each latch should be chosen small to decrease τ. For the circuit of Fig. 3, that would imply reducing the size of the transistors of TG1 and inverter INVD for the master latch, and of TG1 and inverter INVQ for the slave latch. However, in general, the designer cannot directly affect the sizing of the internal inverters in the flip-flop, but has to choose from a predefined list of sizes that represent a general measure for the cell size. In most digital libraries, the differently sized flip-flop cells are optimized so that they handle different fan-out loads without drastically increasing the delay of the flip-flop. This is generally achieved by increasing the size of the output stage inverter (INVQ) for the different cell sizes. The internal portions of the flip-flops, however, are typically unchanged among these differently sized cells. Thus, the increased INVQ size dramatically loads the slave latch, increasing its τ, when the flip-flop cell size is increased.

Fig. 6. Normalized τS and τeff versus cell sizes in library.

Fig. 7. Placing and routing constraints for multistage synchronizer.

Fig. 6 shows τ for different flip-flop sizes available in the simulated library. The library provides five sizing options for flip-flop cells: ×3, ×5, ×10, ×15, and ×20, the only difference between the cells being the size of the INVQ transistors. τS increases almost linearly with the increase in cell size. τM is not affected by INVQ, so the resulting τeff, computed from (3), increases with τS.

In summary, the use of the smallest available flip-flops in the library is encouraged to obtain minimum τ and maximum MTBF. This sometimes counter-intuitive guideline should be used for all library flip-flop cells in the synchronizer.

As a final remark, we note that careful attention should be paid when applying this guideline: even though this trend is valid in all reviewed libraries, there may exist other libraries where flip-flop cell sizing behaves differently. Some due diligence is encouraged.

Fig. 8. Clock-data phase histogram for fd = 125 MHz and fC = 150 MHz.

4) Minimum Routing Between Flip-Flops: To achieve desired MTBF values, high speed synchronizers are usually built as pipelines of N flip-flops (Fig. 7). Refining (2) to include propagation and setup delays, the resolution time (S) in (1) is given by [14]

$$S = (N - 1) \cdot (T_C - t_{CQ} - t_{pd} - t_{su}) \tag{6}$$

where tCQ is the clock-to-Q propagation delay of each flip-flop in the synchronizer, tpd is the routing delay to the next flip-flop in the pipeline, tsu is the setup time, and TC is the clock period (TC = 1/fc). When TC is long compared to tCQ, tpd, and tsu, these delays can be neglected. However, when the receiving domain frequency is high, they should be considered. To increase S to the maximum possible value, the IC designer needs to reduce the routing delay tpd to a minimum. This can be achieved by imposing stringent constraints on these delays (Fig. 7) and by placing the flip-flops close together.
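The cost of routing delay in (6) can be illustrated with a short calculation. This is a sketch under assumed numbers: the function name and every delay value, clock frequency, and τ in the demo are hypothetical and chosen only to show the trend.

```python
import math


def resolution_time(n_stages: int, f_clk: float,
                    t_cq: float = 0.0, t_pd: float = 0.0,
                    t_su: float = 0.0) -> float:
    """Eq. (6): S = (N - 1) * (T_C - t_CQ - t_pd - t_su)."""
    per_stage = 1.0 / f_clk - t_cq - t_pd - t_su
    if per_stage <= 0:
        raise ValueError("delays consume the whole clock period")
    return (n_stages - 1) * per_stage


if __name__ == "__main__":
    # Hypothetical 2 GHz receiving domain, three stages (assumed values).
    f_clk, n = 2e9, 3
    t_cq, t_su, tau = 60e-12, 40e-12, 20e-12
    s_tight = resolution_time(n, f_clk, t_cq, 10e-12, t_su)   # short wires
    s_loose = resolution_time(n, f_clk, t_cq, 100e-12, t_su)  # long wires
    # From (1), the MTBF ratio depends only on the difference in S.
    gain = math.exp((s_tight - s_loose) / tau)
    print(f"S: {s_tight * 1e12:.0f} ps vs {s_loose * 1e12:.0f} ps, "
          f"MTBF ratio ~ {gain:.3g}")
```

Since S enters (1) through an exponent, every picosecond of routing delay removed from each stage is multiplied by (N − 1) and then divided by τ before exponentiation.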

5) CDC Coherency: When the two clocks are related in frequency or phase, the standard N-flip-flop pipeline synchronizer may result in much lower MTBF than predicted by (1).

A metastability event may occur when data and clock signals at the input of a flip-flop or latch toggle within a certain time window (W). If for every cycle the toggle occurs within the W window, the probability of failure increases drastically. In [42] and [43], it is shown that the time differences between clock and data in coherent clock domains (where the two clocks are generated from a common source) can take only discrete values. For example, when the frequencies of the two clock domains are fd = 125 MHz and fC = 150 MHz, the clock-data phase can take only five possible values, as shown in Fig. 8. If the metastability window (blue) happens to fall in between phases, the probability of metastability is very low and negligible. On the other hand, if the metastability window happens to overlap one of the phases (red), the probability of failure may be higher, and the MTBF may be lower, than predicted by (1). This is because (1) assumes a uniform distribution of the clock-data phase differences, while in coherent CDC, the phase distribution may be nonuniform, as shown in Fig. 8. The exact form of the phase probability distribution in the case of coherent clock domains is further discussed in [43], along with mitigation techniques to improve MTBF in such cases.

In coherent clock domains, the position of the phase distribution relative to the metastability window defines the MTBF. Let Q = fd/gcd(fd, fc) and let σ be the standard deviation of the clock jitter. When TC > 2Qσ, the phase follows a nonuniform distribution, having maxima and minima. This happens because the distance between the ideal phase positions (TC/Q) is larger than twice the standard deviation (σ) of the jitter, and the maxima are well separated. Only when TC < 2Qσ can the phase be approximated by a continuous uniform distribution, so that (1) holds (Fig. 9).

Fig. 9. Clock-data phase probability density function diagram. (a) TC > 2Qσ. (b) TC < 2Qσ.

In summary, formula (1) may not always apply and may not provide a lower bound on MTBF. In coherent CDC cases, special caution must be exercised when assessing MTBF. Specific measures for addressing this issue are described in [43].
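The discrete phases of a coherent CDC and the TC versus 2Qσ criterion can be computed with exact rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, frequencies are assumed to be integers in hertz, and data edges are assumed to fall at multiples of the data period with zero initial offset (a real offset shifts all phases by the same amount).

```python
from fractions import Fraction
from math import gcd


def phase_count(f_data_hz: int, f_clk_hz: int) -> int:
    """Q = f_d / gcd(f_d, f_c), the number of distinct ideal phases."""
    return f_data_hz // gcd(f_data_hz, f_clk_hz)


def ideal_phases(f_data_hz: int, f_clk_hz: int) -> list:
    """Jitter-free data-edge positions, as fractions of T_C."""
    q = phase_count(f_data_hz, f_clk_hz)
    ratio = Fraction(f_clk_hz, f_data_hz)  # T_d / T_C
    return sorted({(k * ratio) % 1 for k in range(q)})


def phase_is_nonuniform(f_data_hz: int, f_clk_hz: int, sigma_s: float) -> bool:
    """Criterion from the text: nonuniform when T_C > 2 * Q * sigma."""
    t_c = 1.0 / f_clk_hz
    return t_c > 2 * phase_count(f_data_hz, f_clk_hz) * sigma_s


if __name__ == "__main__":
    fd, fc = 125_000_000, 150_000_000
    print("Q =", phase_count(fd, fc))
    print("phases / T_C =", [str(p) for p in ideal_phases(fd, fc)])
    for sigma in (50e-12, 1e-9):  # hypothetical jitter values
        print(f"sigma = {sigma * 1e12:.0f} ps ->",
              "nonuniform" if phase_is_nonuniform(fd, fc, sigma) else "about uniform")
```

For the 125 MHz and 150 MHz example, the ideal phases are spaced TC/Q apart, matching the five values mentioned for Fig. 8.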

B. Boosting the Process Technology

1) Process Flavor: The selection of the process technology to fabricate an IC has diverse criteria, power and performance being the most critical ones. Foundries provide a variety of process families tailored to different needs, often called process flavors. The aim of this subsection is to analyze the different flavors with respect to metastability performance.

Process flavors differ in terminology and type depending on the vendor. A popular classification divides the technology node into low-power (LP) and high-performance (HP) flavors. However, in modern technologies, more detailed classifications are available. In [21], two classifications are added: the low-power-high-k metal gates and the high-performance-for-mobile (HPM) flavors. In [22], three flavors are available: LP, high-performance-low-power, and HPM; while in [23], the denominations are super-low-power, low-power-high-performance, and high-performance-plus.

Physical factors affecting the different flavors include nominal supply voltage, threshold voltage of the transistors, gate fabrication, and stress memorization techniques (SMT). The exact ingredients behind each process flavor depend on the foundry vendor and are a combination of the above mentioned factors. The exact proportion of each factor is usually a carefully guarded secret. An additional important factor that influences future designs is the use of nonplanar transistor architectures. We now analyze how each factor affects metastability parameters.

The effect of supply voltage and threshold voltage on metastability is evaluated in separate subsections below since they can vary within the chip due to multiple power domains on chip or multithreshold circuits within the same IC.

Gate fabrication refers to how the transistor gate and gate dielectric are generated. Oxynitride gate dielectrics [24] have been employed for many years, where the silicon oxide dielectric is infused with a small amount of nitrogen. The nitride content raises the dielectric constant and increases resistance against dopant diffusion through the gate dielectrics [25]. In recent years, high-k dielectrics were introduced in conjunction with metal gates [26]. The exact material employed for the dielectric varies between foundries and is a topic of constant research.

The gate oxide in a MOSFET can be modeled as a parallel plate capacitor [28]

$$C_{ox} = \frac{\kappa \varepsilon_0 A}{t} \tag{7}$$

where A is the capacitor area, t is the thickness of the capacitor oxide insulator, ε0 is the permittivity of free space, and κ is the relative dielectric constant of the utilized material. The value of κ ranges from 3.9 in silicon dioxide to almost 80 for high-k materials [20]. The high permittivity (κ) of the high-k dielectrics allows the device engineer to achieve higher capacitances and current while keeping the dielectric thicker, thereby significantly reducing gate leakage. From (4), (5), and (7), we conclude that using high-k dielectrics increases Cox, thereby increasing gm and reducing τ.

Enhancement of channel mobility in high-k/metal gate transistors is achieved by channel strain engineering. In general, the application of tensile strain in the nMOS channel and compressive strain in the pMOS channel enhances device performance [29]. These stress memorization techniques affect mobility, which, by (4) and (5), affects τ.

While high-k/metal gate technologies and strained silicon play a significant role in today's fabrication process, evolution to nonplanar transistor architectures is expected as scaling advances. 3-D transistors, such as tri-gate [26], [30] or FinFET [31], will become important in solving further short channel effects and in improving performance. The impact of 3-D transistors on metastability should be evaluated by means of the effective gm and CQ in (4).

Table III shows simulations for a commercial 65 nm process comparing LP and HP flavors. The comparison has been performed under the same supply voltage and standard threshold voltage conditions. The results are normalized to the τ value of the HP flavor. The LP flavor shows a τ almost 3.5 times higher than HP. This enormous difference cannot be neglected, especially when migrating circuits among different technology flavors. If the designer can choose the process flavor, HP is preferred from a maximum MTBF point of view.

TABLE III

τ VALUES FOR DIFFERENT PROCESS FLAVORS

TABLE IV

τ VALUES FOR DIFFERENT THRESHOLD VOLTAGES

In summary, the choice of a process flavor has a large influence on metastability, as τ and MTBF are directly influenced by gate fabrication techniques and SMT. When the process flavor changes, τ should be reevaluated, and the number of stages in all synchronizers needs to be reexamined to maintain the desired MTBF as predicted by (1).

2) Minimum Threshold Voltage (VTH): While traditionally a single level of transistor threshold voltage was available in the chip, as determined by the fabrication process, modern technologies offer a choice of a variety of threshold voltages for different transistors in the same IC. Multithreshold CMOS (MTCMOS) technology has emerged as an increasingly popular technique to reduce leakage power in HP ICs [35], [36]. The choice of which threshold voltage to use is a compromise between performance and power. Lowering the threshold voltage generates faster transistors, by (5), with higher leakage currents, while increasing the threshold reduces leakage but slows down the transistors. In modern technologies, the choice of VTH is usually made among three to five predetermined values, such as ultralow, low, standard, high, and ultrahigh VTH. As determined by (5), using low threshold transistors increases gm and reduces τ. Simulations of such examples can be seen in Table IV. Three different threshold levels were simulated, generating different τ values. The lowest value is achieved for the lowest VTH. In summary, for the flip-flops forming the synchronizer, transistors with minimum VTH are preferred to reduce τ.

3) Process Variations: An important challenge faced by the IC designer when considering synchronization is the number of flip-flop stages to be used to achieve a certain reliability (MTBF). The number of such stages (NS), following (1) and (6), is given by:

$$N_S = \left\lceil \frac{\tau \cdot \ln(\mathrm{MTBF} \cdot f_c \cdot f_d \cdot T_W)}{T_C - t_{CQ} - t_{pd} - t_{su}} \right\rceil + 1. \tag{8}$$
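Equation (8) translates directly into a small helper for choosing the number of stages. The following is a sketch, not a sign-off calculation: the function name and all numeric values in the demo are assumptions, and the clamp to a minimum of one stage for very small MTBF targets is an addition made here rather than part of (8).

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def stages_needed(target_mtbf_s: float, tau: float, t_w: float,
                  f_clk: float, f_data: float,
                  t_cq: float = 0.0, t_pd: float = 0.0,
                  t_su: float = 0.0) -> int:
    """Eq. (8): N_S = ceil(tau * ln(MTBF * f_c * f_d * T_W) / T_stage) + 1."""
    t_stage = 1.0 / f_clk - t_cq - t_pd - t_su
    if t_stage <= 0:
        raise ValueError("delays consume the whole clock period")
    required_s = tau * math.log(target_mtbf_s * f_clk * f_data * t_w)
    # Clamp (assumption added here): never report fewer than one stage.
    return max(math.ceil(required_s / t_stage), 0) + 1


if __name__ == "__main__":
    # Hypothetical target and circuit parameters (assumed values).
    target = 1000 * SECONDS_PER_YEAR
    print(stages_needed(target, 50e-12, 30e-12, 1e9, 100e6))
    print(stages_needed(target, 50e-12, 30e-12, 1e9, 100e6,
                        t_cq=80e-12, t_pd=20e-12, t_su=50e-12))
```

Because of the ceiling function, small changes in τ or in the per-stage delays can push the result across an integer boundary and cost a full extra stage.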

Fig. 10. Normalized τ simulations for different process corners.

However, the designer, who must also assure correct operation under process, supply voltage, and junction temperature (PVT) variations, needs to account for this variability when calculating the number of stages of the synchronizer. The need for over-provisioning in the number of stages comes at the cost of augmented latency and power. The challenge is to determine the minimum over-provisioning needed to provide acceptable MTBF with the minimum number of stages. Assume that under worst case (wc) PVT conditions, the value of τ becomes τ^wc and the value of TW becomes TW^wc. The number of stages needed for wc (NS^wc) then becomes

$$N_S^{wc} = \frac{\tau^{wc}}{\tau^{nom}} \left( N_S^{nom} - 1 \right) + 2. \tag{9}$$

The constant 2 in (9) is due to ⌈x⌉ < x + 1. The equation indicates that the number of stages needed in wc may be much higher than in nominal operation. In (9), we have neglected the influence of the change of TW in wc PVT since its effect is minor compared to the effect of τ.
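The step from (8) to (9) can be made explicit. The following is a sketch of one way to obtain it, assuming that the logarithmic term and the per-stage time in (8) are the same in nominal and worst-case conditions (that is, the change in TW is neglected, as stated above). The shorthand symbols Λ and T are introduced here for illustration only.

```latex
\begin{align*}
\Lambda &= \ln(\mathrm{MTBF} \cdot f_c \cdot f_d \cdot T_W),
  \qquad T = T_C - t_{CQ} - t_{pd} - t_{su} \\
N_S^{nom} - 1 &= \left\lceil \frac{\tau^{nom}\,\Lambda}{T} \right\rceil
  \ge \frac{\tau^{nom}\,\Lambda}{T}
  \;\Longrightarrow\;
  \frac{\Lambda}{T} \le \frac{N_S^{nom} - 1}{\tau^{nom}} \\
N_S^{wc} &= \left\lceil \frac{\tau^{wc}\,\Lambda}{T} \right\rceil + 1
  < \frac{\tau^{wc}\,\Lambda}{T} + 2
  \le \frac{\tau^{wc}}{\tau^{nom}} \left( N_S^{nom} - 1 \right) + 2
\end{align*}
```

Read this way, the right-hand side of (9) is a provisioning value that is never smaller than the exact worst-case stage count given by (8).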

In modern technologies, process variation can be high, resulting in large τ variation [37]–[39]. Fig. 10 shows τ simulations versus supply voltage for the fast-fast (FF), typical-typical (TT), and slow-slow (SS) process corners, which constitute ±3σ deviations. The τ variability can be several tens of percent. The ratio between τFF and τTT ranges from 0.4 to 0.7 for the supply voltages studied. At nominal VDD, τFF is near half of τTT. On the other hand, the ratio between τSS and τTT is in the range of 2.2–1.5, as shown in Fig. 11.

TW simulations versus supply voltage for different process corners are shown in Fig. 12. Note that simulation precision is limited. However, the figure shows that TW variations over process and supply voltage are bounded. Further, TW influences MTBF only linearly [while the influence of τ is exponential, see (1)]. Hence, computing TW at, e.g., nominal voltage and typical process corner, and assuming twice that value as an upper bound on TW in (1) is acceptable.

Fig. 11. τ/τTT versus VDD/VDDnom simulations for different process corners.

Fig. 12. Normalized TW simulations for different process corners.

Temperature influence on τ depends on the supply voltage and threshold of the transistors. A complete study was introduced in [33] and [40]. Both μ and VTH decrease with increasing temperature [44]; however, decreasing μ increases τ, while decreasing VTH decreases τ. When the impact of a change in μ on τ is larger than the impact of a change in VTH on τ, increasing temperature causes an increase in τ. Conversely, when the impact of VTH dominates over that of μ, increasing temperature causes a decrease in τ. The dominant factor is determined by the ratio of the supply voltage to the threshold voltage. In modern technologies, where multiple supply voltages can be selected and VDD approaches the value of 2VTH, τ decreases when the temperature is increased. When the supply voltage is high and the threshold voltage is low, the trend is reversed and τ increases with temperature. Simulations of τ versus temperature for different supply voltages are shown in Fig. 13, illustrating this effect.

The last PVT factor to consider, supply voltage, is examined in Section III-C2 below. We note that when calculating the number of stages using (8), the designer must bear in mind that usually when the supply voltage is decreased, clock frequencies should decrease to fulfill critical path timings. As a consequence, the value of TC in (8) increases, thereby reducing the number of stages in the worst case.

In summary, the number of stages in the synchronizer must account for worst case PVT variations using (9). A summary of τ relations for different corners is shown in Table V.
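As a rough illustration of over-provisioning, the Python sketch below rearranges the multistage form MTBF = exp(N·TC/τ)/(TW fD fC) for N, assuming identical stages and a TW that does not depend on N. This is an assumed simplification and not necessarily the exact form of (8) or (9); the function name and all numeric values are hypothetical.

```python
import math

def stages_needed(mtbf_target, tau, t_c, t_w, f_d, f_c):
    """Smallest N with exp(N * T_C / tau) / (T_W * f_D * f_C) >= mtbf_target.

    Assumes every stage has the same tau and resolves for a full clock period.
    """
    n = (tau / t_c) * math.log(mtbf_target * t_w * f_d * f_c)
    return max(1, math.ceil(n))

# Hypothetical design point: 1 GHz clock, 100 MHz data, T_W = 20 ps,
# target MTBF of 1e10 s.
common = dict(mtbf_target=1e10, t_c=1e-9, t_w=20e-12, f_d=100e6, f_c=1e9)

tau_typ = 50e-12              # assumed typical-corner tau
tau_worst = 2.2 * tau_typ     # largest corner-to-typical ratio quoted above

print(stages_needed(tau=tau_typ, **common))    # stages at the typical corner
print(stages_needed(tau=tau_worst, **common))  # stages at the worst corner
```

With these assumed numbers, two stages suffice at the typical corner but five are needed once the worst-case τ is used, which shows why sizing the synchronizer from typical-corner τ alone is unsafe.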



Fig. 13. Normalized τ simulations versus temperature for nominal and reduced supply voltage.

TABLE V. τ VALUES FOR DIFFERENT PROCESS CORNERS AT NOMINAL VDD (SIMULATIONS, 65-nm CMOS)

Fig. 14. Clock-data phase histogram for fd = 125 MHz and fC = 150 MHz, Q = 5 with Gaussian noise jitter.

C. Boosting the Synchronizer Operating Conditions

1) Reduce Jitter: Noise in ICs is manifested as jitter in signals. In coherent clock CDCs, when jitter is present and normally distributed, N(0, σ²), Fig. 8 turns into Fig. 14 [43]. As seen in Section III-A5, the overlap between the metastability window and the phase peaks determines the probability of failure. The number of phase peaks Q and the value of jitter determine the spread of each peak and the nature of the overall phase distribution. When TC < 2Qσ, the overall phase distribution can be approximated by a continuous uniform distribution and (1) holds. When TC > 2Qσ, the distribution is nonuniform (Fig. 14), presenting maxima and minima. In this case, the amount of jitter is critical. When the system is in a distribution valley (blue window), reducing the jitter can result in almost negligible metastability. This is exemplified in [43].

Fig. 15. Simulations of normalized τ versus normalized supply voltage.

In summary, in coherent clock CDC, jitter should be minimized.
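The regime test stated above (TC compared with 2Qσ) is simple to apply. The following Python sketch classifies a coherent crossing using the clock frequency and Q from Fig. 14; the function name and the two jitter values are hypothetical.

```python
def phase_distribution_regime(t_c, q, sigma):
    """Classify the clock-data phase distribution of a coherent CDC.

    Returns "uniform" when T_C < 2*Q*sigma (the peaks merge and the
    uniform-phase assumption holds) and "nonuniform" otherwise.
    """
    return "uniform" if t_c < 2 * q * sigma else "nonuniform"

t_c = 1 / 150e6   # clock period for f_C = 150 MHz, about 6.67 ns
q = 5             # number of phase peaks, as in Fig. 14

# Jitter at which the two regimes meet: T_C / (2Q), about 0.67 ns.
sigma_boundary = t_c / (2 * q)

print(phase_distribution_regime(t_c, q, sigma=1e-9))     # large assumed jitter
print(phase_distribution_regime(t_c, q, sigma=100e-12))  # small assumed jitter
```

For this crossing the boundary lies near 0.67 ns of RMS jitter. Below it the peaks are distinct, so whether the metastability window falls on a peak or in a valley determines the failure rate.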

2) Supply Voltage: The selection of supply voltage is a significant system decision, which directly determines performance and power of a circuit. As shown in [32] and [33], supply voltage has a substantial effect on τ. Supply voltage may vary within an IC due to different power domains on chip, dynamically by means of techniques such as dynamic voltage scaling, and due to IR drops. Fig. 15 presents simulations of normalized τ versus normalized supply voltage. When voltage is reduced by 15% from nominal VDD, τ increases to more than 3.5 times its nominal value. This can be explained by (5): when VDD is decreased, gm decreases, which, according to (4), generates an increase in τ [34]. Hence, when the supply voltage is decreased for power considerations, the repercussion on MTBF should be considered. In summary, the use of higher supply voltages is recommended when synchronization issues are critical.
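To give a feel for the size of this effect, the Python sketch below computes the MTBF loss when τ is scaled by a constant factor, assuming the exponential form exp(T/τ) and assuming that the resolution time T and TW stay fixed. In practice the clock is usually slowed at reduced voltage, which partly offsets the loss. The function name and the nominal T/τ value of 50 are hypothetical.

```python
import math

def mtbf_degradation(t_over_tau_nominal, tau_scale):
    """Factor by which MTBF drops when tau is multiplied by tau_scale.

    Assumes MTBF is proportional to exp(T/tau) with T and T_W unchanged.
    """
    return math.exp(t_over_tau_nominal * (1.0 - 1.0 / tau_scale))

# 15% supply reduction: tau grows to about 3.5x its nominal value (Fig. 15).
loss = mtbf_degradation(t_over_tau_nominal=50.0, tau_scale=3.5)
print(f"MTBF drops by a factor of about {loss:.1e}")
```

With the assumed T/τ of 50, the exponent falls from 50 to roughly 14, so the MTBF drops by about 15 orders of magnitude.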

3) Duty Cycle: Flip-flops used in synchronizers are made of two concatenated latches, as shown in Fig. 3. Since (1) only accounts for one value of τ, the question is which value of τ should be used, the master’s or the slave’s. Recent advances in multistage metastability modeling refined (1) to a generalized form for multistage flip-flops [41], [45]

$$\mathrm{MTBF}(N) = \frac{1}{T_W(N)\, f_D\, f_C} \exp\left(\frac{N T}{\tau_N}\right) \tag{10}$$

where τN is defined by

$$\frac{1}{\tau_N} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\tau_{\mathrm{eff}}(i)} \tag{11}$$

and τeff(i) is the effective τ of flip-flop i, defined in (3). From (3), (10), and (11), it is clear that the resolution time of both the master and the slave should be considered. Flip-flop designers should not improve the design of the master to the detriment of that of the slave, or vice versa.
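Equations (10) and (11) translate directly into code. The Python sketch below is a minimal illustration; the function names and the numeric values are hypothetical, and TW(N) is passed in as a given number rather than modeled.

```python
import math

def tau_n(tau_eff):
    """Equation (11): 1/tau_N is the mean of 1/tau_eff(i) over the N flip-flops."""
    n = len(tau_eff)
    return n / sum(1.0 / t for t in tau_eff)

def mtbf_multistage(tau_eff, t, t_w_n, f_d, f_c):
    """Equation (10): MTBF(N) = exp(N*T/tau_N) / (T_W(N) * f_D * f_C)."""
    n = len(tau_eff)
    return math.exp(n * t / tau_n(tau_eff)) / (t_w_n * f_d * f_c)

# Two-stage synchronizer with mismatched (assumed) effective taus.
taus = [20e-12, 30e-12]
print(tau_n(taus))  # harmonic mean of the two values, 24 ps

print(mtbf_multistage(taus, t=1e-9, t_w_n=20e-12, f_d=100e6, f_c=1e9))
```

Because τN is a harmonic mean, the exponent N·T/τN equals T times the sum of 1/τeff(i), so each flip-flop contributes its own resolution gain and a single slow stage directly lowers the total.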



Fig. 16. τeff/τM versus duty cycle for different τS/τM ratios.

Fig. 17. MTBF versus duty cycle for different τS/τM ratios.

An interesting aspect of (3) is its dependence on duty cycle. Since duty cycle can be changed after fabrication, it can be used retroactively to minimize τeff, thus maximizing MTBF. Fig. 16 shows τeff as a function of duty cycle, and Fig. 17 shows MTBF as a function of duty cycle. The variations of MTBF due to duty cycle can span several orders of magnitude, motivating calibration of circuits after fabrication. In Fig. 16, τeff for different ratios of τM and τS is shown. All plots are normalized to τM, which remains unchanged in all cases. When τM = τS, the duty cycle does not influence the effective τ, and its value is constant for every α. When τS > τM, increasing duty cycle reduces τeff, while when τS < τM, increasing duty cycle increases τeff. A difference between τM and τS is possible even when master and slave are designed with the same τ, because a large mismatch can appear after fabrication due to process variations.

In summary, clock duty cycle may be adjusted to improve MTBF.
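The duty-cycle dependence can be explored with a small Python sketch. It assumes that τeff takes the time-weighted form 1/τeff = α/τM + (1 − α)/τS, where α is the fraction of the cycle in which the master resolves; this form is an assumption standing in for (3), and the function name and values are hypothetical.

```python
def tau_eff(alpha, tau_m, tau_s):
    """Effective tau of a master-slave flip-flop at duty cycle alpha.

    ASSUMED form: 1/tau_eff = alpha/tau_M + (1 - alpha)/tau_S.
    """
    return 1.0 / (alpha / tau_m + (1.0 - alpha) / tau_s)

tau_m = 1.0  # all results normalized to tau_M, as in Fig. 16
for ratio in (0.5, 1.0, 2.0):              # tau_S / tau_M
    row = [tau_eff(a, tau_m, ratio * tau_m) for a in (0.3, 0.5, 0.7)]
    print(f"tau_S/tau_M = {ratio}: " + ", ".join(f"{v:.3f}" for v in row))
```

Under this assumed form, τeff is flat when the two latches match and otherwise moves toward τM as α grows, so the duty cycle should be skewed to give more of the cycle to the faster latch.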

IV. CONCLUSION

In this paper, we presented guidelines and techniques to improve the performance of N-flip-flop synchronizers to achieve minimum τ and maximum MTBF. We described the tradeoff between capacitance and transconductance in the synchronizer nodes that is useful to evaluate other aspects affecting metastability. The factors affecting metastability were divided into circuit, process, and operating conditions, and boosting techniques were presented for each. We distinguish global factors that affect the complete IC from design guidelines that can be applied to individual synchronizers. From the global perspective, we advocate fabrication in high-performance flavor technologies with the minimum threshold voltage allowed by the process. Increasing the supply voltage improves metastability resolution. The use of scan and reset flip-flops should be avoided when possible to improve reliability. The minimum flip-flop cell size should be selected to improve τ. We have shown a formula to account for process variability and analyzed the minimal required amount of over-provisioning. Since the master and slave latches may have different τ, overall MTBF is affected by the clock duty cycle. Clean power supplies and low-jitter clock distribution networks provide better reliability in coherent clock domains, where several clocks are sourced from a common oscillator.

REFERENCES

[1] R. Ginosar, “Metastability and synchronizers: A tutorial,” IEEE Design Test Comput., vol. 28, no. 5, pp. 23–35, Sep./Oct. 2011.

[2] D. G. Messerschmitt, “Synchronization in digital system design,” IEEE J. Sel. Areas Commun., vol. 8, no. 8, pp. 1404–1419, Oct. 1990.

[3] P. Teehan, M. Greenstreet, and G. Lemieux, “A survey and taxonomy of GALS design styles,” IEEE Design Test Comput., vol. 24, no. 5, pp. 418–428, Sep./Oct. 2007.

[4] W. J. Dally and J. W. Poulton, Digital Systems Engineering. Cambridge, U.K.: Cambridge Univ. Press, 1998.

[5] M. W. Heath, W. P. Burleson, and I. G. Harris, “Synchro-tokens: A deterministic GALS methodology for chip-level debug and test,” IEEE Trans. Comput., vol. 54, no. 12, pp. 1532–1546, Dec. 2005.

[6] J. M. Chabloz and A. Hemani, “A flexible interface for rationally related frequencies,” in Proc. Int. Conf. Comput. Design (ICCD), 2009, pp. 109–116.

[7] S. Beer, R. Ginosar, R. Dobkin, and Y. Weizman, “MTBF estimation in coherent clock domains,” in Proc. IEEE 19th Int. Symp. Asynchron. Circuits Syst. (ASYNC), May 2013, pp. 166–173.

[8] U. Frank, T. Kapschitz, and R. Ginosar, “A predictive synchronizer for periodic clock domains,” Formal Methods Syst. Design, vol. 28, no. 2, pp. 171–186, 2004.

[9] R. Kol and R. Ginosar, “Adaptive synchronization,” in Proc. Int. Conf. Comput. Design (ICCD), 1998, pp. 188–189.

[10] W. K. Stewart and S. A. Ward, “A solution to a special case of the synchronization problem,” IEEE Trans. Comput., vol. 37, no. 1, pp. 123–125, Jan. 1988.

[11] L. F. G. Sarmenta, G. A. Pratt, and S. A. Ward, “Rational clocking [digital systems design],” in Proc. IEEE Int. Conf. Comput. Design, VLSI Comput. Process. (ICCD), Oct. 1995, pp. 271–278.

[12] W. J. Dally and S. G. Tell, “The even/odd synchronizer: A fast, all-digital, periodic synchronizer,” in Proc. IEEE Symp. Asynchron. Circuits Syst. (ASYNC), May 2010, pp. 75–84.

[13] L. Kleeman and A. Cantoni, “Metastable behavior in digital systems,” IEEE Design Test Comput., vol. 4, no. 6, pp. 4–19, Dec. 1987.

[14] D. Kinniment, Synchronization and Arbitration in Digital Systems. New York, NY, USA: Wiley, 2007.

[15] R. Ginosar, “Fourteen ways to fool your synchronizer,” in Proc. IEEE 9th Int. Symp. Asynchron. Circuits Syst. (ASYNC), May 2003, pp. 89–96.

[16] K. Kanda, K. Nose, H. Kawaguchi, and T. Sakurai, “Design impact of positive temperature dependence on drain current in sub-1-V CMOS VLSIs,” IEEE J. Solid-State Circuits, vol. 36, no. 10, pp. 1559–1564, Oct. 2001.



[17] S. Gerstendorfer and H. Wunderlich, “Minimized power consumption for scan-based BIST,” in Proc. Int. Test Conf., 1999, pp. 77–84.

[18] S. Ganesan and S. P. Khatri, “A modified scan-D flip-flop design to reduce test power,” in Proc. 15th IEEE/TTTC Int. Test Synth. Workshop (ITSW), Apr. 2008.

[19] N. Parimi and S. Xiaoling, “Design of a low-power D flip-flop for test-per-scan circuits,” in Proc. Can. Conf. Electr. Comput. Eng., vol. 2, May 2004, pp. 777–780.

[20] G. D. Wilk, R. M. Wallace, and J. M. Anthony, “High-κ gate dielectrics: Current status and materials properties considerations,” J. Appl. Phys., vol. 89, no. 10, p. 5243, 2001.

[21] (2014, Apr.). TSMC 28 nm Document Description [Online]. Available: http://www.tsmc.com/english/dedicatedFoundry/technology/28nm.htm

[22] (2014, Apr.). UMC 28 nm Document Description [Online]. Available: http://www.umc.com/english/process/i.asp

[23] (2014, Apr.). Global Foundries 28 nm Document Description [Online]. Available: http://www.globalfoundries.com/technology/28nm.aspx

[24] S. Chakraborty, T. Yoshida, T. Hashizume, H. Hasegawa, and T. Sakai, “Formation of ultrathin oxynitride layers on Si(100) by low-temperature electron cyclotron resonance N2O plasma oxynitridation process,” J. Vac. Sci. Technol. B, vol. 16, no. 4, p. 2159, 1998.

[25] J. Yugami et al., “Advanced oxynitride gate dielectrics for CMOS applications,” in Proc. Extended Abstracts Int. Workshop Gate Insul. (IWGI), 2003, pp. 140–145.

[26] S. Datta, “Recent advances in high performance CMOS transistors: From planar to non-planar,” in Proc. Electrochem. Soc. Interf., 2013, pp. 41–46.

[27] N. Agrawal, Y. Kimura, R. Arghavani, and S. Datta, “Impact of transistor architecture (bulk planar, trigate on bulk, ultrathin-body planar SOI) and material (silicon or III–V semiconductor) on variation for logic and SRAM applications,” IEEE Trans. Electron Devices, vol. 60, no. 10, pp. 3298–3304, Oct. 2013.

[28] J. M. Rabaey, Digital Integrated Circuits: A Design Perspective. Upper Saddle River, NJ, USA: Prentice-Hall, 1996.

[29] K. Mistry et al., “A 45 nm logic technology with high-k+metal gate transistors, strained silicon, 9 Cu interconnect layers, 193 nm dry patterning, and 100% Pb-free packaging,” in Proc. IEEE Int. Electron Devices Meeting (IEDM), Dec. 2007, pp. 247–250.

[30] B. S. Doyle et al., “High performance fully-depleted tri-gate CMOS transistors,” IEEE Electron Device Lett., vol. 24, no. 4, pp. 263–265, Apr. 2003.

[31] H.-S. P. Wong, D. J. Frank, and P. M. Solomon, “Device design considerations for double-gate, ground-plane, and single-gated ultra-thin SOI MOSFET’s at the 25 nm channel length generation,” in IEEE Int. Electron Devices Meeting (IEDM) Tech. Dig., Dec. 1998, pp. 407–410.

[32] J. Zhou, D. Kinniment, G. Russell, and A. Yakovlev, “Adapting synchronizers to the effects of on chip variability,” in Proc. IEEE Int. Symp. Asynchron. Circuits Syst. (ASYNC), Apr. 2008, pp. 39–47.

[33] S. Beer, R. Ginosar, J. Cox, T. Chaney, and D. Zar, “Metastability challenges for 65 nm and beyond; simulation and measurements,” in Proc. Design, Autom. Test Eur. Conf. Exhibit. (DATE), 2013, pp. 1297–1302.

[34] S. Beer and R. Ginosar, “Supply voltage and temperature variations in synchronization circuits,” Technion, Haifa, Israel, Tech. Rep. 4562954, 2013.

[35] L. Wei, Z. Chen, K. Roy, M. C. Johnson, Y. Ye, and V. K. De, “Design and optimization of dual-threshold circuits for low-voltage low-power applications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 7, no. 1, pp. 16–24, Mar. 1999.

[36] S. Srichotiyakul et al., “Stand-by power minimization through simultaneous threshold voltage selection and circuit sizing,” in Proc. 36th Design Autom. Conf. (DAC), 1999, pp. 436–441.

[37] J. Zhou, D. Kinniment, G. Russell, and A. Yakovlev, “Adapting synchronizers to the effects of on chip variability,” in Proc. IEEE Int. Symp. Asynchron. Circuits Syst. (ASYNC), Apr. 2008, pp. 39–47.

[38] S. Nassif et al., “High performance CMOS variability in the 65 nm regime and beyond,” in Proc. IEEE Int. Electron Devices Meeting (IEDM), Dec. 2007, pp. 569–571.

[39] (2006). International Technology Roadmap for Semiconductors (ITRS) [Online]. Available: http://www.itrs.net/

[40] D. J. Kinniment, C. E. Dikes, K. Heron, G. Russell, and A. Yakovlev, “Measuring deep metastability and its effect on synchronizer performance,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 9, pp. 1028–1039, Sep. 2007.

[41] S. Beer, J. Cox, T. Chaney, and D. M. Zar, “MTBF bounds for multistage synchronizers,” in Proc. IEEE 19th Int. Symp. Asynchron. Circuits Syst. (ASYNC), May 2013, pp. 158–165.

[42] A. Cantoni, J. Walker, and T.-D. Tomlin, “Characterization of a flip-flop metastability measurement method,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 5, pp. 1032–1040, May 2007.

[43] S. Beer, R. Ginosar, R. Dobkin, and Y. Weizman, “MTBF estimation in coherent clock domains,” in Proc. IEEE Int. Symp. Asynchron. Circuits Syst. (ASYNC), 2013, pp. 166–173.

[44] D. Wolpert and P. Ampadu, “A sensor to detect normal or reverse temperature dependence in nanoscale CMOS circuits,” in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst. (DFT), Oct. 2009, pp. 193–201.

[45] I. W. Jones, S. Yang, and M. Greenstreet, “Synchronizer behavior and analysis,” in Proc. 15th IEEE Int. Symp. Asynchron. Circuits Syst. (ASYNC), May 2009, pp. 117–126.

Shlomi Beer received the B.Sc. degree in electrical engineering and the B.A. degree in physics from the Technion–Israel Institute of Technology, Haifa, Israel, in 2004, where he is currently working toward the Ph.D. degree in computer engineering.

He held engineering and algorithmic development positions with Freescale Semiconductor, Austin, TX, USA, from 2005 to 2011. He has authored several publications and patents in the field of computer architecture, VLSI systems, and computer vision algorithms. He is a Haso-Plattner-Institut Fellow with Technion.

Ran Ginosar received the B.Sc. (summa cum laude) and Ph.D. degrees in electrical and computer engineering from the Technion–Israel Institute of Technology, Haifa, Israel, and Princeton University, Princeton, NJ, USA, in 1978 and 1982, respectively. His Ph.D. research focused on shared-memory multiprocessors.

He was with AT&T Bell Laboratories, Murray Hill, NJ, USA, from 1982 to 1983, and joined the Technion faculty in 1983. He was a Visiting Associate Professor with the University of Utah, Salt Lake City, UT, USA, from 1989 to 1990, and a Visiting Faculty with Intel Research Labs, Santa Clara, CA, USA, from 1997 to 1999. He is an Associate Professor with the Departments of Electrical Engineering and Computer Science and serves as the Head of the VLSI Systems Research Center at Technion. He has co-founded several companies in various areas of VLSI systems. His current research interests include VLSI architecture, manycore computers, asynchronous logic and synchronization, networks-on-chip, and biologic implant chips.
