TECHNISCHE UNIVERSITÄT MÜNCHEN
Lehrstuhl für Entwurfsautomatisierung (Institute for Electronic Design Automation)

Aging Analysis of Digital Integrated Circuits

Dominik Lorenz

Complete reprint of the dissertation approved by the Faculty of Electrical Engineering and Information Technology of the Technische Universität München for the award of the academic degree of Doktor-Ingenieur.

Chair: Univ.-Prof. Dr. sc.techn. Andreas Herkersdorf
Examiners of the dissertation:
1. Univ.-Prof. Dr.-Ing. Ulf Schlichtmann
2. Prof. Diana Marculescu, Ph.D., Carnegie Mellon University, PA, USA

The dissertation was submitted to the Technische Universität München on 31.01.2012 and accepted by the Faculty of Electrical Engineering and Information Technology on 24.04.2012.



  • Acknowledgments

    This thesis results from my work as a research assistant at the Institute for Electronic Design Automation at the Technische Universität München.

    First of all, I would like to thank Professor Ulf Schlichtmann for giving me the opportunity to do research at his institute and for encouraging me to work on this novel topic. His guidance and continued support, as well as the open and creative atmosphere at the institute, have been essential for the successful completion of this research project.

    I also would like to thank the second examiner, Professor Diana Marculescu, for her interest in my research.

    Most of the work would not have been possible without the valuable cooperation of the Infineon Technologies employees working together with me on the HONEY research project. Special thanks go to Georg Georgakos for his guidance and the fruitful discussions with him.

    It is a pleasure for me to thank my colleagues at the EDA institute for their collaboration and their friendship. It was a great time at the institute, which I will never forget.

    Finally, I want to express my heartfelt gratitude towards my wife, Nicole, and my little sunshine, Annika, for their continuous support or just for smiling when I come home.

  • Contents

    1. Introduction
        1.1. Objective of this thesis
        1.2. Semi-custom design flow
        1.3. Structure of the thesis

    2. Fundamentals
        2.1. (Static) timing analysis
            2.1.1. Gate models
            2.1.2. Timing graph
            2.1.3. Incremental timing analysis
            2.1.4. Sequential circuits
            2.1.5. Path enumeration
        2.2. State of the art of aging analysis
            2.2.1. Circuit level
            2.2.2. Gate level

    3. Aging effects and their impact on standard cells
        3.1. Aging effects
            3.1.1. Negative Bias Temperature Instability
            3.1.2. Hot Carrier Injection
            3.1.3. Stress conditions in CMOS logic gates
        3.2. Impact on gate performance
            3.2.1. Impact on combinational gates
            3.2.2. Impact on flip-flops
            3.2.3. Impact on power dissipation
        3.3. Technology trend
        3.4. Summary

    4. Aging-aware static timing analysis
        4.1. Aging-aware STA flow
        4.2. Workload determination
        4.3. AgeGate: Aging-aware gate model
            4.3.1. Canonical gate model
            4.3.2. Degradation equations
            4.3.3. Calculation of Stress Probabilities
        4.4. Characterizing the standard cells
            4.4.1. Obtaining the sensitivities
            4.4.2. Obtaining the internal gate structure
            4.4.3. Simplification of the gate model
        4.5. Results
            4.5.1. Waveform dependence of parameter drift
            4.5.2. Comparison of AgeGate, circuit-level simulation and measurements
            4.5.3. Aging analysis results
        4.6. Summary

    5. Identifying possible critical paths in aged circuits
        5.1. Prerequisites
        5.2. Identification of PCPs
            5.2.1. Slack reduction step
            5.2.2. Path delay reduction step
            5.2.3. Arrival time reduction step
            5.2.4. Delay to sink reduction step
            5.2.5. Common edge reduction step
            5.2.6. Removing edges and nodes
        5.3. Realistic aged path delays
            5.3.1. Gate delay interval
            5.3.2. Realistic aged path delays for an inverter chain
            5.3.3. Maximal aged path delay of a general path
            5.3.4. Minimal aged path delay for a general path
            5.3.5. Minimal aged circuit delay
            5.3.6. Use of minimal aged circuit delay in reduction steps
            5.3.7. Wrap-up
        5.4. Considering process variations
            5.4.1. Block-based statistical static timing analysis
            5.4.2. Representation of timing quantities
        5.5. Applications
            5.5.1. Aging-aware timing model for modules
            5.5.2. Monitoring of aging circuits
        5.6. Results
            5.6.1. Minimal aged delay
            5.6.2. Node and edge reduction
            5.6.3. Possible critical paths
        5.7. Summary

    6. Conclusion

    A. Constraints for NAND and NOR gates

    B. More detailed results for PCP identification

    Bibliography

    Acronyms

    List of Symbols

  • 1. Introduction

    In biology, aging of an organism is defined as a progressive, irreversible process that inevitably ends with death. The maximal lifetime of an individual is significantly affected by aging [Wikipedia, 2011].

    The same is true for integrated circuits (ICs). Aging effects cause the circuit performance to degrade and they have a significant impact on the specified lifetime of a circuit.

    Circuit aging can be regarded as a time-dependent variation. Aging is not the only variability the IC industry must cope with. In fact, variability has always been a fact of life in the IC industry. The reasons for variability can be classified into these three categories:

    Variations of the operating conditions: Primarily changes in supply voltage and operating temperature.

    Process variations: These denote deviations in process parameters from their nominal values that are present in an IC after it has been manufactured. Examples are variations in the concentration of dopants or the oxide thickness. In contrast to aging, manufacturing variations do not change over time once the IC has been manufactured.

    Time-dependent variations: These denote changes in the physical (and consequently, in the electrical) properties of an IC over time caused by aging effects.

    Variations of the operating conditions are handled during the design process by specifying a range (e.g. V_DD,min and V_DD,max) within which the IC has to meet the specified properties (e.g. frequency or power consumption). Process variations have traditionally been considered by specifying so-called process corners, which describe, e.g. for delay, the best or worst realistic combinations of process parameters, thus establishing generous guardbands against parameter variations. This modeling is increasingly considered to be problematic, and statistical design methodologies have therefore been proposed as a remedy for dealing with manufacturing variations. A detailed overview of this field is given in Blaauw et al. [2008].

    Time-dependent variation caused by aging effects, on the other hand, has received far less attention. Aging effects lead to a change of device parameters over time that depends on the operating conditions over the lifetime and on the workload. The workload defines the portion of the lifetime a device spends at a particular operating point. Negative bias temperature instability (NBTI), for instance, is regarded as the most severe aging effect nowadays. NBTI results in an increased threshold voltage (V_th) of PMOS transistors whenever the transistor is in inversion. The threshold voltage drift (ΔV_th) is accelerated by elevated temperature or supply voltage.

    The impact of variations on the circuit performance increases due to the continued technology scaling [Nassif, 2000]. The same absolute variation of the gate length, for instance, increases the relative variation, since the nominal gate length is scaled by a factor of 0.7× every two years according to Moore’s law [Moore, 1965]. The supply voltage is scaled as well. Therefore, a supply voltage variation or a threshold voltage variation has a larger impact on circuit performance. This holds even if a constant absolute variation is assumed for the different variability mechanisms. The variation caused by aging effects, however, is going to increase, since these effects strongly depend on the strength of the electric fields. The electric fields continue to increase with scaling, because the transistor sizes have been scaled more aggressively than the supply voltage for several technology generations¹.

    Variability is the reason why performance and power consumption vary from chip to chip and over time. To be able to still manufacture working and reliable products despite increasing variability, the performance guardbands must be increased or other techniques must be applied to make a product robust against variations. Examples of such techniques are dynamic voltage frequency scaling (DVFS) [Semeraro et al., 2002; Talpes and Marculescu, 2005; Herbert and Marculescu, 2009] or the use of redundant circuitry [Lyons and Vanderkulk, 1962]. As a result, the operating frequency is not as high as it could be, chip area is wasted and the power consumption is higher than necessary. Hence, conservative safety margins and variation-aware design techniques make the design of competitive products more difficult and diminish or even eliminate the advantages of moving to the next technology node. According to Austin et al. [2008], one way out of this dilemma is innovative design techniques that reduce the reliability costs again.
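The scaling argument made earlier in this chapter (a constant absolute variation set against a nominal gate length that shrinks by 0.7× per generation) can be illustrated with a short calculation. This is only a sketch: the absolute variation of 2 nm and the starting gate length of 90 nm are assumed values chosen for readability, not figures from this thesis. Only the scaling factor of 0.7× is taken from the text.

```latex
% Relative gate-length variation after n technology generations,
% assuming a constant absolute variation \Delta L.
\[
  \frac{\Delta L}{L_n} = \frac{\Delta L}{0.7^{\,n}\, L_0}
  \quad\Longrightarrow\quad
  \frac{\Delta L / L_{n+1}}{\Delta L / L_n} = \frac{1}{0.7} \approx 1.43
\]
% Example with the assumed values \Delta L = 2 nm and L_0 = 90 nm:
\[
  \frac{2}{90} \approx 2.2\,\%, \qquad
  \frac{2}{0.7^{2} \cdot 90} = \frac{2}{44.1} \approx 4.5\,\%
\]
```

Under these assumptions the relative variation roughly doubles within two generations, although the absolute variation has not changed.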

    1.1. Objective of this thesis

    The contribution of this thesis to reducing the reliability costs consists of methods to accurately analyze the timing degradation of a circuit caused by drift-related aging effects. This allows the safety margins to be tightened again.

    Within this thesis the following objectives have been set and achieved:

    • Investigate the impact of aging effects on transistors, how they can be modeled and on which parameters they depend. Furthermore, quantify the degradation of the properties of standard cells caused by aging effects.

    • Develop and implement an aging analysis to determine the timing degradation of ICs on gate level. The developed aging-aware gate model should consider the dominant aging effects.

    • Develop an aging analysis on higher abstraction levels. This enables considering aging in earlier design stages and for complex systems.

    • Furthermore, develop another approach that reduces the safety margins even further by enabling a better-than-worst-case design style. To ensure that such an aggressively designed circuit still works correctly during the specified lifetime, the degradation of the circuit caused by aging is periodically monitored and countermeasures are taken if the circuit ages too much.

    In the course of this thesis seven pre-publications [Lorenz et al., 2009a,b, 2010a,b,c,d, 2012] have been contributed to the scientific community. Furthermore, a patent for a time margin monitor for the assessment of aging and process variation was filed and granted [Henzler et al., 2009].

    ¹ Under the assumption that no breakthroughs are achieved to mitigate aging on technology level.

    1.2. Semi-custom design flow

    In the beginning of IC design in the early 1970s², circuit design was entirely manual work; even the layout was drawn by hand. Without the development of sophisticated electronic design automation (EDA) tools, however, the design of state-of-the-art ICs would not be possible.

    Figure 1.1 depicts a simplified design flow from a hardware description language (HDL) to a layout, also referred to as the register transfer level (RTL) to GDSII flow. The purpose of this figure is to illustrate where timing analysis (TA) is required and where an aging-aware TA would reduce the uncertainty of the delay prediction. Design flows are getting more and more complex, and according to Scheffer et al. [2006, Chapter 1] this trend continues, amongst other things, due to variability and reliability challenges:

    “The RTL to GDSII flow has undergone significant changes in the last 25 years. The continued scaling of CMOS technologies significantly changed the objectives of the various design steps. The lack of good predictors for delay has led to significant changes in recent design flows. Challenges like leakage power, variability, and reliability will continue to require significant changes to the design-closure process in the future”.

    Everything starts with a product specification, which includes constraints for performance, area, and power. Further constraints, especially in advanced technologies, are reliability and yield. The next step is to write a synthesizable description in an HDL (VHDL or Verilog). This representation at RTL is then transferred into a logic representation by logic synthesis [Sentovich et al., 1992]. A netlist of generic cells (e.g., NAND and NOT cells), which represent the logic function, is obtained and mapped to cells from a standard cell library. Next, the cells of the netlist are placed and the nets are routed. Before the chip can be processed, tested and packaged, the sign-off is performed by thoroughly verifying that the timing and other electrical performances meet the specification.

    ² The first microprocessor, Intel’s 4004, was fabricated in 1971.

    Figure 1.1.: IC design flow. [Design representations shown: specification, HDL, logic functions, netlist, layout, tape-out. Design steps shown: logic synthesis, technology mapping, place & route, sign-off.]

    It is very expensive and time consuming to process a chip. Therefore, it is not feasible to design a chip iteratively by processing it, testing it and making design changes. In fact, the IC industry is quite unique in relying heavily on abstract models for designing a product. There are, for instance, transistor models to simulate the voltage and current waveforms on circuit level, or gate models, which provide, amongst other things, the delay of the standard cells. The goal of a model is to provide all the information that is necessary on a particular abstraction level and to omit unimportant information. Only through abstraction is it possible to design state-of-the-art circuits with up to billions of transistors³. The models must be as accurate as possible to provide a good prediction of the performance, power and area of a design. Otherwise the final product might not meet the specification.

    ³ Intel’s Six-Core Core i7 CPU from 2010 has 1.17 · 10^9 transistors.

    TA is a crucial step during the design of a digital circuit. For complexity reasons, TA is done on gate level or even higher abstraction levels. Basically, the gate and wire delays along the longest path, the so-called critical path, are added up and it is verified whether the resulting circuit delay fulfills the timing specification. When a circuit ages, the gate delays increase and the circuit may violate the timing specification although the specification was met right after manufacturing (see Figure 1.2).

    Figure 1.2.: Aging-aware timing analysis of a circuit. Aging effects degrade transistor parameters, which results in increased gate delays over time. The critical path delay increases as well and the timing specification might be violated during the specified lifetime.

    In Chapter 3 it is shown that aging significantly degrades the gate delay. To consider this, a TA with an aging-aware gate model is required. Such an aging-aware TA is developed in this thesis.

    TA is required in many design flow steps, not just for the final timing sign-off. This enables the consideration of timing at every synthesis step, and the synthesis tool can optimize the design until the timing constraints are met. With each synthesis step, the available information becomes more accurate, which in turn increases the accuracy of the TA. Only the multi-level logic functions are known at the logic synthesis stage, and the circuit delay can only be roughly estimated by the logic depths of those functions. Technology mapping is the first step at which it is known which gates from the standard cell library are instantiated. From this step on, aging can be considered by an aging-aware gate model. The exact net length is available during the place and route synthesis stage, which increases the accuracy of the TA by knowing the parasitic capacitance and resistance of the nets. Finally, the coupling capacitances are available for timing sign-off, which again increases the accuracy of the TA. Hence, an aging-aware TA is beneficial at all synthesis steps from technology mapping on.
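The situation described by Figure 1.2 can be sketched in a few lines. The following is a minimal illustration, not the analysis developed in this thesis: the gate delays, the per-gate relative degradation and the clock period are hypothetical numbers, wire delays are omitted, and the function names are chosen only for this example.

```python
# Illustrative sketch only: all delay and degradation numbers are hypothetical.

def path_delay(gate_delays):
    """Add up the gate delays along one path (wire delays are omitted here)."""
    return sum(gate_delays)


def aged(gate_delays, relative_degradation):
    """Scale each fresh gate delay by (1 + its assumed relative degradation)."""
    return [d * (1.0 + r) for d, r in zip(gate_delays, relative_degradation)]


def meets_timing(gate_delays, clock_period):
    """Check whether the path delay fulfills the timing specification."""
    return path_delay(gate_delays) <= clock_period


CLOCK_PERIOD_NS = 1.00                    # assumed timing specification
fresh_ns = [0.20, 0.35, 0.25, 0.15]       # assumed fresh gate delays on one path
degradation = [0.05, 0.10, 0.08, 0.02]    # assumed end-of-life delay increase

aged_ns = aged(fresh_ns, degradation)
print("fresh:", path_delay(fresh_ns), meets_timing(fresh_ns, CLOCK_PERIOD_NS))
print("aged: ", path_delay(aged_ns), meets_timing(aged_ns, CLOCK_PERIOD_NS))
```

With these assumed numbers the path meets the specification when fresh but violates it once aged, which is the case a guardband or an aging-aware TA has to account for.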

    1.3. Structure of the thesis

    Chapter 2 discusses the fundamentals of TA and the state of the art of aging-aware timing analysis. Chapter 3 introduces the two dominant drift-related aging effects, NBTI and hot carrier injection (HCI). Their physical mechanisms are explained and it is shown how the device parameter degradation can be modeled. Furthermore, their impact on the gate performance is investigated. An aging-aware timing analysis flow is described in Chapter 4. Its basis is an aging-aware gate model called AgeGate. Accuracy benefits of the proposed approach are demonstrated on benchmark circuits. The degradation of a circuit strongly depends on the operating conditions and the workload. Chapter 5 shows methods to identify the paths of a circuit that might become critical without knowing the exact operating conditions and workload. Two applications are presented which use this information: an aging-aware timing model for modules and a methodology to design better-than-worst-case circuits by monitoring all possible critical paths and intervening if one of them degrades too much. Finally, the thesis is summarized in Chapter 6.

  • 2. Fundamentals

    2.1. (Static) timing analysis

    Timing analysis is required for many different steps during the design process of a digital circuit. The most obvious task for a TA is to determine the maximum clock frequency a circuit can operate at. Therefore, a TA as accurate as possible is needed for timing sign-off at the end of the digital design flow. A TA is also needed for circuit optimization. During synthesis and layout (placement as well as routing), timing analysis is performed in the inner optimization loop. This requires a timing analyzer that responds to several thousand timing queries as fast as possible (see incremental timing analysis in Section 2.1.3). When local optimizations are performed on a design (e.g., buffer insertion [Alpert et al., 1999]), the TA checks that no timing constraint is violated due to a local modification.

    The timing analysis of complex digital circuits with up to millions of gates is performed on gate level (or even higher abstraction levels), since a SPICE simulation on circuit level of such large circuits is too time consuming. The required input vectors for a SPICE or logic simulation are another problem. It is not practical to simulate a circuit for all possible input vectors. Nevertheless, a SPICE simulation is more accurate and can be used to verify the paths with the longest delays that are determined by a static timing analysis (STA).

    An STA has two main advantages compared to a timing simulation on circuit level. It is significantly faster, since a simplified gate model (see Section 2.1.1) and a simplified interconnect model are used. Furthermore, no input vectors are needed, because the logic function of a gate is not considered for the signal propagation. Instead, the propagation of signal arrival times just depends on the circuit topology. Bellido et al. [2006, Chapter 2] compare state-of-the-art gate models to a SPICE simulation. The average speed-up is three orders of magnitude and the mean error is 6.75%.

    An STA tool can operate in an early and a late mode. In late mode, the latest arrival times of a signal are determined. In early mode, on the other hand, the earliest time at which a signal transition can take place at a node is obtained. The circuit delay calculation and the verification of the setup time constraints (see Section 2.1.4) are performed in late mode. The hold time constraints are checked in early mode.

    2.1.1. Gate models

    For STA a gate model is needed to compute the gate delays. The gate model provides a delay for a falling and a rising input transition for each of its timing arcs. A timing arc is defined from a gate input to a gate output (see Figure 2.1(a)). Typically, it is assumed that the output transition is caused by the switching of just one input signal (single input switching assumption). A simultaneous transition at two or more inputs can significantly increase the gate delay. Hence, gate models that take simultaneous input switching into account are more accurate [Chen et al., 2001].

    [Figure: (a) NAND gate with a transition at input A. Timing arcs are depicted as lines. Labels in the figure: A, B, Z, C_L, "1". (b) Corresponding waveforms.]

    To obtain the gate delays, the gates of a standard cell library are pre-characterized by SPICE simulations. Those simulations are used to create a gate model. During the STA, just the gate model is evaluated. This is the reason why an STA is much faster than performing a SPICE simulation for the entire circuit.

    There are several techniques to model the gate delay. One of the first was to use the following equation [Sapatnekar, 2004, chap. 4]:

    $$ d = k_1 \cdot C_L + k_2 \tag{2.1} $$

    The gate delay is split into two parts. The dependence of the gate delay on the output load (C_L) is given by k_1 and the intrinsic gate delay is given by k_2. C_L is given by the input capacitance of succeeding gates and the interconnect capacitance. This quite simple model neglects the impact of the input slope (s_IN) on the gate delay.

    To consider the impact of the slope, signals are modeled as ramps for STA (see Figure 2.1(b)). A signal is defined by two values: the arrival time (AT) and the corresponding slope. The slope (s) is given by the transition time. This is the time a signal takes to change from logic “0” to logic “1”. Hence, bounds for the logic values have to be defined (e.g., 50% of V_DD for signal crossing and 20% and 80% of V_DD for transition time).

    A commonly used gate model is based on a look-up table (LUT). The industry quasi-standard, the Liberty file format from Synopsys, is such a LUT-based gate model. It stores the gate delays in 2-dimensional LUTs dependent on input slope and output load (see Figure 2.1):

    $$ d = f(s_{IN}, C_L) \tag{2.2} $$

    Values in between the stored values of the LUTs are obtained by interpolation. The input slope is now required in addition to the output load in order to compute the gate delay. For this reason, the output slope (s_OUT) is stored dependent on s_IN and C_L in LUTs as well. Now, the input slope of a gate can be calculated based on the output slope of its predecessor gate. An advantage of LUT-based gate models is that their accuracy can easily be increased by characterizing the gate at additional supporting points.
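The look-up described above can be sketched as a bilinear interpolation between the four supporting points that surround the query. This is a minimal sketch under stated assumptions: the text only says that intermediate values are obtained by interpolation, so the bilinear scheme is one common choice rather than the method of a particular tool, and the table values, axis points and function names are hypothetical.

```python
from bisect import bisect_right


def _cell(axis, x):
    """Index i of the grid cell [axis[i], axis[i + 1]] used for x (clamped)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lut_lookup(slopes, loads, table, s_in, c_load):
    """Bilinear interpolation of table[i][j], given at (slopes[i], loads[j]).

    Queries outside the characterized range are extrapolated linearly from
    the outermost grid cell.
    """
    i, j = _cell(slopes, s_in), _cell(loads, c_load)
    ts = (s_in - slopes[i]) / (slopes[i + 1] - slopes[i])
    tc = (c_load - loads[j]) / (loads[j + 1] - loads[j])
    return ((1 - ts) * (1 - tc) * table[i][j]
            + (1 - ts) * tc * table[i][j + 1]
            + ts * (1 - tc) * table[i + 1][j]
            + ts * tc * table[i + 1][j + 1])


# Hypothetical characterization data for one timing arc.
SLOPES_NS = [0.05, 0.20, 0.80]          # input slope axis
LOADS_FF = [1.0, 4.0, 16.0]             # output load axis
DELAY_NS = [[0.030, 0.060, 0.180],      # rows: input slope, columns: load
            [0.045, 0.075, 0.195],
            [0.090, 0.120, 0.240]]

print(lut_lookup(SLOPES_NS, LOADS_FF, DELAY_NS, s_in=0.125, c_load=2.5))
```

The same function can be applied to a second table holding the output slope, which then serves as the input slope of the succeeding gate.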

    Figure 2.1.: LUT-based gate model

    Due to the ongoing miniaturization, the input capacitance of the gates decreases and the resistance of the interconnect network increases. This leads to an increased inaccuracy when purely capacitive loads are assumed. For this reason, an effective capacitance was introduced by Qian et al. [1994]. The effective capacitance represents the complex interconnect network by a single value. This enabled the continued usage of the existing models.

    However, the signal waveform in advanced technologies differs significantly from a simple ramp (signals now have a long “tail”), which leads to inaccuracies as well. This is the reason why current source models (CSMs) have been developed. The goal of CSMs is to model the signal waveform more accurately by modeling gates as voltage-controlled current sources which charge the complex interconnect network and the fan-out gates.

    Several approaches have been published. The composite current source model (CCSM) [Synopsys, 2006] stores time-current waveforms in the LUTs. The effective current source model (ECSM) [Cadence, 2007] differs only slightly from the CCSM by storing time-voltage waveforms, which are converted back to current waveforms and applied to the interconnect network. CCSM and ECSM have the advantage that they are compatible with the existing timing analysis tools, and they were adopted quite quickly by the industry.

    Another CSM approach by Croix and Wong [2003] is to store the static output current depending on gate input voltage and gate output voltage in LUTs. By solving differential equations the voltage waveform at the succeeding gate input can be computed.

    The aging-aware gate model introduced in Chapter 4 is LUT-based. However, Knoth et al. [2011] show that the approach can be combined with a CSM [Knoth et al., 2010] to form an aging-aware CSM.
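The basic idea of a current source model, a static output current as a function of input and output voltage that is integrated over time to obtain the output waveform, can be sketched as follows. This is a toy illustration and not the model of Croix and Wong or of any tool: the current function, the supply voltage, the conductance, the load and the ramp time are all hypothetical, a closed-form expression stands in for the look-up table, the load is a single capacitor, and a plain forward-Euler step replaces a proper differential equation solver.

```python
# Toy current-source-model sketch. The current function below is hypothetical
# and stands in for the look-up table a real CSM would store.

VDD = 1.0         # supply voltage in V (assumed)
G = 1.0e-3        # drive conductance in S (assumed)
C_LOAD = 10e-15   # purely capacitive load in F (assumed)
T_RAMP = 50e-12   # input transition time in s (assumed)


def input_ramp(t):
    """Rising input: linear ramp from 0 V to VDD within T_RAMP."""
    return min(VDD, max(0.0, VDD * t / T_RAMP))


def output_current(v_in, v_out):
    """Hypothetical static current into the output node of an inverter."""
    pull_up = G * (VDD - v_out) * (1.0 - v_in / VDD)
    pull_down = G * v_out * (v_in / VDD)
    return pull_up - pull_down


def simulate(t_end=200e-12, dt=0.1e-12, v_out0=VDD):
    """Forward-Euler integration of C_LOAD * dV_out/dt = I(V_in, V_out)."""
    steps = int(round(t_end / dt))
    v_out = v_out0
    waveform = [(0.0, v_out)]
    for k in range(steps):
        t = k * dt
        v_out += dt * output_current(input_ramp(t), v_out) / C_LOAD
        waveform.append(((k + 1) * dt, v_out))
    return waveform


def crossing_time(waveform, level):
    """First sample time at which a falling waveform is at or below level."""
    for t, v in waveform:
        if v <= level:
            return t
    return None


wave = simulate()
t_out = crossing_time(wave, 0.5 * VDD)   # falling output crosses 50 % of VDD
t_in = 0.5 * T_RAMP                      # input ramp crosses 50 % of VDD
print(f"50 % delay of the toy gate: {(t_out - t_in) * 1e12:.1f} ps")
```

Unlike the ramp abstraction, the resulting waveform is a list of voltage samples, so a long tail of the transition is represented directly instead of being folded into a single slope value.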

    2.1.2. Timing graph

    A timing graph (TG) is used in STA tools to represent a combinational circuit. A TG is a directed acyclic graph (DAG): TG = (N, E). The nodes N of a timing graph are the gate inputs and outputs. These are connected by two types of edges E. The weights of edges connecting gate inputs with gate outputs are the gate delays for the corresponding timing arc. Edges between gate outputs and inputs of succeeding gates represent the delays caused by the interconnect network.

    The focus of this thesis is on aging effects causing a drift of transistor parameters. Hence, the passive interconnect network is not affected and is not considered in the course of this thesis. This enables us to simplify the timing graph. The nets of the gate-level netlist can be taken as nodes N and the weighted edges E correspond to gate delays.

    [Figure 2.2(a): Gate level netlist for ISCAS’85 circuit c17. Figure 2.2(b): Simplified timing graph for c17 with source node S, nodes 1 to 11 and sink node T (for every net just one node is added and not two, as it is described in the text).]

    Figure 2.2.: Circuit and corresponding timing graph

    The gate model provides a delay for a rising and a falling input transition. Hence, every TG edge has two edge weights. To be able to use unmodified standard graph algorithms, this should be avoided. A very clean and elegant way is described by Ju and Saleh [1991]: For every net two nodes are added to the timing graph, one for a rising transition and another one for a falling transition. If two nets, u and v, are connected by an inverting gate, the node u for a rising (falling) transition is connected to the node v for a falling (rising) transition. If it is a non-inverting gate, the node u for a rising (falling) transition is connected to v for a rising (falling) transition. That way every edge in the timing graph has just one edge weight.

    Two additional nodes are added to the TG: a source node (S), which is connected to all primary input (PI) nodes, and a sink node (T), to which all primary output (PO) nodes are connected (see Figure 2.2). To model unequal arrival times at the primary inputs, delays can be assigned to the edges from S to the PIs.
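The construction with two nodes per net plus the source and sink nodes can be sketched as follows. This is a minimal sketch, not the data structure of an actual STA tool: the arc tuple layout, the node naming `(net, "r")` / `(net, "f")`, the function name and the delay values are assumptions made for this example, and at most one timing arc between any two nets is assumed.

```python
def build_timing_graph(arcs, primary_inputs, primary_outputs):
    """Return {(from_node, to_node): delay} with one weight per edge.

    arcs: iterable of (in_net, out_net, inverting, d_rise_in, d_fall_in),
    one entry per timing arc. d_rise_in / d_fall_in are the gate delays for
    a rising / falling transition at the gate input.
    Nodes are (net, "r") and (net, "f"), plus the source "S" and sink "T".
    """
    edges = {}
    for u, v, inverting, d_rise_in, d_fall_in in arcs:
        # An inverting gate turns a rising input into a falling output.
        out_for_rise = "f" if inverting else "r"
        out_for_fall = "r" if inverting else "f"
        edges[((u, "r"), (v, out_for_rise))] = d_rise_in
        edges[((u, "f"), (v, out_for_fall))] = d_fall_in
    for net in primary_inputs:
        for transition in "rf":
            edges[("S", (net, transition))] = 0.0   # equal PI arrival times
    for net in primary_outputs:
        for transition in "rf":
            edges[((net, transition), "T")] = 0.0
    return edges


# Hypothetical example: a NAND gate (a, b -> n) followed by an inverter (n -> z).
arcs = [
    ("a", "n", True, 0.12, 0.10),
    ("b", "n", True, 0.14, 0.11),
    ("n", "z", True, 0.08, 0.07),
]
graph = build_timing_graph(arcs, primary_inputs=["a", "b"], primary_outputs=["z"])
for edge, delay in graph.items():
    print(edge, delay)
```

Unequal arrival times at the primary inputs would be modeled by replacing the zero weights on the edges leaving S with the corresponding delays.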

    2.1.3. Incremental timing analysis

    When the TG is annotated with gate delays as edge weights, the circuit delay can be determined. The circuit delay is defined by the path (P) with the longest path delay (D(P)). This path is called the critical path (Pcrit); its path delay is the critical path delay (D(Pcrit) or just Dcrit).

    The circuit delay can be determined by path-based or block-based methods. The path-based method enumerates all paths in the TG and computes their path delays by adding up the gate delays along each path. The path with the longest path delay is the critical path and determines the circuit delay. The path-based method has an exponential worst-case time complexity because the number of paths in a circuit increases (in the worst case) exponentially with the number of nodes.

    Figure 2.3.: Computation of the arrival time (AT).

    The block-based method propagates the arrival times (ATs) through the circuit, starting at S until T is reached. For a given node n, AT(n) is the latest point in time at which the signal at n can change¹. The arrival time of a node n can be calculated when the arrival times of all predecessor nodes i and the gate delays d of all incoming edges are known (see Figure 2.3):

    \[ \mathrm{AT}(n) = \max_{i \in \mathrm{predecessors}(n)} \bigl( \mathrm{AT}(i) + d((i, n)) \bigr) \tag{2.3} \]
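    A minimal sketch of how Equation 2.3 might be evaluated for a whole timing graph is given below. It assumes the graph is a dictionary `edges[u][v] = delay` and uses the topological sorter from the Python standard library; the function name `arrival_times` and the convention that nodes without predecessors start at time 0 are assumptions of this sketch.

```python
from graphlib import TopologicalSorter

def arrival_times(edges):
    """Propagate arrival times through a DAG given as edges[u][v] = delay.

    Returns (at, ctrl): the arrival time of every node and its controlling
    predecessor, i.e. the predecessor that attains the maximum in Eq. 2.3.
    """
    # Build the predecessor map: preds[v][u] = delay of edge (u, v).
    preds = {}
    for u, succs in edges.items():
        preds.setdefault(u, {})
        for v, d in succs.items():
            preds.setdefault(v, {})[u] = d
    at, ctrl = {}, {}
    # static_order() yields every node after all of its predecessors.
    order = TopologicalSorter({n: set(p) for n, p in preds.items()}).static_order()
    for n in order:
        if not preds[n]:
            at[n], ctrl[n] = 0.0, None  # assumption: sources start at time 0
            continue
        best = max(preds[n], key=lambda i: at[i] + preds[n][i])
        at[n] = at[best] + preds[n][best]
        ctrl[n] = best
    return at, ctrl
```

    Because nodes are processed in topological order, every arrival time is computed exactly once, from predecessors that are already final.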

    AT(T) corresponds to the circuit delay. In contrast to the path-based method, each node is visited just once; hence, the time complexity is O(|N|).

    The difference between the block-based and the path-based method is thus that the former calculates maximal arrival times for each node, whereas the latter computes all path delays first and then takes their maximum.

    Both methods add up the gate delays without considering the logic function of the gates. Hence, the critical path may not be sensitizable. A path is not sensitizable if there is no input assignment that enables a signal to propagate along the path (see Section 5.3.3). A path that is not sensitizable is called a false path. If the critical path is a false path, then the circuit delay is overestimated. The path-based method can easily recognize a false path by checking for every path whether it is sensitizable. For the block-based method this is more difficult, since one cannot easily determine the path with the next longest path delay if the critical path is a false path. An efficient method to enumerate the paths in order of their path delay is discussed in Section 2.1.5.

    When the static timing analyzer is used in the inner optimization loop, the design is often modified only slightly before the timing must be reevaluated. It would be very inefficient to analyze the complete design again in this case. Incremental timing analysis instead analyzes just the part of the timing graph that is affected by the change.

    The foundation of an incremental timing analysis is that every timing quantity (e.g., arrival time or gate delay) has a valid flag (e.g., ATvalid or dvalid). It is crucial that whenever the circuit, and therefore the timing graph, changes, the valid flags of the affected timing quantities are reset. This is done by two recursive functions, reset_node and reset_edge. In reset_edge, the controlling node of the arrival time (ATctrl) is needed. The controlling node is the predecessor node that defines the arrival time (i.e., the node i in Equation 2.3 that is responsible for the maximal arrival time at n).

    ¹ Or the minimal time at which a signal changes, if hold time constraints are to be checked.

    Function reset_node(node)
        /* Function to set the arrival time of a node to invalid */
        ATvalid(node) ← False;
        foreach successor suc of node do
            /* Delays of outgoing edges are invalid because the edge input slope is invalid */
            reset_edge(node, suc);
        end

    Function reset_edge(u, v)
        /* Recursive function to set the delay of an edge (u, v) to invalid */
        dvalid((u, v)) ← False;
        if ATctrl(v) == u then
            /* Arrival time at node v is invalid because it was controlled by edge (u, v) */
            reset_node(v);
        end

    Whenever a timing quantity is read, it first has to be checked whether it is still valid. If not, it must be recalculated. This is done by two recursive functions, update_node and update_edge. Assume that the circuit delay should be reevaluated after a design change. First, it is checked whether the arrival time at T is still valid. If this is the case, then the change did not affect the circuit delay. Otherwise, one has to proceed backwards into the timing graph, starting at T, until one reaches valid arrival times and gate delays, and recalculate AT(T) based on those values.

    The algorithm to calculate the circuit delay for an incremental timing analyzer is given in Algorithm 1. As an initialization step, the arrival time of the source node, which is equal to 0, must be set to valid. Then, the arrival time at the sink node is queried. The propagation of the arrival time from the source node to the sink node is done behind the scenes by update_node and update_edge.

    Algorithm 1: Circuit delay computation
        /* Set arrival time at source node to valid */
        ATvalid(S) ← True;
        /* Update arrival time at the sink node */
        update_node(T);
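    The interplay of the valid flags with the reset and update functions of this section can be sketched as a small Python class. This is a hedged illustration, not the implementation used in the thesis: the class name `IncrementalTimer` and the callable `delay_fn(u, v)`, which stands in for the slope- and load-dependent LUT lookup, are assumptions of this sketch.

```python
class IncrementalTimer:
    """Lazy arrival-time evaluation with valid flags (illustrative sketch)."""

    def __init__(self, edges, delay_fn, source="S"):
        self.succs, self.preds = {}, {}
        for u, v in edges:
            self.succs.setdefault(u, []).append(v)
            self.succs.setdefault(v, [])
            self.preds.setdefault(v, []).append(u)
            self.preds.setdefault(u, [])
        self.delay_fn = delay_fn          # hypothetical stand-in for the LUT
        self.at = {source: 0.0}
        self.at_valid = {source: True}    # Algorithm 1: AT(S) is valid
        self.ctrl, self.d, self.d_valid = {}, {}, {}

    def reset_node(self, node):
        self.at_valid[node] = False
        for suc in self.succs[node]:
            self.reset_edge(node, suc)

    def reset_edge(self, u, v):
        self.d_valid[(u, v)] = False
        # Mirrors the pseudocode: only a controlling edge invalidates AT(v).
        if self.ctrl.get(v) == u:
            self.reset_node(v)

    def update_edge(self, u, v):
        if not self.d_valid.get((u, v), False):
            self.d[(u, v)] = self.delay_fn(u, v)
            self.d_valid[(u, v)] = True
        return self.d[(u, v)]

    def update_node(self, node):
        if not self.at_valid.get(node, False):
            best, best_at = None, float("-inf")
            for i in self.preds[node]:
                cand = self.update_node(i) + self.update_edge(i, node)
                if cand > best_at:
                    best, best_at = i, cand
            self.at[node], self.ctrl[node] = best_at, best
            self.at_valid[node] = True
        return self.at[node]
```

    Calling `update_node("T")` corresponds to Algorithm 1. After a design change, only `reset_node` or `reset_edge` has to be called for the affected element; the next query of AT(T) recomputes just the invalidated part of the graph. Note that, like the pseudocode, the sketch does not invalidate AT(v) when a non-controlling edge changes.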

    Figure 2.4 shows an example of the incremental timing analysis. Due to a design change, the arrival time at node 6 is invalid, and as a result the other nodes marked red (or dark gray) are invalid as well. Now the circuit delay is reevaluated by calling update_node(T). This results in recursive calls of update_node for all invalid nodes down to node 6.

    Function update_node(node)
        /* Recursive function to update the arrival time of a node */
        if ATvalid(node) == False then
            AT(node) ← max over i ∈ predecessors(node) of (update_node(i) + update_edge((i, node)));
            ATctrl(node) ← predecessor i that attains the maximum;
            ATvalid(node) ← True;
        end
        return AT(node)

    Function update_edge(u, v)
        /* Recursive function to update the gate delay of an edge (u, v) */
        if dvalid((u, v)) == False then
            /* Update gate delay based on input slope and output load */
            slope ← get_slope_from_node(u);
            load ← get_load_from_node(v);
            d((u, v)) ← get_delay_from_LUT(slope, load);
            dvalid((u, v)) ← True;
        end
        return d((u, v))

    The methods to identify possible critical paths in an aged circuit, discussed in Chapter 5, continuously modify the TG by removing nodes and edges. Hence, without an incremental timing analysis, a complete STA would have to be performed whenever the TG is modified.

    There are several other timing quantities of interest. AT gives the maximal time a signal takes from the source node to a given node. Delay to sink (D2S), on the other hand, defines the maximal time a signal takes from a given node until it reaches the sink node. D2S is calculated as follows:

    \[ \mathrm{D2S}(n) = \max_{i \in \mathrm{successors}(n)} \bigl( \mathrm{D2S}(i) + d((n, i)) \bigr) \tag{2.4} \]

    To calculate D2S for all nodes, one starts at T and computes D2S for the predecessor nodes until S is reached.

    The required time (REQT(n)) is the latest time at which a signal must arrive at a node n such that it still arrives at T in time. Therefore, REQT at T must be specified first. REQT at a node n is the difference between REQT(T) and the D2S at n:

    \[ \mathrm{REQT}(n) = \mathrm{REQT}(T) - \mathrm{D2S}(n) \tag{2.5} \]

    The difference between required time and arrival time is called slack (SLACK):

    \[ \mathrm{SLACK}(n) = \mathrm{REQT}(n) - \mathrm{AT}(n) \tag{2.6} \]

    Figure 2.4.: Example of the incremental timing algorithm. The arrival time at red (dark gray) nodes is not valid. To update the arrival time at node T, all invalid arrival times are recursively updated (dashed arrows).

    A negative slack implies that the signal arrives at a node later than required to meet the required time at the sink node. The slack of a node is an important piece of information for circuit optimization.
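    Equations 2.3 to 2.6 can be combined into one short routine. The following sketch assumes a timing graph given as `edges[u][v] = delay`; the function name `slacks` and the convention that source nodes start at time 0 are assumptions for illustration.

```python
from graphlib import TopologicalSorter

def slacks(edges, req_at_sink):
    """Return SLACK(n) for every node of a DAG given as edges[u][v] = delay."""
    nodes = set(edges) | {v for succs in edges.values() for v in succs}
    preds = {n: {} for n in nodes}
    for u, succs in edges.items():
        for v, d in succs.items():
            preds[v][u] = d
    order = list(
        TopologicalSorter({n: set(p) for n, p in preds.items()}).static_order()
    )
    # Forward pass: arrival times (Eq. 2.3); sources start at 0 by assumption.
    at = {}
    for n in order:
        at[n] = max((at[i] + d for i, d in preds[n].items()), default=0.0)
    # Backward pass: delay to sink (Eq. 2.4); sinks have D2S = 0.
    d2s = {}
    for n in reversed(order):
        d2s[n] = max((d2s[v] + d for v, d in edges.get(n, {}).items()), default=0.0)
    # Required time (Eq. 2.5) and slack (Eq. 2.6).
    return {n: (req_at_sink - d2s[n]) - at[n] for n in nodes}
```

    With a required time equal to the circuit delay, all nodes on the critical path have zero slack; tightening the required time makes their slack negative.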

    2.1.4. Sequential circuits

    In contrast to a combinational circuit, a sequential circuit has storage elements in addition to logic gates. Hence, the output of a sequential circuit depends not only on the input signals but on the internal state as well. For synchronous sequential circuits, the output signals of the combinational logic that are fed back into the combinational logic are synchronized by a clock signal (see Figure 2.5).

    Due to its simplicity regarding design and verification, the common storage element used in synchronous sequential circuits is the flip-flop (FF). FFs capture the data signal at the active clock edge (in Figure 2.5 the rising transition is the active clock edge).

    Synchronous sequential circuits can be used to realize finite state machines. They can also be used to split complex combinational circuits into several parts. That way the performance of the circuit can be increased, since only the individual circuit parts must fulfill the timing constraints. This is called pipelining and is used, for instance, in microprocessors.

    To store a data value correctly in a flip-flop, the following two timing constraints have to be fulfilled (see waveform in Figure 2.5):

    • setup time (tSUP) is the time interval during which the data signal has to be stable before the active clock edge in order to sample the data value correctly. This can be verified during STA by the following inequality:

    \[ d_{CLK\text{-}to\text{-}Q} + D_{max} + t_{SUP} < t_{CLK} \tag{2.7} \]

    The clock-to-Q delay (dCLK-to-Q) is the delay from an active clock edge until the output of the sending FF changes. Dmax is the maximal delay of the combinational circuit to the receiving FF input.

    • hold time (tHLD) is the time interval during which the data signal has to remain stable after the active clock edge in order to sample the data value correctly. This can be checked by the following inequality:

    \[ d_{CLK\text{-}to\text{-}Q} + D_{min} > t_{HLD} \tag{2.8} \]

    Dmin is the minimal circuit delay to the receiving FF input. Dmin is obtained by the STA tool in the early mode.

    Figure 2.5.: Diagram of a sequential logic circuit (combinational logic between PI and PO, with a flip-flop in the feedback path clocked by Clk). The timing constraints (setup and hold time) of a flip-flop are given as well.

    The STA algorithm must be modified slightly to analyze sequential circuits. The flip-flops are removed from the netlist. Every signal connected to an FF input becomes a PO and every signal connected to an FF output becomes a PI. The remaining circuit is now purely combinational and the TG can be set up. The timing constraints for the flip-flops are considered by weights of edges to the sink node and from the source node. Edge weights from S to former FF outputs are set to dCLK-to-Q.

    To check the setup time constraints, the edge weights from former FF inputs to T are set to tSUP. If the maximal arrival time at the sink node is less than tCLK, then all setup time constraints are met.

    To check the hold time constraints, the edge weights from former FF inputs to T are set to tHLD. Now, if the minimal arrival time at the sink node is greater than tCLK, then all hold time constraints are met. The minimal arrival time at a node is calculated by simply exchanging the max-operation in Equation 2.3 with the min-operation.

    2.1.5. Path enumeration

    When a block-based STA is performed, the circuit delay is given by the arrival time at the sink node. The corresponding critical path can be obtained efficiently, because the controlling nodes are stored for the delays to sink. The controlling node of a node n is the successor node that is responsible for the maximal D2S at n. By following the path from each node to its controlling node, starting at S, the critical path is determined.

    Figure 2.6.: An example for calculating the branch slacks.

    However, often not only the critical path itself is of interest, but also the paths with the next longest path delays. These paths are required, for instance, to simulate their delay again on circuit level. This problem is referred to as the k most critical paths problem. Determining the next longest paths is not as easy as determining Pcrit in a block-based STA approach.

    Ju and Saleh [1991] propose an efficient way to compute the k most critical paths. One advantage of their algorithm is that k does not have to be specified in advance; the path enumeration can be suspended and continued as required. The key idea of the algorithm is the introduction of branch slacks (BSs).

    In an initialization phase, the BSs are calculated for every edge in the TG. For this purpose, the successor nodes v_i of a node u are sorted in descending order of the following cost function fcost:

    \[ f_{cost}(u, v_i) = d((u, v_i)) + \mathrm{D2S}(v_i) \tag{2.9} \]

    This is the maximal delay from node u to T over the edge (u, v_i). The branch slack is the difference between the cost functions of two nodes v_i and v_{i+1} that are next to each other in the sorted successor list of u:

    \[ \mathrm{BS}(u, v_i) = f_{cost}(u, v_i) - f_{cost}(u, v_{i+1}) \tag{2.10} \]
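    Equations 2.9 and 2.10 can be sketched for a single node u as follows. The function name `branch_slacks` and the choice to give the last successor in the sorted list no branch slack are assumptions of this illustration.

```python
def branch_slacks(succ_delays, d2s):
    """Compute the sorted successor list and branch slacks of one node u.

    succ_delays: {v_i: d((u, v_i))} for all successors v_i of u.
    d2s:         {node: D2S(node)}.
    Returns (order, bs), where order is sorted by descending f_cost and
    bs[v_i] = f_cost(u, v_i) - f_cost(u, v_{i+1}).
    """
    cost = {v: d + d2s[v] for v, d in succ_delays.items()}   # Eq. 2.9
    order = sorted(cost, key=cost.get, reverse=True)
    bs = {
        order[i]: cost[order[i]] - cost[order[i + 1]]        # Eq. 2.10
        for i in range(len(order) - 1)
    }
    return order, bs
```

    The first entry of `order` is the controlling successor of u; every branch slack is non-negative because of the descending sort.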

    The branch slack of an edge (u, v_i) states that the path with the next longest path delay that branches out from node u goes over the edge (u, v_{i+1}) and that its path delay is shorter by BS(u, v_i). Figure 2.6 shows the calculation of the branch slacks.

    In the path enumeration phase, the next longest paths are determined by means of the branch slacks. First, Pcrit is determined as discussed before. The path with the next longest path delay branches out of Pcrit at the edge (u, v_i) with the smallest branch slack. This path can be determined by branching off at u to v_{i+1} and following the controlling nodes of v_{i+1} recursively until the sink node is reached.

    Additional paths can be computed as follows. Suppose the path P_{k+1} with the next longest path delay should be determined. P_{k+1} can be generated by branching out at a branch point from one of the k already determined paths. For this purpose, a data structure list[i] is required, which keeps a list of branch points for every path P_i that has already been determined. This list is sorted according to the branch slacks. Hence, the branch point resulting in the path with the next longest path delay that branches out from P_i comes first in the list. The data structure next_delay is another sorted list, which contains the delay of the next longest path branching out from every already determined path P_i. The next longest path delay for P_i can be calculated as follows:

    \[ \mathrm{next\_delay}(P_i) = D(P_i) - \mathrm{BS}\ \text{of the first element in list}[i] \tag{2.11} \]

    When the next longest path should be determined, one takes the first path from next_delay and looks in list[i] for the first branch point of this path (see Algorithm 2).

    In Figure 2.7 an annotated TG with branch slacks and delays to sink is given. Table 2.1 shows the corresponding execution trace of the k most critical paths algorithm for the first five iterations. Given are the determined path and its delay, the branch points with the corresponding branch slacks, and the next longest path delay of a path branching out from this path. The first path is Pcrit with a path delay of 12. Pcrit has two branch points: S with BS = 1 and node 6 with BS = 2. The branch points are ordered in non-decreasing order with respect to the branch slack. Hence, next_delay is 11 (= D(Pcrit) − BS((S, 2))) and the corresponding path branches out from Pcrit at S. To determine the path in the second iteration, the path with the largest next_delay is taken. In this case there is only one next_delay; hence, the path in the second iteration branches out from Pcrit at S. The used branch point is crossed out (indicated by the arrow with the 2 on top, which stands for the iteration in which it is crossed out). The next_delay = 9 is computed for the second path, and a new next_delay for the first path must be calculated as well (indicated by the arrow with the 2 on top). The execution trace shows how the algorithm continues to determine the next three longest paths.

    Algorithm 2: k most critical paths
        P1 ← Pcrit;
        prepare list[1] and calculate next_delay(P1);
        k ← 1;
        while path enumeration not stopped yet do
            i ← path with longest next_delay;
            j ← first branch point in list[i];
            generate the next longest path Pk+1 by branching out from the j-th node on path Pi;
            prepare list[k + 1] and calculate next_delay(Pk+1);
            remove first element in list[i] and update next_delay(Pi);
            k ← k + 1;
        end
        return (P1, P2, . . . , Pk)
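    For comparison, the same output, paths in order of non-increasing delay with the option to stop and resume at any time, can be produced by a plain best-first search that uses D2S as an exact bound. The sketch below is deliberately not the branch-slack bookkeeping of Ju and Saleh [1991]; it is a simpler substitute technique, and the function name `longest_paths` and the graph format `edges[u][v] = delay` are assumptions of this illustration.

```python
import heapq
from itertools import count

def longest_paths(edges, d2s, source="S", sink="T"):
    """Yield (delay, path) for all source-to-sink paths, longest first.

    Best-first search over partial paths. A partial path ending in node n is
    ranked by (delay so far + D2S(n)), which is the delay of its longest
    completion. d2s must contain the delay to sink of every node.
    """
    tie = count()  # tie-breaker so that the heap never compares paths
    heap = [(-d2s[source], next(tie), 0.0, (source,))]
    while heap:
        _, _, delay, path = heapq.heappop(heap)
        node = path[-1]
        if node == sink:
            # No partial path left in the heap can be completed to a longer path.
            yield delay, list(path)
            continue
        for v, d in edges.get(node, {}).items():
            heapq.heappush(
                heap, (-(delay + d + d2s[v]), next(tie), delay + d, path + (v,))
            )
```

    Since the function is a generator, k does not have to be fixed in advance: each call to `next()` continues the enumeration where it stopped. The price compared to the branch-slack approach is that partial paths are stored explicitly in the heap.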

    Figure 2.7.: TG with branch slacks (arc between two edges) and delays to sink (number next to the node).

    k | path (delay)            | branch points (branch slack) | next_delay
    1 | S, 2, 6, 7, 10, T (12)  | S(1) [x2], 6(2) [x3]         | 11 -(2)-> 10
    2 | S, 4, 6, 7, 10, T (11)  | S(2) [x4], 6(2) [x5], 4(5)   | 9 -(4)-> 9
    3 | S, 2, 6, 8, 10, T (10)  | 8(2)                         | 8
    4 | S, 1, 7, 10, T (9)      | S(2)                         | 7
    5 | S, 4, 6, 8, 10, T (9)   | S(2), 8(2)                   | 7

    Table 2.1.: Execution trace of the k most critical paths algorithm for the five slowest paths. A branch point marked [xN] is crossed out in iteration N; "a -(N)-> b" denotes that next_delay changes from a to b in iteration N.

    The algorithm discussed so far is not only capable of enumerating all paths from S to T; it can also determine all paths from an arbitrary node to T. In order to enumerate all paths from the source node to an arbitrary node, the algorithm must be changed slightly. The most important change is the introduction of join slacks (JSs). Join slacks are quite similar to branch slacks: the join slack is the delay difference between two path segments from S to a given node.

    In this thesis, the k most critical paths algorithm is required in Chapter 5. It is used to consider common edges when the possible critical paths of a circuit are identified and to determine whether a possible critical path of an aged circuit is sensitizable.

    2.2. State of the art of aging analysis

    Several tools have been published that analyze the circuit performance degradation caused by aging effects on circuit level as well as on gate level [Liu et al., 2006].

    Tools that analyze the degradation caused by drift-related aging effects, such as NBTI and HCI, are discussed in the following. There are other tools as well that compute the impact on circuit reliability caused by electromigration (EM) [Blaauw et al., 2003] or radiation-induced soft errors [Miskov-Zivanov and Marculescu, 2008].

    2.2.1. Circuit level

    The general flow of tools that analyze the performance degradation on circuit level can be divided into the following three steps:

    1. The fresh circuit is simulated, and the current and voltage waveforms at the transistor terminals that are relevant for the prediction of the device degradation are stored.

    2. Those waveforms are used to generate degraded device models for each individual device.

    3. Finally, the degraded circuit performances are obtained by a second SPICE simulation with the aged device models.

    The first published reliability simulator is called Berkeley reliability tools (BERT) [Tu et al., 1993]. BERT is able to determine the performance degradation caused by HCI. Besides that, BERT can compute the probability that a circuit fails due to time-dependent dielectric breakdown (TDDB) and EM. In the first step, BERT determines the drain current Id(t), the gate current Ig(t), and the substrate current Isub(t). In the second step, a parameter AGE is determined for every transistor from Id(t), Ig(t), and Isub(t). AGE quantifies the amount of degradation:

    \[ \mathrm{AGE}_{NMOS} = \int_0^{t_{life}} \frac{I_d(t)}{W \cdot H_n} \left( \frac{I_{sub}(t)}{I_d(t)} \right)^{m_n} dt \tag{2.12} \]

    \[ \mathrm{AGE}_{PMOS} = \int_0^{t_{life}} \frac{1}{H_p} \left( \frac{I_g(t)}{W} \right)^{m_p} dt \tag{2.13} \]
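    The following sketch shows how Equations 2.12 and 2.13 might be evaluated numerically from sampled current waveforms. It is not BERT's implementation: the trapezoidal rule, the function names, and in particular the linear scaling of the integral from the simulated window to the full lifetime (which assumes that the simulated waveform repeats periodically) are assumptions of this illustration. Here the parameters `h_n`, `m_n`, `h_p`, and `m_p` stand for the technology-dependent constants of the equations, and `width` is the transistor width W.

```python
def trapezoid(ts, ys):
    """Trapezoidal integration of samples ys taken at times ts."""
    return sum(
        0.5 * (ys[k] + ys[k + 1]) * (ts[k + 1] - ts[k]) for k in range(len(ts) - 1)
    )

def age_nmos(ts, i_d, i_sub, width, h_n, m_n, t_life):
    """Eq. 2.12 on a short window, scaled linearly to t_life (assumption)."""
    integrand = [
        idk / (width * h_n) * (isk / idk) ** m_n if idk > 0 else 0.0
        for idk, isk in zip(i_d, i_sub)
    ]
    return trapezoid(ts, integrand) * t_life / (ts[-1] - ts[0])

def age_pmos(ts, i_g, width, h_p, m_p, t_life):
    """Eq. 2.13 on a short window, scaled linearly to t_life (assumption)."""
    integrand = [(igk / width) ** m_p / h_p for igk in i_g]
    return trapezoid(ts, integrand) * t_life / (ts[-1] - ts[0])
```

    Samples with zero drain current contribute nothing to the NMOS integral in this sketch, which avoids a division by zero in the current ratio.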


    H and m are determined experimentally for a given technology. W is the transistor width and tlife the lifetime. Of course, it is not possible to simulate the circuit for the entire lifetime tlife. Hence, the circuit is simulated for a shorter time interval, and AGENMOS and AGEPMOS are extrapolated.

    Two methods are implemented in BERT to determine the degraded device models. The degraded model is obtained either by interpolating between degraded device model cards characterized for particular AGE values, or by computing the parameter degradation ∆p of the aged device model card from functions of AGE:

    \[ \Delta p = f(\mathrm{AGE}) \tag{2.14} \]

    After generating the degraded device models, the degraded circuit performance can be simulated in the third step.

    Commercial reliability simulators, like RelXpert [Cadence, 2003], are already available, and the latest versions of HSPICE [Synopsys, 2008] and ELDO [Karam et al., 2001] come with an integrated reliability analysis. RelXpert can consider the impact of HCI and NBTI. ELDO is capable of determining the degraded device parameters iteratively. For this purpose, the specified lifetime is divided into n time intervals of equal length. Steps one and two are conducted in every time interval. That way, the impact of the degraded waveforms on the parameter drift can be considered.

    Maricau and Gielen [2010] analyze the combined impact of aging and process variation on circuit behavior. Like ELDO, theirs is an iterative approach, but the length of the time intervals is variable. In Section 4.5.1 it is shown by a simple experiment that such an iterative approach is not necessary (at least for digital circuits).

    A drawback of commercial tools like RelXpert and ELDO is that the degradation equations are proprietary. Hence, the user has to trust the tool and cannot verify how the degradation is calculated. Kufluoglu et al. [2010] show that RelXpert only reaches an acceptable accuracy when the proprietary degradation equations are replaced by improved user-defined equations.

    Reliability simulators on circuit level can be very accurate. However, a reliability simulation on circuit level is quite time consuming, and realistic input vectors are required. For the first step of the aging analysis, input vectors are needed that cause a realistic or worst-case degradation of the circuit. The third step requires input vectors to measure the degraded circuit performances. In general, the input vectors in the first and third step are not equal.

    Like SPICE simulators for timing analysis (see Section 2.1), these tools are not capable of simulating complex digital circuits. Nevertheless, they can be used to verify the critical aged path determined by an aging-aware timing analysis on gate level.

    2.2.2. Gate level

    Aged LUT-based gate models

    Although reliability simulators on circuit level are not applicable to the timing analysis of complex digital circuits, they can be used to characterize aged gate models.

    Chen et al. [2011] propose a path-based analysis flow, although the gate model can also be used for a block-based approach. HSPICE [Synopsys, 2008] is used to generate several aged LUTs for different conditions like lifetime, temperature, or signal probability. This approach results in a large number of LUTs, especially when the workload at the gate inputs should be considered. If, for instance, LUTs should be generated for five different signal probabilities, 5 LUTs would be enough for a gate with one input (see Figure 2.8). A gate with three inputs already needs 125 (= 5 · 5 · 5) LUTs, and there are gates in a standard cell library that have even more inputs.

    Figure 2.8.: Aged LUT-based gate model as proposed in [Chen et al., 2011].

    The aging-aware gate model GLACIER [Wu et al., 2000] considers HCI and defines a factor α as follows:

    \[ \alpha(s_{IN}, C_L, TD) = \frac{d_{aged}}{d_{fresh}} \tag{2.15} \]

    The aged gate delay daged and the fresh gate delay dfresh have to be simulated. dfresh depends on the input slope sIN and the output load CL. daged also depends on the transition density TD at the input. For a multiple-input gate, daged depends on TD at every input. To reduce the complexity, it is assumed that the gate delay for each input can be calculated by considering the contributions from the switching of all gate inputs separately from one another as follows:

    \[ \alpha = \left( \sum_{i=1}^{n} \alpha_i \right) - (n - 1) \tag{2.16} \]

    Here, n is the number of transistors connected in series and α_i is the contribution of one input pin i when just this input switches. However, this approach neglects the impact of the workload at the other inputs and of the internal gate structure on the parameter drift (see Section 4.3.3).

    When a reliability simulator on circuit level is used to characterize a gate library, the gate models are valid for just one specific use profile. Hence, the gate models are dependent on the use profile. If, for example, the specified lifetime changes, the entire library has to be re-characterized.


    Figure 2.9.: Gate delay degradation as a linear function of ∆Vth

    Aged gate delay as a function of parameter drift

    All other proposed gate models have in common that they consider only NBTI and that daged is the sum of dfresh and the degradation ∆d(∆Vth) as a function of the threshold voltage drift caused by NBTI:

    \[ d_{aged} = d_{fresh} + \Delta d(\Delta V_{th}) \tag{2.17} \]

    The advantage of such a gate model is that it is independent of the use profile and the workload, because they only impact the parameter drift, and the drift is computed during the analysis and not in advance during the gate model characterization.

    As long as the parameter drift caused by aging is small enough, a linear approximation for the dependence of ∆d on ∆Vth can be used (see Figure 2.9):

    \[ d_{aged} = d_{fresh} + \frac{\partial d}{\partial V_{th}} \cdot \Delta V_{th} \tag{2.18} \]

    Paul et al. [2006] use the α-power law [Sakurai and Newton, 1990] to obtain the sensitivity ∂d/∂Vth:

    \[ I_d \propto (V_{gs} - V_{th})^{\alpha} \tag{2.19} \]

    It is assumed that the gate delay is solely determined by recharging the output load (no intrinsic gate delay):

    \[ d = \frac{C_L \cdot V_{DD}}{I_d} = \frac{\mathrm{const.}}{(V_{gs} - V_{th})^{\alpha}} \tag{2.20} \]

    Differentiating the expression with respect to Vth results in:

    \[ \frac{\partial d}{\partial V_{th}} = \frac{\alpha \cdot d}{V_{gs} - V_{th}} \tag{2.21} \]
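    Equations 2.18, 2.20, and 2.21 can be checked numerically with a few lines of Python. The default values for Vgs, α, and the constant are made-up illustration values, not technology data from the thesis, and the function names are assumptions of this sketch.

```python
def delay(v_th, v_gs=1.0, alpha=1.3, const=1.0):
    """Gate delay according to Eq. 2.20 (illustrative parameter values)."""
    return const / (v_gs - v_th) ** alpha

def sensitivity(v_th, v_gs=1.0, alpha=1.3, const=1.0):
    """Analytic sensitivity of the delay to V_th according to Eq. 2.21."""
    return alpha * delay(v_th, v_gs, alpha, const) / (v_gs - v_th)

def aged_delay_linear(v_th, dv_th, **params):
    """Linear approximation of the aged delay according to Eq. 2.18."""
    return delay(v_th, **params) + sensitivity(v_th, **params) * dv_th
```

    Comparing `sensitivity(0.3)` with a central finite difference of `delay` around 0.3 confirms Equation 2.21, and comparing `aged_delay_linear(0.3, 0.01)` with `delay(0.31)` gives an impression of the error of the linear approximation for a small threshold voltage drift.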

    In contrast to that, Kumar et al. [2006] determine the dependence ∆d(∆p) by simulation and store the results in LUTs. Kumar et al. [2006] also describe how to calculate the threshold voltage drift iteratively based on the reaction diffusion (RD) equations for NBTI (see Section 3.1.1). However, this involves solving an equation for every stress and recovery phase during the lifetime and makes the calculation of the drift very inefficient, especially for long lifetimes. A third contribution is that arbitrary signals result in the same drift as periodic signals with the same signal probability and transition density. Hence, it is not necessary to know the exact waveform of the gate input signals; it is enough to know their signal probabilities and transition densities (see Figure 2.10). Otherwise, aging analysis would not be feasible, since the exact input signals are unknown when a circuit is developed.

    Figure 2.10.: Transformation of arbitrary signals into periodic signals with the same signal probability and transition density.

    Figure 2.11.: Drawing of an NBTI threshold voltage drift (∆Vth over time) caused by consecutive stress and relaxation phases (thin black line) and the ∆Vth drift given by the long term prediction model (thick orange line).

    Wang et al. [2007b] derive a closed-form equation to calculate the upper bound of the parameter drift caused by NBTI (see long term prediction model in Figure 2.11). Hence, the drift does not have to be calculated iteratively. It is also shown that NBTI has a negligible impact on the clock distribution network of a sequential circuit. For sequential circuits it is important that the clock distribution network has the same delay to the sending and to the receiving FFs. Only that way is it assured that the signals in the combinational logic have one full clock period to propagate from the sending to the receiving FFs. Wang et al. [2007b] argue that the clock period is unaffected by aging, because the clock signals to the sending and receiving FFs are delayed equally. However, clock gating is not considered. If the sending and receiving FFs are in separate clock domains, both clock signals can degrade differently. This would have to be considered during the analysis of sequential circuits.

    The gate model by Luo et al. [2007b] is based on the α-power law as well. It considers different temperatures in active and standby mode. In standby mode the transistors degrade as well, but due to the lower temperature and the exponential dependence of the parameter drift on temperature, the parameter drift is much smaller. In Section 4.3.3 it is shown how different temperatures can be considered for the gate model introduced in this thesis.

    Luo et al. [2007a] introduce a model that takes the stacking effect into account. The stacking effect describes the fact that not all transistors in a transistor stack have VDD as their gate-source voltage.

    All gate models so far have in common that they use just one value for ∆Vth, although in general ∆Vth differs for different transistors of a gate. Either ∆Vth is calculated for every transistor and the maximum is taken, or the ∆Vth of the transistor with an input transition is taken.

    Kumar et al. [2007a] show that the parameter drift of a NOR gate with two inputs depends on the signal probability at both inputs. However, this is shown only by example, and no formal algorithm is derived to calculate the parameter drift of arbitrary logic gates as a function of the signal probabilities at their inputs.

    Stempkovsky et al. [2009] do not propose a self-contained aging-aware gate model, but an algorithm to compute the time each individual transistor of a gate is in stress condition. It considers the signal correlation at the gate inputs. The model also takes into account that the supply voltage, which must be applied to the source and drain contacts of a PMOS transistor for it to be stressed due to NBTI, can come from the drain or the source contact (see Section 4.4.2).

    Aging effects are stochastic processes. NBTI, for instance, is caused by the breaking of Si-H bonds, which happens with a certain probability. This results in a distribution of the threshold voltage drift. Hence, two identical transistors that are stressed identically do not have the same threshold voltage drift. Kang et al. [2007] model the Vth variation of PMOS transistors and investigate its impact on SRAM cells and combinational logic. Lu et al. [2009] propose a statistical reliability analysis that jointly considers the impact of process variation and aging effects.

    Table 2.2 compares all aging-aware gate models discussed so far with the proposed aging-aware gate model, AgeGate.

    First optimization methods to minimize the impact of NBTI have been published. This can, for instance, be done by pin reordering and logic restructuring [Wu and Marculescu, 2009] or by controlling the signals at internal nodes when the circuit is idle [Bild et al., 2009].

    Table 2.2.: Comparison of state-of-the-art gate models with the proposed aging-aware gate model AgeGate.

    Gate model            | Description                                            | NBTI | HCI | Individual transistor drifts | Aged output slope | Use profile independent model
    [Chen et al., 2011]   | aged LUT                                               | yes  | yes | yes                          | no                | no
    [Wu et al., 2000]     | aged LUT                                               | no   | yes | yes (a)                      | yes               | no
    [Paul et al., 2006]   | α-power law                                            | yes  | no  | no                           | no                | yes
    [Kumar et al., 2006]  | simulated sensitivities                                | yes  | no  | no                           | no                | yes
    [Wang et al., 2007b]  | closed form expression for parameter drift             | yes  | no  | no                           | no                | yes
    [Luo et al., 2007b]   | different temperature in active and standby mode       | yes  | no  | no                           | no                | yes
    [Luo et al., 2007a]   | considers stacking effect                              | yes  | no  | no                           | no                | yes
    [Kumar et al., 2007a] | individual transistor drifts considered                | yes  | no  | yes (b)                      | no                | yes
    [Lu et al., 2009]     | jointly considers aging effects and process variation  | yes  | no  | no                           | no                | yes
    AgeGate               | based on canonical gate model                          | yes  | yes | yes                          | yes               | yes

    (a) Neglects the impact of the workload at other inputs and of the internal gate structure on the parameter drift.
    (b) Does not describe a formal way to calculate individual transistor drifts.

  • 3. Aging effects and their impact on standard cells

    The objective of this thesis is to develop methods to analyze the degradation of complex digital circuits due to aging. Prior to that, the aging effects and their impact on the performance of single gates are investigated.

    Aging effects can be classified into effects that cause a catastrophic failure of a device and effects that cause a drift of device parameters with time. For the analysis of the circuit degradation, the drift-related aging effects have to be taken into account. In addition, the amount of gate performance degradation due to an aging effect and the factors on which it depends¹ are investigated. This helps to decide which dependencies have to be modeled by the aging-aware gate model that is developed in Chapter 4.

    To determine the impact of aging effects on the degradation of the gate performance, the procedure is as follows (see Figure 3.1): The parameter drifts caused by aging effects and the sensitivity of a gate performance with respect to a parameter drift are obtained. Combining both pieces of information provides the degradation of the gate performance.

    Finally, it is identified how the degradation due to aging evolves over different process technologies. The parameter drifts due to HCI do not show a consistent trend, but it is shown that the circuits are becoming more and more sensitive to a parameter drift because of the reduced supply voltage.

    3.1. Aging effects

    Aging effects change device parameters with time. A distinction can be made between aging effects that lead to an abrupt, catastrophic failure and effects that lead to a device parameter drift. Representatives that lead to a catastrophic failure are TDDB and EM.

    TDDB can be split up into two phases [Lee et al., 2006]. The first phase is called soft breakdown (SBD). With time, traps are generated in the gate oxide, and these traps eventually form a conducting path through the oxide. Once a conducting path has been established, new traps are generated due to thermal damage. The new traps result in higher currents, the temperature in the oxide increases further, and even more traps are formed. This condition is called thermal runaway and finally leads to a hard breakdown (HBD), at which the transistor suddenly fails.

    The phenomenon that electrons carry metal atoms along a wire is called electromigration. EM causes shorts or opens in signal wires and especially in supply wires [Strong et al., 2009].

    ¹ e.g., dependency of the gate delay degradation on temperature and supply voltage


    [Figure 3.1, panel (a): |∆Vth| [V] versus supply voltage VDD [V]; 90nm; 10y; 125°C; W=10µm; Lmin. Panel (b): ∆delay (falling input) [%] versus |∆Vth| [V]; INV; 90nm; 27°C; 1.2V.]

    Figure 3.1.: 36mV Vth drift due to NBTI at 1.2V VDD (a). Sensitivity of the gate delay degradation to a threshold voltage drift (b). Hence, NBTI causes about 10% degradation of the output delay for a falling input transition.
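
    The two-step procedure illustrated in Figure 3.1 can be sketched in a few lines of Python. This is a minimal first-order sketch, not the method developed in this thesis: the function name and the linearization of the sensitivity curve are assumptions, and the numbers are only the approximate values stated in the caption of Figure 3.1.

```python
def delay_degradation_percent(delta_vth_mv, sensitivity_pct_per_mv):
    """First-order estimate: degradation = sensitivity * parameter drift.

    Assumes (hypothetically) that the sensitivity curve of Figure 3.1(b)
    can be linearized around the operating point.
    """
    return sensitivity_pct_per_mv * delta_vth_mv


# Approximate values from the caption of Figure 3.1:
# a 36 mV drift corresponds to about 10 % delay degradation,
# i.e. an average sensitivity of roughly 10/36 % per mV.
drift_mv = 36.0
avg_sensitivity = 10.0 / 36.0

print(f"{delay_degradation_percent(drift_mv, avg_sensitivity):.1f} % delay degradation")
```

    In practice the sensitivity is not constant, so a look-up of the simulated curve would replace the single coefficient used here.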

    Aging effects that cause a catastrophic failure have to be treated stochastically by computing a failure rate or a mean time to failure for a circuit.

    Aging effects that cause a parameter drift, on the other hand, can be treated deterministically. They cause a degradation of the transistor characteristics, which, in turn, leads to a degradation of the gate performance. This is the reason why drift-related aging effects have to be considered for an aging-aware timing analysis. The two dominant effects that cause a parameter drift are negative bias temperature instability (NBTI) and hot carrier injection (HCI). Both effects are described in detail in the following subsections.

    Unfortunately, the classification into drift-related aging effects and aging effects that cause a catastrophic failure is not as unambiguous as described so far. For the latter, a parameter drift can be observed as well before the catastrophic failure takes place. Due to electromigration, the resistance of a wire first increases and then an open is generated. For TDDB, conducting paths lead to a gradual increase of the gate current during the SBD phase before the transistor actually fails. If the time interval in which a parameter drift can be observed is short, this effect does not need to be considered for an aging-aware TA, because the device is going to fail within a short period of time anyway. Lee et al. [2006] show that the time between an SBD and an HBD is significant in advanced technologies. A gate model for the SBD phase of TDDB has already been proposed in [Choudhury et al., 2010]. The equivalent circuit used to model the impact of SBD on a transistor could also be used to incorporate SBD into the proposed aging-aware gate model discussed in Chapter 4. EM does not affect the gate itself, but the delay of signal lines and the voltage drop across supply lines. Hence, if EM becomes relevant, it must be considered in the wire load model for timing analysis.


    [Figure 3.2: schematic with the labels source, gate, drain, channel, and gate oxide, and with Si, O, and H atoms drawn at the interface.]

    Figure 3.2.: Cross section of a PMOS transistor.

    3.1.1. Negative Bias Temperature Instability

    NBTI is regarded as the most severe aging effect nowadays. It has been a research topic for the last 40 years [Miura and Matukura, 1966] and has gained increased interest in the last decade due to the problems it causes in modern semiconductor technologies [Entner, 2007].

    NBTI only affects PMOS transistors. The stress mode for NBTI is a negatively biased gate terminal with respect to source and drain. Hence, the transistor is in inversion. The main impact of NBTI on a PMOS transistor can be modeled by an increase of the absolute value of the threshold voltage. A (normally-off) PMOS transistor has a negative threshold voltage. Due to NBTI the threshold voltage becomes more negative. It could be misleading to say that NBTI decreases the threshold voltage, because a reduction of Vth (for NMOS transistors) implies a performance increase. The convention for this thesis is to say that NBTI increases (the absolute value of) the threshold voltage |Vth|. As the name negative bias temperature instability implies, NBTI is accelerated by an increased temperature and an increased supply voltage.

    Physical mechanism of NBTI

    There is still no consensus on the physical mechanism of NBTI. One quite popular theory is the reaction-diffusion (RD) model.

    According to Alam et al. [2007], NBTI originates from broken Si-H bonds at the interface between the substrate and the gate oxide. Figure 3.2 shows a cross section of a transistor. The substrate consists of crystalline silicon (Si). To isolate the gate from the substrate, a layer of silicon dioxide (SiO2) is grown upon the substrate. The gate itself consists of polycrystalline silicon. After the SiO2 layer is processed, dangling bonds remain at the Si/SiO2 interface. A dangling bond is a Si atom with an unsatisfied valence. Dangling bonds are called interface states. These states can capture charges and have a significant negative impact on the transistor performance. During the manufacturing of a chip, interface states are satisfied by hydrogen atoms (H). Those Si-H bonds can break up again during the NBTI stress mode. The generated interface states are responsible for the degradation of the transistor parameters. There are contradictory opinions about what happens with the released H atoms. It is still under discussion whether there is a diffusion of neutral H atoms, a diffusion of H2 molecules, or a drift of H+ ions in the direction of the gate. Alam et al. [2007] argue that H atoms react to H2 and H2 then diffuses.

    The generation of the interface states and the diffusion of the hydrogen can be modeled by an RD system. In an RD system two processes are involved: a local reaction and a diffusion (or drift) of the reaction products.

    The rate of interface state generation due to NBTI is given by the following equation:

    \[ \frac{dN_{it}}{dt} = \underbrace{k_F \, (N_0 - N_{it})}_{\text{generation}} - \underbrace{k_R \, N_H(0) \, N_{it}}_{\text{annealing}} \tag{3.1} \]

    N0 is the initial number of Si-H bonds, Nit is the number of interface states, and kF is the rate constant of broken bond creation (dissociation rate constant). NH(0) is the number of hydrogen atoms at the Si/SiO2 interface. The process of Si-H bond breaking can also be reversed. This is described by the second term. kR is the rate constant of the reverse annealing of a dangling bond and an H atom to a Si-H bond. This annealing or recovery effect is a special property of NBTI. It means that the number of interface states decreases again when the stress is removed.

    The creation of interface states is limited by the diffusion (or drift) of hydrogen. This is modeled by a second rate equation:

    \[ \frac{dN_{it}}{dt} = -D_H \frac{dN_H}{dx} + N_H \cdot \mu_H \cdot E_{ox} \tag{3.2} \]

    DH is the diffusion coefficient, µH is the mobility, and Eox is the electrical field across the oxide. The second term can be neglected for neutral atoms or molecules. kF, kR and DH are temperature dependent. kF depends on the electrical field as well. This means that an electrical field is required for the generation of interface states, but not for the annealing and the diffusion. Equations 3.1 and 3.2 form a system of partial differential equations. This system can either be solved numerically, or a closed-form equation can be derived if some justified assumptions are made:

    \[ N_{it}(t) = \sqrt{\frac{k_F N_0}{2 k_R}} \, (D_H t)^{1/4} \tag{3.3} \]

    The assumptions are that the rate of interface state generation is small and that Nit is much smaller than N0. The time exponent is 1/4 for H diffusion and 1/6 for H2 diffusion. The dependence of Vth on Nit is given by [Schroder and Babcock, 2003]:

    \[ V_{th} \propto -\frac{q \, N_{it}(\Phi_S)}{C_{ox}} \tag{3.4} \]

    Cox is the oxide capacitance and ΦS is the surface potential. By increasing Nit the absolute value of Vth is increased. Other device parameters also change due to the Vth drift:

    \[ I_d \propto (V_{gs} - V_{th})^2 \tag{3.5} \]
    \[ g_m \propto (V_{gs} - V_{th}) \tag{3.6} \]

    [Figure 3.3: Id [A] versus Vds [V] for |∆Vth| = 0mV, 33mV, 66mV, and 100mV; an arrow marks the direction of degradation.]

    Figure 3.3.: Output characteristic of a PMOS transistor for altered values of ∆Vth.
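
    The effect of a threshold voltage drift on the drain current according to Equation 3.5 can be illustrated numerically. The sketch below uses only the square-law proportionality; the bias point (Vgs = −1.2 V) and the fresh threshold voltage (−0.3 V) are hypothetical values chosen for illustration, while the drift values are those of the legend of Figure 3.3.

```python
def relative_drain_current(v_gs, v_th_fresh, delta_vth_abs):
    """Ratio I_d(aged) / I_d(fresh) for I_d proportional to (V_gs - V_th)^2.

    For a PMOS transistor both V_gs and V_th are negative, and NBTI makes
    V_th more negative by |delta V_th|.
    """
    v_th_aged = v_th_fresh - delta_vth_abs
    return ((v_gs - v_th_aged) / (v_gs - v_th_fresh)) ** 2


V_GS = -1.2        # hypothetical gate-source voltage [V]
V_TH_FRESH = -0.3  # hypothetical fresh threshold voltage [V]

for drift in (0.0, 0.033, 0.066, 0.100):
    ratio = relative_drain_current(V_GS, V_TH_FRESH, drift)
    print(f"|dVth| = {drift * 1e3:5.1f} mV -> I_d / I_d,fresh = {ratio:.3f}")
```

    Under these assumptions, a drift of 100 mV reduces the saturation current to (0.8/0.9)² of its fresh value, which is roughly a 21% loss.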

    The drain current Id is important for the performance of digital circuits, and the transconductance gm is relevant for analog circuits. Figure 3.3 shows the output characteristic of a PMOS transistor for altered values of ∆Vth.

    Unfortunately, the reaction-diffusion theory is not able to explain all properties of NBTI. The RD theory cannot model the temporal behavior of the recovery effect, the bias dependence of the recovery effect, and the dependency of the parameter drift on the duty cycle of the signal at the gate terminal [Grasser et al., 2009].

    One attempt to explain this is to extend the RD model by a second component [Islam et al., 2007]. Besides the creation of interface states, hole trapping might be responsible for the threshold voltage drift as well. The holes are trapped by already existing traps in the oxide. Another explanation is a two-stage model based on E’ centers [Grasser et al., 2009]. E’ centers are a well-known defect in SiO2. In the first stage, the E’ centers are charged and discharged. This explains the recovery effect. In the second stage, a dangling bond can be created at the Si/SiO2 interface by a positively charged E’ center.

    Modeling of NBTI

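    To compute the threshold voltage drift for NBTI, degradation equations from an industry partner are used:

    \[ \Delta V_{th} = A \cdot \exp\left(\frac{E_a}{k_B \cdot T}\right) \cdot V_{gs}^{\,b} \cdot t_{stress}^{\,n} \cdot \left(1 + \frac{C}{W}\right) \tag{3.7} \]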

    The drift depends on the temperature T, the gate-source voltage Vgs, the time tstress the transistor is in NBTI stress mode, and the transistor width W. A, Ea, kB, b, n and C are constants. The time dependence (n) is shown in Figure 3.4. Reported values for n in the literature are between 0.15 and 0.30 [Massey, 2004]. This could be a clue for H as well as for H2 diffusion. ∆Vth increases monotonically with time (without taking recovery into account). For an aging-aware timing analysis, this means that it is enough to verify that a circuit is fast enough at the end of the specified lifetime. Due to the power law, the drift increases very fast at the beginning and settles with time. Suppose n is 0.25. If a certain threshold voltage drift is reached after a time t1, it takes 16 · t1 to reach a threshold voltage drift twice as high.

    [Figure 3.4: ∆Vth [mV] versus lifetime [y], both axes logarithmic; 90nm; Vnom; 125°C; W=10µm; Lmin.]

    Figure 3.4.: Time dependence of Vth drift due to NBTI.

    [Figure 3.5: |∆Vth| [V] versus T [°C] for 1.08V, 1.2V, and 1.32V; 90nm; 10y; SP=0%; W=10µm; Lmin.]

    Figure 3.5.: Temperature dependence of ∆Vth for altered values of Vgs.

    The temperature dependence (see Figure 3.5) is modeled by the Arrhenius equation. The reported values for the activation energy Ea vary between 0.1 and 0.36 eV [Massey, 2004]. The voltage dependence is given by a power law. The higher the gate-source voltage, the higher the electrical field across the gate oxide and the resulting drift. For the drift, the temperature and voltage over the lifetime are important. From now on, they are referred to as effective temperature (Teff) and effective supply voltage (Veff), to distinguish them from the current temperature Tcurr and voltage Vcurr at the moment the circuit is analyzed. The current values of temperature and voltage define the sensitivities, as can be seen later in Section 3.2.1.
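
    The consequence of the power-law time dependence tstress^n in Equation 3.7 can be checked numerically. The following sketch assumes that all other factors of the equation stay constant over time; the function names are illustrative and not part of the degradation equations.

```python
def drift_ratio(time_ratio, n):
    """Ratio of the drifts at two stress times t2 and t1, with time_ratio = t2 / t1."""
    return time_ratio ** n


def time_factor_for_drift_multiple(n, multiple=2.0):
    """Factor by which the stress time must grow to multiply the drift by `multiple`."""
    return multiple ** (1.0 / n)


# Range of exponents reported in the literature, plus the RD values 1/6 and 1/4.
for n in (0.15, 1.0 / 6.0, 0.25, 0.30):
    factor = time_factor_for_drift_multiple(n)
    print(f"n = {n:.3f}: doubling the drift takes {factor:.1f} x t1")
```

    For n = 0.25 the factor is 16, as stated above; for the exponent 1/6 associated with H2 diffusion it would be 64.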


    [Figure 3.6: Vth [V] versus width [µm] for 120nm, 90nm, 65nm LP, and 65nm HP; Vnom; 125°C; 10y; SP=0% (wc); Lmin. Markers indicate the minimal width in the cell library.]

    Figure 3.6.: Transistor width dependence. Marked is the minimal transistor width used in the standard cell libraries.

    Only a vertical electrical field and no lateral field exists during the homogeneous stress mode for NBTI. The creation of interface states is uniformly distributed over the whole gate area, and a dependence on transistor sizes should not be observable. A dependence on transistor length for very short transistors is reported in the literature [Massey, 2004], but it is not modeled in the degradation equations. However, a transistor width dependence for small transistors is modeled by the degradation equations. Some kind of edge effects are assumed to be responsible for the dependence on transistor sizes. Figure 3.6 shows the transistor width dependence for different technologies. Marked are the minimal transistor widths used in the standard cell libraries. One can see that for some technologies (65nm LP) the transistor width actually affects the drift, and for other technologies (120nm, 90nm) the minimal transistor width used in the standard cell library is too large to have a significant effect on the transistor drift.

    NBTI strongly depends on the process technology as well. Manufacturing steps that have an impact on the NBTI drift are, for instance, the concentration of hydrogen, deuterium and nitrogen in the oxide, the gate material, and the initial quality of the Si/SiO2 interface [Schroder and Babcock, 2003].

    NBTI is a statistical process [Schlünder et al., 2011]. A Si-H bond is broken with a certain probability. Hence, the threshold voltage drift for defined stress parameters is a probability distribution. However, the degradation equations just provide the mean value for the drift. Rauch III [2002] shows that the standard deviation of the threshold voltage drift depends on the transistor area:

    \[ \sigma(\Delta V_{th}) \propto \frac{1}{\sqrt{W \cdot L}} \tag{3.8} \]

    It is also shown that ∆Vth due to aging and ∆Vth due to process variation are uncorrelated [Fischer et al., 2008].
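
    The area scaling of Equation 3.8 and the consequence of uncorrelated contributions can be sketched as follows. This is only an illustration: the function names and the reference-transistor formulation are assumptions, and the second function uses the general property that the variances of uncorrelated quantities add.

```python
import math


def sigma_scale(width, length, width_ref, length_ref):
    """Factor by which sigma(dVth) changes relative to a reference transistor,
    assuming sigma is proportional to 1 / sqrt(W * L) (Equation 3.8)."""
    return math.sqrt((width_ref * length_ref) / (width * length))


def combined_sigma(sigma_aging, sigma_process):
    """Standard deviation of the sum of two uncorrelated Vth contributions."""
    return math.hypot(sigma_aging, sigma_process)


# A transistor with four times the gate area shows half the spread.
print(sigma_scale(width=4.0, length=1.0, width_ref=1.0, length_ref=1.0))

# Hypothetical spreads in mV, combined in quadrature.
print(combined_sigma(3.0, 4.0))
```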


    [Figure 3.7: ∆Vth versus time.]

    Figure 3.7.: Drift over time for an AC stress.

    NBTI (apart from its counterpart, positive bias temperature instability, PBTI) is the only aging effect that shows a recovery effect. In the RD model, recovery can be explained by the second term in Equation 3.1. This term describes the reverse annealing of Si-H bonds. There is no consensus about whether the complete drift recovers or a permanent part remains [Massey, 2004]. What has been understood is that the recovery of a certain amount of drift takes substantially longer than the time needed to generate this drift. In [Grasser et al., 2009] a ratio of degradation to recovery of 2.5/1 on a logarithmic timescale is reported. This means, for instance, that when a threshold voltage drift is generated with 25mV/decade, the recovery has a slope of 10mV/decade.

    The recovery effect makes it more difficult to characterize NBTI and complicates the analysis of a circuit as well. To extract the constants for the degradation equation, single transistors are stressed under defined conditions and the resulting drifts are measured. Before the drift can be measured, the stress has to be removed. Reisinger et al. [2007] argue that a conventional measurement setup takes up to 1 s to obtain the threshold voltage drift. Hence, the transistor has 1 s to recover before the drift is measured. Reisinger’s proposed on-the-fly measurement takes just 1 µs, and it is shown that the drift has already recovered 50% of its value in the interval between 1 µs and 1 s. How much of the drift is recovered before 1 µs is unknown. 1 µs already seems sufficiently fast, but in a circuit that is operated at 1GHz the recovery time might be just 1ps. Hence, the error between the real drift value and the measured, already recovered value might be larger than 50%.

    The degradation due to NBTI is frequency independent, but it strongly depends on the duty cycle of the signal at the transistor gate. NBTI is a static aging effect. The drift is determined by the portion of the lifetime in which the gate voltage is negative with respect to source and drain, and not by the number of signal transitions (frequency). Although the degradation is frequency independent, a substantial difference between a DC and an AC stress is observed [Massey, 2004]. This is due to the recovery effect. For a DC stress the drift cannot recover; it increases monotonically. For an AC stress, the drift can recover in between the stress phases. This results in a sawtooth curve for the drift over time, as depicted in Figure 3.7. Because the drift builds up faster than it recovers, the mean of the drift increases monotonically.

    [Figure 3.8, panel (a): |∆Vth| [V] versus stress duty cycle [%]; 90nm; Vnom; 125°C; 10y; W=10µm; Lmin. Panel (b): ∆Vth [%], from [Baumann et al., 2010].]

    Figure 3.8.: Duty cycle dependence of NBTI.

    Figure 3.8(a) shows the dependence of the drift on the stress-duty-cycle as modeled by the degradation equations. For a stress-duty-cycle of 100%, the transistor is constantly stressed (DC stress) and the drift is maximal. For a stress-duty-cycle of 0%, the transistor is never in stress mode and no drift is observable.
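
    The difference between frequency and duty cycle can be illustrated with a small sketch. It assumes a PMOS transistor whose source is tied to VDD, so that the transistor is in NBTI stress mode whenever its gate is at logic 0, and it represents the gate signal by equidistant samples. Both this representation and the function name are hypothetical.

```python
def stress_duty_cycle(gate_levels):
    """Fraction of equidistant samples in which the PMOS gate is at logic 0.

    Assumes the source is tied to VDD, so a logic 0 at the gate means
    that the transistor is in NBTI stress mode.
    """
    stressed = sum(1 for level in gate_levels if level == 0)
    return stressed / len(gate_levels)


slow_clock = [0] * 50 + [1] * 50   # one signal period in 100 samples
fast_clock = [0, 1] * 50           # fifty signal periods in 100 samples
dc_stress = [0] * 100              # gate constantly at logic 0

print(stress_duty_cycle(slow_clock))  # same duty cycle as fast_clock
print(stress_duty_cycle(fast_clock))
print(stress_duty_cycle(dc_stress))
```

    Both clock signals yield the same stress-duty-cycle although their frequencies differ by a factor of 50, which is the quantity the degradation equations depend on.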

    Unfortunately, the degradation equations used in this thesis do not take the recovery effect into account. Figure 3.8(b) shows a measured curve of the stress-duty-cycle dependence with recovery for a 40nm technology [Baumann et al., 2010]. This curve has an S-shape, and the drift values for AC stress (stress-duty-cycle < 100%) are far below the drift for DC stress.

    Not being able to consider the recovery influences the accuracy of the proposed aging analysis results³. However, because the recovery effect has an impact on the characterization as well, it is not certain whether the results are too pessimistic or too optimistic. On the one hand, recovery is not taken into account for the dependency on the stress-duty-cycle. If it is assumed, for instance, that a transistor experiences a stress-duty-cycle of 50%, the degradation equations that are used provide a drift of about 80% of the maximal drift. By considering recovery, the drift would just be about 40% of the maximal drift (16mV/42mV from Figure 3.8(b)). Hence, the error of the analysis would be 50%.

    On the other hand, it must be considered that recovery makes the measurement of the drift more difficult. The drift values used to extract the parameters for the degradation equations were not determined by the on-the-fly measurement setup from [Reisinger et al., 2007]. This results in an error of at least 50% as well. In this case both errors would cancel each other out. The measurement underestimates the actual drift by 50%, because the drift has already recovered from its initial value by the time the measurement starts, and the analysis overestimates the drift by 50%, because recovery is not taken into account for the stress-duty-cycle dependence.
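
    The cancellation argument can be retraced with the numbers quoted above. The sketch is only a plausibility check under two assumptions that are not verified here: that the measurement loss of 50% applies uniformly to all characterized drift values, and that the 40nm data of Figure 3.8(b) is representative. The function name is illustrative.

```python
def net_predicted_fraction(model_fraction, measurement_loss):
    """Predicted drift relative to the true DC drift.

    model_fraction:   drift at a given duty cycle relative to the DC drift,
                      as provided by the degradation equations (no recovery).
    measurement_loss: fraction of the drift that had already recovered
                      when the characterization data was measured.
    """
    return model_fraction * (1.0 - measurement_loss)


# Values quoted in the text for a stress duty cycle of 50 %.
with_recovery = 16.0 / 42.0   # measured AC/DC ratio from Figure 3.8(b)
without_recovery = 0.80       # ratio given by the degradation equations
measurement_loss = 0.50       # recovery before a conventional measurement

predicted = net_predicted_fraction(without_recovery, measurement_loss)
print(f"actual fraction with recovery: {with_recovery:.2f}")
print(f"net predicted fraction:        {predicted:.2f}")
```

    Under these assumptions the prediction (0.40 of the true DC drift) lies close to the measured ratio (about 0.38), which is the cancellation described in the text.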

    ³ if the workload is taken into account


    Positive bias temperature instability

    NBTI occurs only for PMOS transistors. A similar aging effect for NMOS transistors is called PBTI. The stress condition for PBTI is that the NMOS transistor is in inversion. Hence, the gate terminal is positively biased with respect to source and drain.

    Before high-k metal gates were introduced, PBTI could be neglected. Since then, degradation due to PBTI is reported to be of the same order of magnitude as that due to NBTI [Tschanz et al., 2009].

    The developed aging analysis is based on a 90nm technology with SiO2 as gate dielectric. Hence, PBTI can be neglected. Nevertheless, there are no fundamental obstacles to considering PBTI as well in the proposed aging analysis methodology.

    3.1.2. Hot Carrier Injection

    Hot carrier injection (HCI) affects both NMOS and PMOS transistors. Carriers are accelerated until they have enough energy to overcome the potential barrier of the Si/SiO2 interface and leave the channel. A small number of those hot carriers damage the gate oxide and the interface or get trapped in the oxide and form space charges. Both mechanisms lead to a degradation of the transistor characteristics. The rest of the carriers contribute to the gate current. Hot carriers are holes or electrons that have gained a high kinetic energy from an electrical field. By secondary effects (e.g., electron-electron scattering) their energy can be further increased [Strong et al., 2009]. They are called “hot” because their energy is substantially higher than their energy in thermal equilibrium. The carriers are accelerated by the drain-source voltage Vds across the inverted channel. In the drain region the carriers have collected enough energy to overcome the potential barrier of the Si/SiO2 interface. Hence, HCI is an asymmetric aging effect that damages the drain region of a transistor.

    Physical mechanism of HCI

    Four different mechanisms for hot carrier generation and injection can be distinguished[Renesas, 2008]:

    • Drain avalanche hot carrier (DAHC)

    • Channel hot carrier (CHC)

    • Secondary generated hot carrier (SGHC)

    • Substrate hot carrier (SHC)

    DAHC and CHC are the two major mechanisms and are further discussed.

    Drain avalanche hot carrier

    High-energy carriers collide with Si atoms and generate electron-hole pairs by impact ionization (see Figure 3.9). Those generated carriers are themselves accelerated and can again cause impact ionization (avalanche multiplication). Some generated carriers are injected into the oxide or damage the interface. DAHC is maximal for Vds = 2 · Vgs.

    [Figure 3.9: transistor schematic with source, gate, and drain; labeled quantities Id, Ig, Vg, Vs, Vd.]

    Figure 3.9.: Drain avalanche hot carrier.

    Channel hot carrier

    This time, impact ionization is not the reason for carrier injection. For CHC (see Figure 3.10), the hot carriers themselves are injected into the oxide. They are accelerated in the direction of the gate by a high gate voltage. Some “lucky electrons” are able to overcome the potential barrier at the Si/SiO2 interface and enter the oxide. CHC is maximal for Vds = Vgs.

    [Figure 3.10: transistor schematic with source, gate, and drain; labeled quantities Id, Ig, Vg, Vs, Vd.]

    Figure 3.10.: Channel hot carrier.

    Modeling of HCI

    HCI damage can be modeled by an increase of the absolute value of the