
UNIVERSITY OF CALIFORNIA, BERKELEY

College of Engineering

Department of Electrical Engineering and Computer Science

Borivoje Nikolic Homework #1 Solutions EECS241 (Spring 2012)

Due Monday, February 27, 2012 in class

This is an individual assignment! The goal of this assignment is to get familiar with the class technology. It could be fairly long; knowledge of some scripting language (like Perl) could be useful.

1 Models

Use SPICE models to characterize a predictive 32nm LP CMOS process; parameter files are at http://www.eas.asu.edu/∼ptm/ and on the class home page. The nominal supply voltage for this process is 1V.

1.1

Determine the threshold voltage Vth for the NMOS and PMOS devices (for VBS = 0V, L = 32nm, and W = 1µm), by extrapolating from the ID-VGS curve at low VDS. Explain your circuit setup. How does this result compare to values reported in the model file and the DC OP analysis?

Solution (contributed by Rachel Nancollas):

In order to extrapolate Vth values, I swept Vgs of a pMOS and an nMOS transistor at different values of Vds. I then plotted these Ids-Vgs curves and fit straight lines to the curves below threshold and above threshold. I took the intersection of these lines as my extrapolated Vth value.

(a) nMOS (b) pMOS

Figure 1.1: This figure shows Vth values for an nMOS transistor (left) and pMOS transistor (right). The hashed line shows the Vth value from the model file, the solid line shows the Vth value ADE extracted for different values of Vds, and the dashed line shows the Vth value I extrapolated from the Ids-Vgs curves. As we expect, at higher Vds values, Vth decreases due to DIBL. Also, in both plots, we note that the Vth values we extrapolate actually increase at low Vds values before decreasing due to DIBL. The reason is that deep in the ohmic region, the Ids-Vgs curve is less linear above threshold, so my linear fits produce lower-than-expected Vth values. Overall, the nMOS Vth value is around 0.61V and the pMOS value is around 0.58V. The Vth values produced by the model, extrapolation, and ADE agree better at lower Vds values.
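The two-line extrapolation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`fit_line`, `extrapolate_vth`), the fitting windows, and the piecewise-linear data are all made up for the example and are not the simulation data or the script used in the solution.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_vth(vgs, ids, lo_window, hi_window):
    """Fit one line inside each Vgs window and return the Vgs where they cross."""
    def points_in(window):
        pts = [(v, i) for v, i in zip(vgs, ids) if window[0] <= v <= window[1]]
        return zip(*pts)

    m_lo, b_lo = fit_line(*points_in(lo_window))
    m_hi, b_hi = fit_line(*points_in(hi_window))
    return (b_lo - b_hi) / (m_hi - m_lo)


# Hypothetical data: zero current below a 0.6 V knee, linear above it.
vgs = [k / 20 for k in range(21)]                 # 0 V to 1 V in 50 mV steps
ids = [max(0.0, 1e-4 * (v - 0.6)) for v in vgs]   # amperes, illustrative only
vth = extrapolate_vth(vgs, ids, lo_window=(0.0, 0.5), hi_window=(0.75, 1.0))
print(f"extrapolated Vth = {vth:.3f} V")
```

With real simulation data the result depends on which Vgs ranges are chosen for the two fits, which is one reason the extrapolated values in Figure 1.1 differ from the model-file and ADE values.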


1.2

For the model used in class,

\[
I_{D,sat} = \frac{W}{L}\,\frac{\mu_{eff} C_{ox} E_C L}{2}\,\frac{(V_{GS} - V_{th})^2}{(V_{GS} - V_{th}) + E_C L}
\]

find the values of ECL that best fit the NMOS and PMOS characteristics. Use the Vth value from Problem 1.1.

Solution (contributed by Rachel Nancollas):

One of the transistor models that handles stacks more accurately in velocity saturation is given by:

\[
I_{D,sat} = \frac{W}{L}\,\frac{\mu_{eff} C_{ox} E_c L}{2}\,\frac{(V_{GS} - V_{th})^2}{(V_{GS} - V_{th}) + E_c L} \tag{1}
\]

In order to find the value of EcL, I extracted Ids-Vgs curves deep in saturation (Vds = Vdd) for nMOS and pMOS transistors. To find µ and Cox, I used values from the model file (u0, toxe, and epsrox), where $C_{ox} = \varepsilon_{ox}/t_{ox}$. I then used a least squares curve fit to fit Vth and EcL. (I originally used the Vth values calculated in the previous section, but due to DIBL, I found that I could get a better fit if I chose a Vth value that reflected the higher Vds.) The EcL values I found were about 100mV for the nMOS transistor and 200mV for the pMOS. Overall, these values seem somewhat low: based on calculating Vdsat values using EcL, I would expect EcL values closer to 600-800mV. I suspect the reason is that, as we saw in the first part of this question, the model file values are designed to fit SPICE/Spectre's transistor models and do not necessarily provide the best fit for our simplified models. Therefore, to improve my fit, I allowed the least squares curve fit to fit values for EcL, Vth, and K'. Doing this, I got more reasonable EcL values, and the Vth and K' parameters were also physically realistic. Specifically, for the nMOS transistor, I found EcL = 327.9mV, Vth = 483.8mV, and K' = $0.0018\,\mathrm{V}^{-1}$ (for comparison, the K' value I calculated from the model file was $0.0017\,\mathrm{V}^{-1}$). Likewise, for the pMOS transistor, I found EcL = 548.3mV, Vth = 450.5mV, and K' = $0.0012\,\mathrm{V}^{-1}$ (while the value from the model file was $0.00091\,\mathrm{V}^{-1}$).

(a) nMOS (b) pMOS

Figure 1.2: This figure shows Ids-Vgs curves for nMOS and pMOS transistors and the fit provided by equation 1 for the EcL values calculated by a least squares fit.
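As a rough illustration of fitting equation 1, the sketch below scans candidate EcL values and, for each one, solves for the lumped prefactor K' in closed form (the model is linear in K' once Vth and EcL are fixed). This is a simplified stand-in, not the lsqcurvefit-based procedure used in the solution; the function names, the grid, and the synthetic data (in arbitrary current units) are assumptions made for the example.

```python
def model_shape(vgs, vth, ecl):
    """(Vgs - Vth)^2 / ((Vgs - Vth) + EcL), clamped to zero below threshold."""
    vov = max(vgs - vth, 0.0)
    return vov * vov / (vov + ecl)


def fit_ecl(vgs_list, id_list, vth, ecl_grid):
    """Grid-search EcL; K' follows from linear least squares at each grid point."""
    best = None
    for ecl in ecl_grid:
        g = [model_shape(v, vth, ecl) for v in vgs_list]
        k = sum(i * gi for i, gi in zip(id_list, g)) / sum(gi * gi for gi in g)
        sse = sum((i - k * gi) ** 2 for i, gi in zip(id_list, g))
        if best is None or sse < best[0]:
            best = (sse, ecl, k)
    return best[1], best[2]


# Synthetic data generated from the model itself (hypothetical parameters).
TRUE_VTH, TRUE_ECL, TRUE_K = 0.45, 0.35, 2e-3
vgs_list = [0.5 + 0.05 * k for k in range(11)]          # 0.5 V to 1.0 V
id_list = [TRUE_K * model_shape(v, TRUE_VTH, TRUE_ECL) for v in vgs_list]

ecl_grid = [0.05 + 0.01 * j for j in range(96)]          # 50 mV to 1 V
ecl_fit, k_fit = fit_ecl(vgs_list, id_list, TRUE_VTH, ecl_grid)
print(f"EcL = {ecl_fit * 1e3:.0f} mV, K' = {k_fit:.4g}")
```

Holding Vth fixed, as done here, mirrors the first attempt described above; letting Vth vary as well would require an outer search or a general nonlinear solver.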

1.3

Draw the boundary between the linear and velocity saturation regions on the output I-V characteristics of the NMOS transistor. Clearly label the relevant points. How does it compare with the model?


Solution (contributed by Rachel Nancollas):

By plotting Ids vs. Vds for a family of Vgs curves, we can get a sense for when the transistors leave the linear region and become velocity saturated. According to the models introduced in class, Vdsat, the value of Vds at which the transistor becomes velocity saturated, is given by:

\[
V_{DSat} = \frac{(V_{GS} - V_{TH})\,E_C L}{(V_{GS} - V_{TH}) + E_C L} \tag{2}
\]

Therefore, in figure 1.3, I plot Ids-Vds curves and draw the boundary between the linear and velocity saturated regions.

(a) nMOS (b) pMOS

Figure 1.3: This figure shows Ids-Vds curves for different values of Vgs for nMOS and pMOS transistors. The circles on these curves represent the VDSat values and corresponding currents as calculated using the EcL values from the previous section. Overall, these Vdsat values provide a fairly reasonable mark between linear and saturation, but they are perhaps a bit low because the EcL values we computed were slightly lower than expected. The dashed line represents the boundary between linear and velocity saturation if we let EcL = 600mV for the nMOS and 800mV for the pMOS. I suspect the reason my EcL (and thus Vdsat) values are a bit low has to do with the sensitivity of the least squares curve fit to initial conditions when fitting a large number of parameters.
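To make equation 2 concrete, here is a small worked sketch that evaluates Vdsat for a few gate voltages using the nMOS fit values quoted in Problem 1.2 (EcL = 327.9mV, Vth = 483.8mV). The function name `vdsat` and the chosen Vgs points are illustrative assumptions.

```python
def vdsat(vgs, vth, ecl):
    """Equation 2: Vdsat = (Vgs - Vth) * EcL / ((Vgs - Vth) + EcL)."""
    vov = vgs - vth
    return vov * ecl / (vov + ecl)


NMOS_VTH = 0.4838   # volts, from the three-parameter fit in Problem 1.2
NMOS_ECL = 0.3279   # volts, from the same fit

for vgs in (0.6, 0.8, 1.0):
    print(f"Vgs = {vgs:.1f} V -> Vdsat = {vdsat(vgs, NMOS_VTH, NMOS_ECL) * 1e3:.0f} mV")
```

Because Vdsat has the form of two quantities combined "in parallel", it is always smaller than both the overdrive Vgs - Vth and EcL, which is why a lower fitted EcL pulls the boundary markers toward lower Vds.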

1.4

We will try to extract the parameters Vth and α for the alpha-power-law model, ID = K(VGS − Vth)^α, from SPICE simulations. Assume WN = 1µm and WP = 1.5µm, and get at least 10 simulation points. Then, use Matlab to determine K, Vth and α for both NMOS and PMOS transistors. Hint: use the lsqcurvefit function.

Solution (contributed by Rachel Nancollas):

Another commonly used model for fitting Id above threshold is the alpha power law model. This model, which has an entirely empirical fit parameter α, is given by:

\[
I_{DSat} = K'(V_{GS} - V_{TH})^{\alpha} \tag{3}
\]

To find the K', VTH, and α values, I plotted Ids-Vgs curves for nMOS and pMOS transistors in saturation and used a least squares curve fit in MATLAB. The fits and extracted parameters are shown in figure 1.4.


(a) nMOS (b) pMOS

Figure 1.4: This figure shows Ids-Vgs curves for an nMOS (left) of W = 1µm and a pMOS (right) of W = 1.5µm. The fits were produced using equation 3 and a least squares curve fit. The resulting fit equations are listed on the plot.
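One way to see how the three alpha-power parameters separate is to note that, for a fixed trial Vth, equation 3 becomes a straight line in log-log coordinates: ln(ID) = ln(K') + α ln(VGS − Vth). The sketch below scans Vth and does a linear regression at each step. It is an assumption-laden alternative to the MATLAB lsqcurvefit fit used in the solution (it minimizes error in the log domain, which weights points differently), and the data are synthetic.

```python
import math


def fit_alpha_power(vgs_list, id_list, vth_grid):
    """Return (Vth, alpha, K') minimizing the log-domain squared error."""
    ys = [math.log(i) for i in id_list]
    n = len(ys)
    mean_y = sum(ys) / n
    best = None
    for vth in vth_grid:
        if min(vgs_list) <= vth:
            continue  # log(Vgs - Vth) is undefined at or below threshold
        xs = [math.log(v - vth) for v in vgs_list]
        mean_x = sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        alpha = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
        ln_k = mean_y - alpha * mean_x
        sse = sum((y - (ln_k + alpha * x)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, vth, alpha, math.exp(ln_k))
    return best[1], best[2], best[3]


# Synthetic saturation data from hypothetical parameters (11 points).
TRUE_K, TRUE_VTH, TRUE_ALPHA = 1e-3, 0.5, 1.3
vgs_list = [0.6 + 0.04 * k for k in range(11)]
id_list = [TRUE_K * (v - TRUE_VTH) ** TRUE_ALPHA for v in vgs_list]

vth_grid = [0.30 + 0.01 * j for j in range(26)]   # 0.30 V to 0.55 V
vth_fit, alpha_fit, k_fit = fit_alpha_power(vgs_list, id_list, vth_grid)
print(f"Vth = {vth_fit:.2f} V, alpha = {alpha_fit:.2f}, K' = {k_fit:.3g}")
```

On real simulation data the three parameters trade off against each other, so the recovered Vth from this kind of fit need not match the extrapolated Vth from Problem 1.1.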

1.5

By setting α = 1, find the best Vth's that correspond to linear dependence of current on VGS.

Solution (contributed by Rachel Nancollas):

Although the alpha power law model can provide a reasonably good fit to Ids-Vgs curves, the empirical nature of α does not provide much physical insight and makes calculations using the model cumbersome. Therefore, it is not uncommon to use a coarse estimate where α = 1:

\[
I_{DSat} = K'(V_{GS} - V_{TH}) \tag{4}
\]

To find the Vth values for which this model fits my Ids-Vgs data, I again used a least squares curve fit. In this case, I modified my fitting function so that Ids = 0 when the voltage was below threshold and IDSat = K'(VGS − VTH) above threshold. Using this model, I produced the fits shown in figure 1.5. The Vth values that provided the best linear fit were Vth = 557.5mV for the nMOS transistor and Vth = 546.99mV for the pMOS transistor.


(a) nMOS (b) pMOS

Figure 1.5: This figure shows the Ids-Vgs curves and corresponding α = 1 power fits for an nMOS and pMOS transistor. The extracted Vth values are displayed on the plots. Compared to the Vth values found in the first part (which were also calculated using linear fits), these values are a bit lower. This makes sense because the Ids curves in this plot were found in deep saturation (Vds = Vdd), so as we would expect from figure 1.1, the Vth values are lower because of DIBL.


2 Transistor Sizing

Figure 2.1: Schematics used in Problem 2, panels (a), (b), and (c).

2.1

Using SPICE and the 32nm LP model with 1V supply, find the required width of the PMOS transistor that minimizes the propagation delay (tp,HL + tp,LH)/2 for the inverter in Figure 2.1(a).

Solution (contributed by Rachel Nancollas):

When sizing transistors, we generally like to make the impedance of the pull up network equal to that of the pull down network to minimize delay. With our old 141 long channel, unified model, this corresponded to making the pMOS twice as wide as the nMOS transistor. However, for short channel devices, because pMOS devices have lower mobility, they have larger EcL values (consistent with the fits in Problem 1.2) and enter velocity saturation later. Therefore, they act like long channel devices (with lower on resistance) for a wider range of Vgs values. Essentially, this means approximating pMOS devices as having twice the on resistance is too pessimistic, so while the pMOS device should be wider, it should not be twice as wide. To demonstrate this and find the optimal pMOS width, I did a simulation where I made a string of ten inverters and swept the width of the pMOS transistor. I then looked at the average propagation delay of the devices as a function of the pMOS width, as shown in figure 2.2. Based on this plot, we see that the optimal pMOS width is about 1.32µm, which corresponds to β = 1.32.


Figure 2.2: This figure shows the average propagation delay in an inverter as a function of the width of the pMOS transistor (when the nMOS transistor has a width of 1µm). The plot is concave up because delay is minimized when the pMOS and nMOS transistors have equal on resistance.

2.2

Find the required width W for the NMOS transistors in Figure 2.1(b) such that the equivalent resistance of the pull-down network is the same as the equivalent resistance of the pull-down network in Figure 2.1(a). Use hand analysis with ECL = 0.8V, VDD = 1.0V, and Vth = 0.4V.

Solution (contributed by Rachel Nancollas):


2.3

Using a hand calculation and ECL = 0.8V, find the W in Figure 2.1(c) that results in the equivalent pull-down resistance as in Problem 2.2.


Solution (contributed by Rachel Nancollas):

2.4

For the circuit in Figure 2.1(b), plot the switching trajectories in the IDS-VDS plane of the transistors, showing the path that transistors traverse. Do this in two cases: (1) when the top transistor is switching, and the bottom is on; and (2) when the bottom is switching, and the top one is on.

Solution (contributed by Rachel Nancollas):

I simulated the NAND gate and plotted the intrinsic current through the stacked nMOS transistors as a function of the Vds drop across the transistors. I did this for the case where the input of the top transistor switches from low to high (while the bottom one is on) and the case where the bottom transistor switches while the top is on. These switching trajectories are shown in figure 2.3.

(a) Top Switching (b) Bottom Switching

Figure 2.3: This figure shows the switching trajectories of the top and bottom stacked transistors in a NAND gate when the top transistor switches last (left) and the bottom transistor switches last (right). For the next part of the discussion, please refer to the schematics I have drawn in my notes.

(Left) As discussed, initially, the top transistor has a large Vds, but as C1 begins discharging and C2 charges a little, Vds and Vgs decrease and the current through the top transistor decreases. At the same time, the bottom transistor has constantly high Vgs, so when Vds increases a little (when C2 charges a bit), the current increases. However, this transistor always remains in the linear region and the current quickly goes back to zero as C2 discharges.

(Right) For the top transistor, there is initially charge sharing between C1 and C2, so Vds starts lower. However, as the bottom transistor turns on, C2 discharges, so Vds and Vgs of the top transistor increase. As the bottom transistor continues conducting, both C1 and C2 discharge, so the current through (and Vds across) the top transistor continues to decrease. Due to charge sharing, the bottom transistor initially has some Vds, and as the transistor turns on, the current increases sharply. However, as current begins conducting through the bottom transistor, C2 begins discharging, which makes the Vds and current through the bottom transistor decrease.

3 Variability

Consider the sequential circuit shown in Figure 3.1, which consists of flip-flops $F_i$ and identical complex gates $G_i$. You may assume that the clock network is ideal throughout this problem, and the clock period is set to $T_{clk} = 5$ ns.

3.1

What is the maximum number of complex gates $n_{max}$ that the critical path can have and still meet the clock period constraint if $t_{F,\text{clk-q}} = 0.2$ ns, $t_{F,setup} = 0.15$ ns, $t_{F,hold} = 0.3$ ns, $t_{G,max} = 0.45$ ns, and $t_{G,min} = 0.4$ ns? What is the minimum number of complex gates $n_{min}$ that the shortest path must have to meet the hold time constraint? If we follow these restrictions on the minimum and maximum logic depth, what will be the yield of this chip?

Figure 3.1: Schematic for Problem 3 (flip-flops F1 and F2 clocked by CLK, with complex gates G1, G2, …, Gn between them).

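Solution:

The required clock period $T_{clk}$ constrains the maximum number of gates that can be between the registers, and can be used to solve for the maximum number of gates as follows:

\[
\begin{aligned}
T_{clk} &\ge t_{\text{clk-q}} + t_{setup} + t_{logic,max} \\
        &\ge t_{\text{clk-q}} + t_{setup} + n_{max}\, t_{G,max} \\
\Rightarrow n_{max} &= \left\lfloor \frac{1}{t_{G,max}} \left(T_{clk} - t_{\text{clk-q}} - t_{setup}\right) \right\rfloor \\
        &= \lfloor 10.33 \rfloor \\
        &= 10 \text{ gates}
\end{aligned}
\]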

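Similarly, the hold time constraint is used to find the minimum number of gates that must be between the flip-flops:

\[
\begin{aligned}
t_{hold} &< t_{\text{clk-q}} + t_{logic,min} \\
         &< t_{\text{clk-q}} + n_{min}\, t_{G,min} \\
\Rightarrow n_{min} &= \left\lceil \frac{1}{t_{G,min}} \left(t_{hold} - t_{\text{clk-q}}\right) \right\rceil \\
         &= \lceil 0.25 \rceil \\
         &= 1 \text{ gate}
\end{aligned}
\]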

The yield of this chip would be 100% because the constraints are satisfied and there is no randomness in the timing parameters.
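The deterministic setup and hold bounds above amount to a floor and a ceiling. As a quick arithmetic check, the following sketch recomputes them with the timing values given in the problem; the constant names are chosen only for this example.

```python
import math

# Timing parameters from Problem 3.1, in nanoseconds.
T_CLK = 5.0
T_CLK_Q = 0.2
T_SETUP = 0.15
T_HOLD = 0.3
T_G_MAX = 0.45
T_G_MIN = 0.4

# Setup: t_clk-q + t_setup + n * t_G,max must fit within the clock period.
n_max = math.floor((T_CLK - T_CLK_Q - T_SETUP) / T_G_MAX)

# Hold: t_clk-q + n * t_G,min must exceed the hold time.
n_min = math.ceil((T_HOLD - T_CLK_Q) / T_G_MIN)

print(f"n_max = {n_max} gates, n_min = {n_min} gate(s)")
```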

3.2

In an actual chip, each gate's timing parameters will vary due to process variations. As we saw in class, this is usually modeled as a zero-mean Gaussian random variable added to the ideal timing parameter. Using the following models for the timing parameters, what is the maximum number of complex gates that can be in the critical path while having the path meet the $T_{clk}$ constraint 99% of the time? Similarly, what is the minimum number of complex gates that can be in the shortest path while meeting the hold time constraint 99% of the time? Assume that all timing parameters are independent and identically distributed (i.i.d.).

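\[
\begin{aligned}
t_{F,\text{clk-q}} &\sim \mathcal{N}(0.2\,\text{ns},\ 0.05\,\text{ns}^2) \\
t_{F,setup} &\sim \mathcal{N}(0.15\,\text{ns},\ 0.04\,\text{ns}^2) \\
t_{F,hold} &\sim \mathcal{N}(0.3\,\text{ns},\ 0.05\,\text{ns}^2) \\
t_{G,max} &\sim \mathcal{N}(0.45\,\text{ns},\ 0.1\,\text{ns}^2) \\
t_{G,min} &\sim \mathcal{N}(0.4\,\text{ns},\ 0.07\,\text{ns}^2)
\end{aligned}
\]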


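Solution:

To find $n_{max}$, we first need to model the maximum delay of the path $T_{max}$. The sum of independent Gaussian random variables $X_i$ is Gaussian with a mean equal to $\sum_i \mu_i$ and variance $\sum_i \sigma_i^2$. Therefore, the distribution of the maximum delay of the path is:

\[
\begin{aligned}
T_{max} &\sim \mathcal{N}\!\left(\mu_{\text{clk-q}} + \mu_{setup} + n_{max}\,\mu_{G,max},\ \sigma^2_{\text{clk-q}} + \sigma^2_{setup} + n_{max}\,\sigma^2_{G,max}\right) \\
        &\sim \mathcal{N}(0.35 + 0.45\,n_{max},\ 0.09 + 0.1\,n_{max})
\end{aligned}
\]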

Now, we are interested in when $T_{max}$ satisfies the $T_{clk}$ constraint, which is when $T_{max} < T_{clk}$. We want the cumulative probability of this event to be 0.99. In order to find the value of $n_{max}$ that makes this true, we transform our problem to that of finding the argument $x$ that makes the standard normal CDF $\Phi(x)$ equal to 0.99:

\[
\begin{aligned}
\Pr[T_{max} < T_{clk}] = 0.99 &\Rightarrow \Pr[\mathcal{N}(0.35 + 0.45\,n_{max},\ 0.09 + 0.1\,n_{max}) < 5] = 0.99 \\
&\Rightarrow \Pr\!\left[\mathcal{N}(0, 1) < \frac{5 - (0.35 + 0.45\,n_{max})}{\sqrt{0.09 + 0.1\,n_{max}}}\right] = 0.99
\end{aligned}
\]

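Next, we use a standard normal CDF table to find the value of $x$ that makes $\Phi(x) = 0.99$, which gives $x = 2.33$. Therefore,

\[
\begin{aligned}
x &= \frac{5 - (0.35 + 0.45\,n_{max})}{\sqrt{0.09 + 0.1\,n_{max}}} \\
2.33 &= \frac{5 - (0.35 + 0.45\,n_{max})}{\sqrt{0.09 + 0.1\,n_{max}}}
\end{aligned}
\]

So,

\[
n_{max} = \lfloor 6.02 \rfloor = 6 \text{ gates}
\]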

We can solve for the minimum number of gates to satisfy the hold time constraint with a probability of 0.99 in an analogous way. We are interested in the value of $n_{min}$ that makes the probability that $T_{min} = t_{\text{clk-q}} - t_{hold} + n_{min}\,t_{G,min}$ is greater than zero (which satisfies the hold time constraint) equal to 0.99. We start by finding the distribution of $T_{min}$:

\[
\begin{aligned}
T_{min} &\sim \mathcal{N}\!\left(\mu_{\text{clk-q}} - \mu_{hold} + n_{min}\,\mu_{G,min},\ \sigma^2_{\text{clk-q}} + \sigma^2_{hold} + n_{min}\,\sigma^2_{G,min}\right) \\
        &\sim \mathcal{N}(-0.1 + 0.4\,n_{min},\ 0.1 + 0.07\,n_{min})
\end{aligned}
\]

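Again, we transform the CDF of the distribution to that of a standard normal; since we need $\Pr[T_{min} > 0] = 0.99$, we set the CDF at zero equal to 0.01:

\[
\begin{aligned}
\Pr[T_{min} > 0] = 0.99 &\Rightarrow \Pr[T_{min} < 0] = 0.01 \\
&\Rightarrow \Pr[\mathcal{N}(-0.1 + 0.4\,n_{min},\ 0.1 + 0.07\,n_{min}) < 0] = 0.01 \\
&\Rightarrow \Pr\!\left[\mathcal{N}(0, 1) < \frac{-(-0.1 + 0.4\,n_{min})}{\sqrt{0.1 + 0.07\,n_{min}}}\right] = 0.01
\end{aligned}
\]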


Since we are now interested in the value of $x$ that makes $\Phi(x) = 0.01$, we can use the standard normal table or the symmetry of $\Phi(x)$ to get that in this case $x = -2.33$. Now, we can solve for $n_{min}$:

\[
\begin{aligned}
x &= \frac{0 - (-0.1 + 0.4\,n_{min})}{\sqrt{0.1 + 0.07\,n_{min}}} \\
-2.33 &= \frac{0 - (-0.1 + 0.4\,n_{min})}{\sqrt{0.1 + 0.07\,n_{min}}}
\end{aligned}
\]

So,

\[
n_{min} = \lceil 3.76 \rceil = 4 \text{ gates}
\]

The main takeaway from this problem is that the constraints get much tighter when some variability is included in the analysis.
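Instead of solving the two equations for a real-valued gate count and rounding, the same answers can be found by evaluating the pass probability at each integer gate count. The sketch below does this with the means and variances from Problem 3.2; it computes Φ from `math.erf` and uses the exact 0.99 threshold rather than the table value 2.33, and the function names are assumptions made for the example.

```python
import math


def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_setup_ok(n, t_clk=5.0):
    """Pr[t_clk-q + t_setup + sum of n gate max-delays < t_clk], all i.i.d."""
    mean = 0.2 + 0.15 + 0.45 * n
    var = 0.05 + 0.04 + 0.1 * n
    return phi((t_clk - mean) / math.sqrt(var))


def p_hold_ok(n):
    """Pr[t_clk-q - t_hold + sum of n gate min-delays > 0], all i.i.d."""
    mean = 0.2 - 0.3 + 0.4 * n
    var = 0.05 + 0.05 + 0.07 * n
    return phi(mean / math.sqrt(var))


n_max = max(n for n in range(1, 50) if p_setup_ok(n) >= 0.99)
n_min = min(n for n in range(1, 50) if p_hold_ok(n) >= 0.99)

print(f"n_max = {n_max} (pass probability {p_setup_ok(n_max):.4f})")
print(f"n_min = {n_min} (pass probability {p_hold_ok(n_min):.4f})")
```

The integer scan also shows how quickly the margin erodes: one gate more than n_max, or one fewer than n_min, drops the pass probability well below 99%.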

3.3

The previous model of i.i.d. timing parameters might be too optimistic, so we can amend the model to include the effect of some systematic variations in the form of correlated variations. In this case, we will model each of the complex gates as having correlated distributions with all other complex gates with a correlation coefficient of ρ = 0.3. You keep the flip-flop timing parameters as i.i.d. random variables. What is the probability that your critical path from Problem 3.2 does not meet the $T_{clk}$ constraint? What is the probability that your minimum path from Problem 3.2 does not meet the hold time constraint? Hint: look up the multivariate Gaussian distribution, covariance matrices, and linear combinations of Gaussian random variables.

Solution:

The key part of this problem is to realize that a linear combination of jointly Gaussian random variables (which all the timing parameters in this problem are) is Gaussian. Let $X$ be a vector whose components are jointly Gaussian random variables. Then, if we take a linear combination of $X$ ($Y = a^T X$), then:

\[
\begin{aligned}
\mu_Y &= a^T \mu_X \\
\Sigma_Y &= a^T \Sigma_X\, a
\end{aligned}
\]

In this case, we have eight Gaussian random variables (one for clk-q, one for setup, and six for gate delays). The $a$ vector will be an 8×1 vector with all entries equal to 1, so the mean of the sum will be

\[
\mu_Y = 0.2 + 0.15 + 6 \cdot 0.45 = 3.05
\]

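For the correlated gates, since $\rho = \sigma_{XY}/(\sigma_X \sigma_Y)$ and $\sigma_X \sigma_Y = \sigma^2_{G,max} = 0.1$, we get $\sigma_{XY} = 0.3 \cdot 0.1 = 0.03$. The covariance matrix of $X$ is:

\[
\Sigma_X =
\begin{pmatrix}
0.05 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.04 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.1 & 0.03 & 0.03 & 0.03 & 0.03 & 0.03 \\
0 & 0 & 0.03 & 0.1 & 0.03 & 0.03 & 0.03 & 0.03 \\
0 & 0 & 0.03 & 0.03 & 0.1 & 0.03 & 0.03 & 0.03 \\
0 & 0 & 0.03 & 0.03 & 0.03 & 0.1 & 0.03 & 0.03 \\
0 & 0 & 0.03 & 0.03 & 0.03 & 0.03 & 0.1 & 0.03 \\
0 & 0 & 0.03 & 0.03 & 0.03 & 0.03 & 0.03 & 0.1
\end{pmatrix}
\]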


which means the variance of $Y$ is $\sigma_Y^2 = 1.59$. Now that we have the mean and variance of the maximum delay through the path, we can follow a similar procedure as in Problem 3.2, except now we have the argument to the standard normal CDF, and we want to find the value of the CDF at that point. In this case, the argument is

\[
x = \frac{T_{clk} - \mu_Y}{\sqrt{\sigma_Y^2}} = \frac{5 - 3.05}{\sqrt{1.59}} = 1.55
\]

and

\[
\Phi(1.55) = 0.939
\]

So the probability of satisfying the $T_{clk}$ constraint is now 93.9% instead of 99%; that is, the path fails the constraint about 6.1% of the time.

Following the same procedure, we find that the probability of passing the hold time constraint is 97%.
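The variance of the path delay is just the sum of all entries of the covariance matrix, since the weight vector is all ones. The sketch below builds the matrix explicitly for both the setup path (two flip-flop terms plus six gates) and the hold path (two flip-flop terms plus four gates) and evaluates the pass probabilities. The helper `path_variance` and its argument layout are assumptions for illustration.

```python
import math


def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def path_variance(flop_vars, gate_var, n_gates, rho):
    """a^T Sigma a with a = all ones: flip-flop terms are independent,
    gate delays share a pairwise covariance of rho * gate_var."""
    n_flop = len(flop_vars)
    size = n_flop + n_gates
    sigma = [[0.0] * size for _ in range(size)]
    for i, var in enumerate(flop_vars):
        sigma[i][i] = var
    for i in range(n_flop, size):
        for j in range(n_flop, size):
            sigma[i][j] = gate_var if i == j else rho * gate_var
    return sum(sum(row) for row in sigma)


# Setup path: clk-q, setup, and six gates at their maximum delay.
var_setup = path_variance([0.05, 0.04], gate_var=0.1, n_gates=6, rho=0.3)
mean_setup = 0.2 + 0.15 + 6 * 0.45
p_setup = phi((5.0 - mean_setup) / math.sqrt(var_setup))

# Hold path: clk-q, hold, and four gates at their minimum delay.
var_hold = path_variance([0.05, 0.05], gate_var=0.07, n_gates=4, rho=0.3)
mean_hold = 0.2 - 0.3 + 4 * 0.4
p_hold = phi(mean_hold / math.sqrt(var_hold))

print(f"setup: variance {var_setup:.2f}, pass probability {p_setup:.3f}")
print(f"hold:  variance {var_hold:.3f}, pass probability {p_hold:.3f}")
```

Positive correlation inflates the variance of the sum because the off-diagonal terms add rather than average out, which is why both pass probabilities fall below the 99% target from Problem 3.2.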

3.4

If a chip has two copies of the critical path from Problem 3.2 where the two copies are independent, what is the probability that both of the paths meet the $T_{clk}$ constraint? What if the path has 100 copies? How about if it has 1000? What does this imply about the design of the critical paths of the chip?

Solution:

Since the paths are independent and we know from Problem 3.2 that the yield of each individual path is 0.99, the probability of two paths working is:

\[
\Pr[\text{two paths pass}] = \Pr[\text{path 1 passes}] \cdot \Pr[\text{path 2 passes}] = 0.99^2 = 0.98
\]

Similarly,

\[
\begin{aligned}
\Pr[\text{100 paths pass}] &= \Pr[\text{path 1 passes}]^{100} = 0.99^{100} = 0.37 \\
\Pr[\text{1000 paths pass}] &= \Pr[\text{path 1 passes}]^{1000} = 0.99^{1000} = 4.3 \times 10^{-5}
\end{aligned}
\]

This means that if there are many critical paths, then they must be designed to have significant margin for the timing constraints.
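The yield numbers above follow from raising the single-path yield to the number of independent paths. A short sketch, with a hypothetical helper name, makes the trend easy to reproduce for other path counts:

```python
def chip_yield(path_yield, n_paths):
    """Probability that n_paths independent, identical paths all pass."""
    return path_yield ** n_paths


for n_paths in (1, 2, 100, 1000):
    print(f"{n_paths:5d} paths: yield = {chip_yield(0.99, n_paths):.3g}")
```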

3.5

Repeat Problem 3.4 for the case of two paths if the paths have a correlation coefficient of ρ = 0.5. Hint: there is an exact expression for the PDF for this problem, which you can numerically integrate with a tool such as Matlab to get the CDF.

Solution:

This problem boils down to finding the distribution of the maximum of two correlated Gaussian random variables. The paper "Exact Distribution of the Max/Min of Two Gaussian Random Variables" by Saralees Nadarajah and Samuel Kotz contains an exact expression for this distribution, which can be numerically integrated in Matlab to get the CDF. For two correlated, but identically distributed Gaussians, the expression becomes:

\[
f(x) = \frac{2}{\sigma}\,\phi\!\left(\frac{-x + \mu}{\sigma}\right)\Phi\!\left(\frac{(\rho - 1)(-x + \mu)}{\sigma\sqrt{1 - \rho^2}}\right)
\]


where $\phi(\cdot)$ is the standard normal PDF and $\Phi(\cdot)$ is the standard normal CDF. Evaluating the resulting CDF at $x = 5$ (since we are concerned about the $T_{clk}$ constraint, and $T_{clk} = 5$ ns), we get:

\[
\Pr[\text{two paths pass}] = 0.9825
\]

which is slightly larger than with two uncorrelated paths, as expected.
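The numerical integration can be sketched without Matlab. The code below integrates the PDF above with composite Simpson's rule up to x = 5, assuming each path has the Problem 3.2 distribution with six gates (mean 3.05 ns, variance 0.69 ns²). That choice of µ and σ, the integration limits, and the function names are assumptions for this example, so the result should only be expected to land close to the quoted 0.9825.

```python
import math


def pdf_std(x):
    """Standard normal PDF."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf_std(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def max_pdf(x, mu, sigma, rho):
    """PDF of max(X1, X2) for identically distributed Gaussians with correlation rho."""
    u = (mu - x) / sigma
    return (2.0 / sigma) * pdf_std(u) * cdf_std((rho - 1.0) * u / math.sqrt(1.0 - rho * rho))


def prob_both_pass(t_clk, mu, sigma, rho, steps=4000):
    """Composite Simpson integration of max_pdf from mu - 10*sigma up to t_clk."""
    lo = mu - 10.0 * sigma
    h = (t_clk - lo) / steps          # steps must be even for Simpson's rule
    total = max_pdf(lo, mu, sigma, rho) + max_pdf(t_clk, mu, sigma, rho)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * max_pdf(lo + k * h, mu, sigma, rho)
    return total * h / 3.0


MU = 0.35 + 6 * 0.45                 # 3.05 ns
SIGMA = math.sqrt(0.09 + 6 * 0.1)    # sqrt(0.69) ns

p_uncorrelated = prob_both_pass(5.0, MU, SIGMA, rho=0.0)
p_correlated = prob_both_pass(5.0, MU, SIGMA, rho=0.5)
print(f"rho = 0.0: {p_uncorrelated:.4f}")
print(f"rho = 0.5: {p_correlated:.4f}")
```

Setting ρ = 0 is a useful sanity check: the PDF then reduces to the derivative of the squared single-path CDF, so the integral should match the independent-path product from Problem 3.4.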
