Post-silicon tuning capabilities of 45nm low-power CMOS digital circuits
Citation for published version (APA): Meijer, M., Liu, B., Veen, van, R., & Pineda de Gyvez, J. (2009). Post-silicon tuning capabilities of 45nm low-power CMOS digital circuits. In Proceedings of 2009 Symposium on VLSI Circuits, 16-18 June 2009, Honolulu, Hawaii (pp. 110-111). Piscataway: Institute of Electrical and Electronics Engineers.
Document status and date: Published: 01/01/2009
Document Version: Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)
110 978-4-86348-010-0 2009 Symposium on VLSI Circuits Digest of Technical Papers
11-2
Post-Silicon Tuning Capabilities of 45nm Low-Power CMOS Digital Circuits
Maurice Meijer¹, Bo Liu², Rutger van Veen¹ and Jose Pineda de Gyvez¹,²
¹NXP Semiconductors, Eindhoven, The Netherlands
²Technical University of Eindhoven, Eindhoven, The Netherlands
Abstract
Adaptive circuit techniques enable circuits to be tuned for power-performance efficient operation. Yet it is unclear whether such techniques remain effective in modern deep-submicron CMOS. In this paper we examine the technological boundaries of supply voltage scaling and body biasing in 45nm low-power CMOS. We demonstrate that there exists an effective tuning range for power-performance and performance variability control. Our analysis is supported by ring oscillator test-chip measurements.
Introduction
Modern integrated circuits have been equipped with supply voltage scaling (VS) and body bias (BB) tuning to improve power-performance efficiency [1-3]. Although the benefits of such design techniques are well established, their effectiveness depends strongly on the process technology. In this paper, we explore the technological boundaries of VS and BB for a state-of-the-art 45nm low-power (LP) CMOS process. In particular, we investigate power-performance trade-offs, leakage power savings, and the extent to which process-dependent performance spread can be compensated. Moreover, we investigate whether standard-Vth (SVT) with BB tuning can eliminate the need for low-Vth (LVT) and high-Vth (HVT) masks.
Test-Chip Design and Tuning Ranges
A test-chip with a size of 1.25 x 0.44 mm² has been implemented in 45nm LP-CMOS (Fig.1). The test-chip contains three cores with different threshold voltage options, namely SVT, LVT, and HVT. Each core contains two copies of 10 inverter-based ring oscillators (ringos) with different chain lengths. Layout-identical ringo instances are placed at a distance of 125µm. Each core has independent power supply voltage (VDD) pads for current measurements. The ground pads and the independent body bias voltage (VBB) pads for PMOS (Vnwell) and NMOS (Vpwell) devices are shared by all cores. Frequency, power, and leakage have been measured for 57 dies on a 300mm wafer. The measurements were performed for VDD from 0.6V to 1.3V and for temperatures between -40°C and 125°C. The nominal VDD is 1.1V. Moreover, body bias was applied ranging from 1.1V reverse body bias (RBB) up to 0.5V forward body bias (FBB). In this paper we use a symmetrical body bias, i.e. VBB=Vpwell=VDD-Vnwell, and 25°C, unless stated otherwise.
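The symmetrical body-bias relation VBB=Vpwell=VDD-Vnwell can be turned into well voltages with a few lines of Python. This is a minimal sketch: the function name `well_voltages` is hypothetical, and the sign convention (positive VBB for FBB, negative VBB for RBB) is an assumption taken from the body-bias axis of the figures, not something the text states explicitly.

```python
def well_voltages(vdd, vbb):
    """Return (v_pwell, v_nwell) in volts for a symmetrical body bias.

    Assumed sign convention: vbb > 0 is forward body bias (FBB),
    vbb < 0 is reverse body bias (RBB), vbb == 0 is the nominal case.
    """
    v_pwell = vbb        # NMOS body: VBB = Vpwell
    v_nwell = vdd - vbb  # PMOS body: VBB = VDD - Vnwell
    return v_pwell, v_nwell


if __name__ == "__main__":
    # End points of the body-bias range used in the paper, at nominal VDD.
    for label, vbb in [("1.1V RBB", -1.1), ("nominal", 0.0), ("0.5V FBB", 0.5)]:
        v_pwell, v_nwell = well_voltages(1.1, vbb)
        print(f"{label:>9}: Vpwell = {v_pwell:+.2f} V, Vnwell = {v_nwell:.2f} V")
```

Under this convention, 0.5V FBB at VDD=1.1V lowers the n-well to about 0.6V and raises the p-well to 0.5V, while 1.1V RBB drives the n-well above the supply and the p-well below ground.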
Power-Performance Tuning and Leakage Control
Fig.2 shows the frequency distributions versus VDD for 57 dies of an LVT, SVT, and HVT 101-stage ringo, respectively. The symbols indicate the frequency of the median sample. We measured a frequency downscaling of 7.8x (LVT), 13.4x (SVT), and 33.3x (HVT) when VDD is reduced from 1.1V to 0.6V. Energy and frequency trade-offs for the SVT chip sample are illustrated in Fig.3. Each cloud relates to a unique VDD value, and each dot in a cloud corresponds to a unique VBB. We measured a 44x power reduction and 3.3x energy savings using VS from 1.1V to 0.6V. The use of BB at VDD=1.1V provided a large frequency tuning range from -19% (1.1V RBB) to +27% (0.5V FBB) w.r.t. the nominal operating point. The corresponding frequency tuning ranges are -11% (-31%) up to +17% (+41%) for an LVT (HVT) median die sample. The impact of body biasing on energy is small, even for a low circuit activity of 0.3%. For SVT, we obtained 15% energy savings at the same performance w.r.t. the nominal operating point by using combined VS and BB (VDD=1.0V, 0.4V FBB).
BB tuning was proposed to eliminate the use of LVT and HVT in 45nm CMOS [3]. At 1.1V VDD our experiments confirm that SVT with 0.5V FBB can achieve LVT performance (Fig.4). However, this gives a 3.6x higher leakage than LVT for the median samples. VS is not preferred for achieving LVT performance due to the associated power penalty. Furthermore, we observed that SVT with RBB alone cannot even achieve nominal HVT leakage (Fig.4). This is due to the small body factor (γ) available and the presence of gate-induced drain leakage at large RBB values. Alternatively, VS alone or combined with RBB enables SVT circuits to effectively achieve HVT leakage (Fig.5).
Fig.5 shows the SVT leakage current distributions versus VBB for two VDD values at 25°C. The symbols indicate the results for the median sample. At 1.1V VDD we measured 1.5x-2.8x leakage savings using optimal RBB settings. Reducing VDD from 1.1V down to 0.6V is more effective (4.0x-4.5x). Combined VS+RBB provided 10x-22x leakage savings. The actual leakage savings are strongly temperature dependent (Fig.6). The dominant leakage components determine whether VS or RBB is more effective. VS+RBB showed maximum leakage savings around 75°C for the SVT median sample. The location of this maximum depends on the process skew and the Vth option used.
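The quoted power, frequency, and energy figures for SVT voltage scaling can be cross-checked with simple arithmetic. The sketch below assumes that the reported energy is energy per oscillation cycle, E = P / f; that definition is an assumption, since the text does not spell it out, and the function name `energy_savings` is hypothetical.

```python
def energy_savings(power_reduction, frequency_reduction):
    """Energy-per-cycle savings implied by a power and a frequency reduction.

    Assumes E = P / f, so that
    E_nominal / E_scaled = (P_nominal / P_scaled) / (f_nominal / f_scaled).
    """
    if frequency_reduction <= 0:
        raise ValueError("frequency_reduction must be positive")
    return power_reduction / frequency_reduction


if __name__ == "__main__":
    # SVT values quoted in the text for VDD scaled from 1.1V to 0.6V.
    ratio = energy_savings(power_reduction=44.0, frequency_reduction=13.4)
    print(f"Implied energy savings: {ratio:.1f}x")
```

Under that assumption, a 44x power reduction combined with a 13.4x frequency reduction implies roughly 3.3x lower energy per cycle, in line with the measured energy savings.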
Performance-Spread Compensation
The impact of systematic and random process variability on ringo timing has been determined by means of a clock-period correlation plot (Fig.7). The clock periods of two closely located layout-identical ringos have been correlated for each die sample. The statistical delay spread of an 11-, 21-, 31-, 41- and 101-stage ringo has been calculated for three BB values at 1.1V VDD (Fig.8). In Fig.8, the solid and open symbols relate to the 3σ systematic and random delay spread, respectively. The trend lines are extrapolated from the 101-stage ringo and closely match the results from the other ringos. Observe that the delay spread reduces consistently when FBB is applied, while it increases for RBB. Fig.9 shows a more detailed analysis for the 21-stage ringo. The total delay spread is about 2x lower for 0.5V FBB w.r.t. the nominal BB case. Conversely, the spread is about 2x higher for 1.1V RBB. Observe that FBB can significantly reduce both systematic and random delay spread in 45nm LP-CMOS.
Fig.10 puts in perspective the clock period mean and 3σ-spread versus VBB for the 21-stage ringo. A ±11% spread was observed at the nominal operating point. This spread could be fully compensated through BB tuning using up to 0.2V FBB for slow die samples, and up to 0.7V RBB for fast die samples. Frequency and leakage were measured for all available samples (Fig.11). For each die sample, it was possible to tune its frequency to the nominal target spec through BB tuning. We required a BB range from 0.1V RBB up to 0.2V FBB. This effectively raises the parametric yield to 100% for our sample set. We measured a 32% frequency increase with 0.5V FBB for the slowest die sample at a 26x leakage penalty. This offers sufficient tuning range for compensating process-dependent performance spread.
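The split into systematic and random spread from pairs of layout-identical ringos can be sketched as follows. The paper does not state its exact estimator, so the model here is an assumption: each measured period is a die-level term shared by both rings plus an independent random term of equal variance, which gives Cov(A, B) for the systematic variance and Var(A - B) / 2 for the random variance. The function names are hypothetical. The 3σ values in `FIG9_PS` are the ones listed for Fig.9, and the script checks that the systematic and random parts add in quadrature to the reported totals.

```python
import math


def decompose_spread(periods_a, periods_b):
    """Return (systematic, random) 3-sigma spreads from paired ring periods.

    Assumed model: period = shared die-level term + independent random term
    with equal variance in both rings, so that
    Cov(A, B) = var_systematic and Var(A - B) = 2 * var_random.
    """
    n = len(periods_a)
    if n != len(periods_b) or n < 2:
        raise ValueError("need two equally long lists with at least two dies")
    mean_a = sum(periods_a) / n
    mean_b = sum(periods_b) / n
    cov_ab = sum((a - mean_a) * (b - mean_b)
                 for a, b in zip(periods_a, periods_b)) / (n - 1)
    diffs = [a - b for a, b in zip(periods_a, periods_b)]
    mean_d = sum(diffs) / n
    var_diff = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    var_sys = max(cov_ab, 0.0)  # clip small negative sample estimates
    var_rand = var_diff / 2.0
    return 3.0 * math.sqrt(var_sys), 3.0 * math.sqrt(var_rand)


def total_spread(systematic, random_part):
    """Combine two independent spreads in quadrature."""
    return math.hypot(systematic, random_part)


# 3-sigma spreads in ps as listed for Fig.9: (systematic, random, total).
FIG9_PS = {
    "1.1V RBB": (67.68, 23.05, 71.50),
    "0V BB": (31.43, 13.88, 34.36),
    "0.5V FBB": (16.52, 8.52, 18.59),
}

if __name__ == "__main__":
    for label, (sys_ps, rnd_ps, reported_ps) in FIG9_PS.items():
        combined = total_spread(sys_ps, rnd_ps)
        print(f"{label:>9}: quadrature sum {combined:.2f} ps, "
              f"reported total {reported_ps:.2f} ps")
```

The quadrature sums agree with the reported totals to within rounding, which supports treating the two components as independent. The same numbers give a total spread ratio of about 1.85 between 0V BB and 0.5V FBB, and about 2.08 between 1.1V RBB and 0V BB, matching the "about 2x" statements.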
Conclusions
Test-chip measurements show that VS and BB remain effective in 45nm LP-CMOS. We demonstrated the presence of large power-performance modification capabilities. For SVT circuits, VS enables 3.3x energy savings when the frequency can be reduced by 13.2x. Combined VS+BB results in 15% energy savings at no frequency penalty, and 10x-22x leakage savings at 25°C. BB tuning offers sufficient range to achieve process-related performance spread compensation. Moreover, FBB can effectively reduce timing uncertainty, e.g. we observe a 2x lower delay spread at 0.5V FBB for a 21-stage SVT ringo. Finally, SVT+FBB can mimic LVT performance, while SVT+RBB cannot achieve HVT leakage.
Authorized licensed use limited to: Eindhoven University of Technology. Downloaded on November 24, 2009 at 10:44 from IEEE Xplore. Restrictions apply.
References
[1] Tschanz et al., VLSI 2002, pp. 310-311.
[2] Nomura et al., ISSCC 2008, pp. 262-263.
[3] Gammie et al., ISSCC 2008, pp. 258-259.

Figure 1. Test-chip photograph and description
Photograph labels: Core A, Core B, Core C; SVT, LVT, HVT; selection circuit. Ring-oscillator schematic labels: enable, ÷1024, out, periphery; pins VDD, VNWELL, VSS, VPWELL, VDDP.
Features: 45nm triple-well LP-CMOS process; 57 samples from the same 300mm wafer; die size of 1.25 x 0.44 mm²; 2x30 fanout-1 inverter-based ring-oscillators
Nand-2 and inverter drawn device sizes: PMOS: W=415nm, L=40nm; NMOS: W=290nm, L=40nm
Nominal VDD: 1.1V
Vth options: SVT, LVT, HVT

Figure 2. Frequency vs. VDD trends
Plot: frequency [Hz] (log scale, 1E+6 to 1E+9) vs. power supply voltage [V] (0.5 to 1.4) for HVT, SVT, and LVT; 101-stage ring-oscillator, 57 die samples, nominal body bias, T=25°C. Annotated frequencies: 401MHz, 310MHz, 212MHz.

Figure 3. Frequency vs. energy trade-offs for a 101-stage SVT ring-oscillator
Plot: frequency [Hz] (0 to 450E+6) vs. energy consumption [J] (0 to 300E-15); T=25°C, α=0.003; one cloud per VDD value (0.6V, 0.7V, 0.8V, 0.9V, 1.0V, 1.1V). Annotations: 0V BB (318MHz, 251fJ), 0.5V FBB, 1.1V RBB.

Figure 4. Frequency vs. leakage for a BB-tuned 101-stage ring-oscillator
Plot: normalized frequency (0.0 to 1.6) vs. normalized leakage (log scale) for HVT, SVT, and LVT; median die sample, VDD=1.1V, T=25°C, VBB: 1.1V RBB up to 0.5V FBB.

Figure 5. SVT leakage vs. VBB for two distinct VDD values at 25°C
Plot: normalized leakage (log scale, 0.01 to 100) vs. body bias voltage [V] (-1.2 to 0.6, RBB negative, FBB positive) for VDD=1.1V and VDD=0.6V; SVT ring-oscillator core, 57 die samples, T=25°C. Annotations: HVT-min, HVT-max, VS, 1.2µA.

Figure 6. VS and BB dependent leakage savings vs. temperature for an SVT ring-oscillator core
Leakage reduction factor vs. temperature:
Temperature [°C]    -40     0      25     50     75     100    125
VS (1.1V-0.6V)      12.0    5.5    4.2    3.5    3.1    2.9    2.6
BB (Vdd=1.1V)       1.1     1.7    2.6    3.5    4.3    4.7    4.7
BB (Vdd=0.6V)       1.6     3.5    5.3    7.1    7.9    8.1    8.0
VS+BB               19.3    19.5   22.2   24.4   24.6   23.1   21.2

Figure 7. Clock period correlation plot of two layout-identical 101-stage SVT ring-oscillators
Plot: period Ring-B [s] vs. period Ring-A [s] (both 3.0E-09 to 3.5E-09); 57 pairs of oscillators, VDD=1.1V, T=25°C. Annotations: random variation, total variation.

Figure 8. Systematic and random delay spread vs. number of ring-oscillator stages
Plot: 3-sigma delay spread [s] (0 to 160E-12) vs. number of stages (0 to 40); SVT ring-oscillators, VDD=1.1V, T=25°C; systematic and random 3σ-spread for 1.1V RBB, 0V BB, and 0.5V FBB.

Figure 9. Estimated delay spread for the available 21-stage SVT ring-oscillators
3-sigma delay spread vs. body bias voltage; VDD=1.1V, T=25°C:
Body bias voltage [V]       -1.1     0        0.5
Total spread [ps]           71.50    34.36    18.59
Systematic spread [ps]      67.68    31.43    16.52
Random spread [ps]          23.05    13.88    8.52

Figure 10. Estimated mean clock period and spread for 21-stage SVT ring-oscillator
Plot: normalized clock period (0.0 to 1.6) vs. body bias voltage [V] (-1.2 to 0.6, RBB negative, FBB positive); VDD=1.1V, T=25°C.

Figure 11. Performance spread compensation
Plots: normalized frequency (0.9 to 1.2) vs. normalized leakage (log scale) with the nominal frequency specification marked; number of samples (0 to 10) vs. body bias voltage [V] (-0.1 to 0.2); 101-stage SVT ring-oscillator, 57 die samples, VDD=1.1V, T=25°C.
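The leakage reduction factors listed for Fig.6 can be explored with a short script. This is a sketch for reading the table, not part of the paper's method: the function names are hypothetical, and the idea that the combined VS+BB factor is approximately the product of the VS factor and the BB factor at VDD=0.6V is an assumption about how the series were defined, which the listed numbers support to within a few percent.

```python
TEMPERATURES_C = [-40, 0, 25, 50, 75, 100, 125]

# Leakage reduction factors as listed for Fig.6.
LEAKAGE_REDUCTION = {
    "VS (1.1V-0.6V)": [12.0, 5.5, 4.2, 3.5, 3.1, 2.9, 2.6],
    "BB (Vdd=1.1V)": [1.1, 1.7, 2.6, 3.5, 4.3, 4.7, 4.7],
    "BB (Vdd=0.6V)": [1.6, 3.5, 5.3, 7.1, 7.9, 8.1, 8.0],
    "VS+BB": [19.3, 19.5, 22.2, 24.4, 24.6, 23.1, 21.2],
}


def peak_temperature(series):
    """Return (temperature, factor) where the given series is largest."""
    factors = LEAKAGE_REDUCTION[series]
    best = max(range(len(factors)), key=factors.__getitem__)
    return TEMPERATURES_C[best], factors[best]


def vs_or_bb(temp_c):
    """Say whether VS alone or BB alone (at VDD=1.1V) saves more leakage."""
    i = TEMPERATURES_C.index(temp_c)
    vs = LEAKAGE_REDUCTION["VS (1.1V-0.6V)"][i]
    bb = LEAKAGE_REDUCTION["BB (Vdd=1.1V)"][i]
    if vs > bb:
        return "VS"
    if bb > vs:
        return "BB"
    return "tie"


def product_estimate(temp_c):
    """Estimate the combined factor as VS factor times BB factor at 0.6V."""
    i = TEMPERATURES_C.index(temp_c)
    return (LEAKAGE_REDUCTION["VS (1.1V-0.6V)"][i]
            * LEAKAGE_REDUCTION["BB (Vdd=0.6V)"][i])


if __name__ == "__main__":
    temp, factor = peak_temperature("VS+BB")
    print(f"VS+BB peaks at {temp} degC with a factor of {factor}")
    for t in TEMPERATURES_C:
        print(f"{t:>4} degC: better single knob = {vs_or_bb(t)}, "
              f"VS x BB(0.6V) = {product_estimate(t):.1f}")
```

With these values, the combined savings peak at 75°C, VS alone is the stronger single knob below 50°C, and BB alone is stronger above it.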