設計技術から見た 半導体集積回路の省電力技術 -...
TRANSCRIPT
設計技術から見た半導体集積回路の省電力技術
東京大学大規模集積システム設計教育研究センター(VDEC)1
生産技術研究所2
高宮 真
STRJ WS: March5, 2009, 特別講演
2
Outline低電力設計技術の動向(1)低電圧、(2)細粒度制御、(3)3次元
ロジック回路の電源電圧の下限(VDDmin)
細粒度基板バイアス制御による低電力化
3次元SSDのNANDフラッシュ向け昇圧回路に
よる低電力化(SSD: Solid State Drive)
3
電源電圧(VDD)の低減の必要性
■ 90nm→65nm→45nmとVDD=1Vが続いたが、今後は電力と信頼性の観点から、VDDの低減が必須
年2008 2010 2012 2014 2016 2018 2020 2022
00.10.20.30.40.50.60.70.80.9
1
Low-power logic
High-performance logic
302010
0
405060
Design rule
DR
AM
hal
f pitc
h (n
m)
電源
電圧
(V)
ITRS2007
研究のターゲット
Pswitching ∝ ƒVDD2
Pleakage ∝ IleakageVDD
4
エネルギー効率最適は低VDDで実現
Power, Delay積(PD積=エネルギー)はVDD=0.2-0.3Vで最小→速度が問われないアプリでは低VDDがenergy efficient
Nor
mal
ized
del
ay &
pow
er
VDD [V]0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 110-5
10-4
10-3
10-2
10-1
100
00.10.20.30.40.50.60.70.80.91
Delay
Power
PD product
90nm CMOS SPICE
Nor
mal
ized
PD
pro
duct
5
Energy Efficientな超低VDDロジック
■ VDD=230mVまで動作はするが、320mVがエネルギー効率最高H. Kaul, M. Anders, S. Mathew, S. Hsu, A. Agarwal, R. Krishnamurthy, and S. Borkar, "A 320mV 56µW 411GOPS/Watt ultra-low voltage motion estimation accelerator in 65nm CMOS," IEEE ISSCC, pp. 316-317, Feb. 2008.
■超低VDDロジックに関する初めての企業(Intel)からの報告
6
時空間の細粒度電圧制御がトレンド
Number of power control domain on a chip1 10 100
ns
StaticCommon
fCLK, VDD, VTH
SingleDomain
fCLK3, VDD3, VTH3
fCLKn, VDDn, VTHn
Tim
e st
ep o
f pow
er c
ontr
ol
Domain1 2 3
4
n
1
μsConventional LSI
Future LSI
fCLK1, VDD1, VTH1
近年の低電力VLSI回路技術の方向性
7
細粒度制御には3次元技術との連携が必須
Package
InductorsCapacitors
Embedded
Power supply& other wires
Basechip
Stacked memories
Pads &bumps
L & Ccell array
Interposer
Sensor, MEMS,High voltage generation,Analog, RF etc.(3D stacked)
Parallelprocessors with own DC-DCconverters
8
細粒度と3次元に対する我々の取り組み
電源電圧の下限(VDDmin)の追求
チップ内トランジスタばらつき測定[1]
ロジック回路のVDDmin[2-3]
空間的細粒度制御
製造後の細粒度基板バイアス制御による低電力化[4]
メニーコア向けテスト手法[5]
3次元積層を用いたオンチップDC-DCコンバータ[6-7]
時間的細粒度制御
電源電圧、基板バイアスを高速に変化させる加速回路[8-11]
電源ノイズキャンセル回路[12]
3次元集積技術3次元SSDのNANDフラッシュ向け昇圧回路による低電力化[13]
今回発表
今回発表
今回発表
9
Outline低電力設計技術の動向(1)低電圧、(2)細粒度制御、(3)3次元
ロジック回路の電源電圧の下限(VDDmin)
細粒度基板バイアス制御による低電力化
3次元SSDのNANDフラッシュ向け昇圧回路に
よる低電力化(SSD: Solid State Drive)
10
Minimum Operating Voltage (VDDmin)
VDDmin is defined as the supply voltage (VDD) when the RO’s stop oscillation.RO’s are useful VDDmin detectors.
0 20 40 60 80 1000
50
100
150
200
Volta
ge (m
V)
Time (μs)
VDD
Vout
VDDmin
VoutSimulated
11
0 1000 2000 3000 40000.00.20.40.60.81.01.21.41.6
規格
化V T
H
トランジスタ位置 (µm)
1E-3 0.01 0.10
1
10
100
Pow
er (a
.u.)
空間周波数 (1/µm)
VD(1.0V)
… Vs(0V)
Vsel Vunsel
GND
clk
Vbs
Data
…
D Q
VD
VunselVS
Vsel
Vbs
Measuredtransistor
D Q D Q D Q D Q
VD(1.0V)
… Vs(0V)
Vsel Vunsel
GND
clk
Vbs
Data
…
D QD QD Q
VD
VunselVS
Vsel
Vbs
Measuredtransistor
D QD Q D QD Q D QD Q D QD Q
1000 Transistor
Array
4mm
4mm
(400
0 tr
ansi
stor
s)頻度
4000トランジスタ
フーリエ変換
チップ内VTHばらつきはランダム
キー技術(1) 挟ピッチかつ広範囲のTrアレー(2) 非選択Trのリークカット
トランジスタばらつき測定回路
チップ内トランジスタばらつき
90nmCMOS
チップ内VTHばらつきは4mmの範囲内でランダム
→細粒度の基板バイアス制御では補償不可
12
Analysis of Origin of VDDminVDD
1 2 3 10 11V1 V2 V3 V10 V11
V1
V2
V3
V4
V5
V6
V7
V8
V9
V10
V11 VINV
V1~V11
VDD=85mV
1 2 3 4 5 6 7 8 9 10 110
10
20
30
40
50
60
70
80
V INV
of in
vert
er, V
1~V 11
(mV)
Inverter number
85
FailVOUT_LOW_7 > VINV_8
H
L L
H H
Monte CarloSPICE
Each transistor has random VTH.
13
RO Circuits to Enable VDDmin Measurement
Ring Oscillator Output Buffer
VDD2
VSS2
Low swing
0V
1V 1V swingVDD2
VSS2(manually tuned)
VDD
VSS
The low swing output of RO is amplified to 1-V swing by the output buffer.
14
VDD Dependence of Oscillation Frequency variation
11-stagering oscillators
13 dies
Freq
uenc
y Va
riatio
ns(=
σ/ a
vera
ge) (
%)
VDD (V)0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.00
5
10
15
( )
pdpd TH
TH
DDpd
DD TH
pdTH
pd DD TH
pdTH
pd DD TH
dtt V
dVCVt
V Vt
Vt V V
tf Vf t V V
α
Δ = Δ
∝−
Δ α→ ∝ Δ
−
ΔΔ α→ ∝ − ∝ Δ
−実測90nm CMOS
Relative frequency variations increases with reduced VDD.
15
ゲート遅延のVDD依存
■ 低VDD回路の問題: 低速、PVTばらつきに敏感■ 対策: 並列動作、adaptive制御
実測90nm CMOSインバータRO
0.0 0.2 0.4 0.6 0.8 1.010k
100k
1M
10M
100M
1G
10G
Osc
illat
ion
freq
uenc
y (H
z)
VDD (V)
11-stage
1001-stage
VDDmin
13 dies
1mm
400μm
MicrographLayout
11-stage
101-stage
1001-stage
16
Measured Die-to-Die Distribution of VDDmin
0 50 100 150 200 250 300 350 400 4500
5
10
15
0 50 100 150 200 250 300 350 400 4500
5
10
15
0 50 100 150 200 250 300 350 400 4500
5
10
15
11-stage
Num
ber o
f die
s
VDDmin (mV)
101-stage
1k-stage
0 500
5
10
15
100 150 200 250 300 350 400 450
0 500
5
10
15
100 150 200 250 300 350 400 450
0 500
5
10
15
100 150 200 250 300 350 400 450
10k-stage
VDDmin (mV)
100k-stage
1M-stage
(4)
(5)
(6)N
umbe
r of d
ies
Inverter RO’s
(1)
(2)
(3)
17
10 100 1k 10k 100k 1M0
50
100
150
200
250
300
350
400
論理ゲートのVDDminをROで調査
■ 大規模になるほど、より高いVDDminが必要に→大規模ロジックになるほど、低VDD化が困難
リングオシレータ段数
平均
V DD
min
(mV)
実測90nm CMOSインバータRO複数チップ測定
2.2mm
1.3m
m
1M stage inverter RO
18
Analysis of Die-to-Die VDDmin Variations
10 100 1k 10k 100k 1M0
50
100
150
200
250
300
350
400
10 100 1k 10k 100k 1M0
50
100
150
200
250
300
350
400V D
Dm
in(m
V)
Number of stagesThe die-to-die VDDmin variations are not systematic but random.
15 dies
Inverter RO’s
19
Summary of Measured VDDmin
10 100 1k 10k 100k 1M0
50
100
150
200
250
300
350
400
Ave
rage
VD
Dm
in (m
V)
Number of stages
x4 inverter RO
Inverter RO
3NAND RO
2NAND RO
(1) Large # of stages(2) Large # of stacked Tr’s(3) Narrow gate widthincrease VDDmin.
(3 different lots)
20
Comparison of Measured and Calculated VDDmin
Sim075
Sim150
x4 inverter RO
10 100 1k 10k 100k 1M0
50
100
150
200
250
300
350
400
MATLABMeasurement
Ave
rage
VD
Dm
in(m
V)
Number of stages
Sim125
Sim100Inverter RO
x0.7522.525.8Sim075
x1.2537.543.0Sim125x1.545.051.6Sim150
x130.034.4Sim100
pMOSnMOS RemarksσVTH (mV)
Name
x0.7522.525.8Sim075
x1.2537.543.0Sim125x1.545.051.6Sim150
x130.034.4Sim100
pMOSnMOS RemarksσVTH (mV)
Name
Inverter RO’s
21
Reason Why Average VDDmin Increases with # of RO Stages
σnxx 10max log2+≅
Prob
abili
ty d
istr
ibut
ion
func
tion
x
3
2
1
0σ− 2x σ−x x σ+x σ+ 2x σ+ 3x σ+ 4x σ+ 5x σ+ 6x σ+ 7x
n=101
n=0
102
103104
106
109
105
107108
1010
( )nxf ,max
( )xf
maxx at n=106
The largest value distributions fmax(x,n) of n sampleswhich have Gaussian distribution f(x)
22
Comparison of Monte Carlo and Model
Sim075
Sim150
x4 inverter RO
10 100 1k 10k 100k 1M0
100
200
300
400
Ave
rage
VD
Dm
in(m
V)
n: Number of stages
Sim125Sim100Inverter RO
Matlab (Monte Carlo simulation)Measurement
Model calculation
σnxx 10max log2+≅
The equation intuitively explains the reason why the average VDDmin increases with the number of RO stages.
23
Adaptive Body Bias Control to Reduce VDDmin
Body bias control
VDDmin=89 mV (Initial)
VDDmin=87 mV
1 2 3 4 5 6 7 8 9 10 111 2 3 4 5 6 7 8 9 10 11
1 2 3 4 5 6 7 8 9 10 111 2 3 4 5 6 7 8 9 10 11
Simulated
The body bias of pMOS is adaptively controlled to minimize VDDmin and the body bias of nMOS is fixed.
24
Fine-Grain Adaptive Body Bias Control to Reduce VDDmin
VDDmin=85 mV
VDDmin=43 mV
Vb1 Vb2 Vb3 Vb4 Vb5 Vb6
1 2 3 4 5 6 7 8 9 10 11
Vb1 Vb2 Vb3 Vb4 Vb5 Vb6
1 2 3 4 5 6 7 8 9 10 11
Vb1Vb2
Vb3Vb4
Vb5Vb6
Vb7Vb8
Vb9Vb10
Vb11
1 2 3 4 5 6 7 8 9 10 11
Vb1Vb2
Vb3Vb4
Vb5Vb6
Vb7Vb8
Vb9Vb10
Vb11
1 2 3 4 5 6 7 8 9 10 111 2 3 4 5 6 7 8 9 10 11
Simulated
When inverter-by-inverter body bias is applied, VDDmin is drastically reduced to 43mV. But it is impractical.
25
VDDmin Dependence on Body Bias of Both nMOS and pMOS
0.5 0.0 -0.5 -1.0 -1.5 -2.0
050
100150200250300350400
-0.50.00.51.01.52.0
V DD
min (m
V)
pMOS Body Bias (V)
nMOS Body Bias (V)
40.080.0120160200240280320360
04080
Initial
Measured
Common body bias control allows to reduce VDDmin by only 4mV.
26
Outline低電力設計技術の動向(1)低電圧、(2)細粒度制御、(3)3次元
ロジック回路の電源電圧の下限(VDDmin)
細粒度基板バイアス制御による低電力化
3次元SSDのNANDフラッシュ向け昇圧回路に
よる低電力化(SSD: Solid State Drive)
27
細粒度基板バイアス制御による低電力化
■ 基板バイアスのグローバル最適化により電力を19%以上削減■ post-fabrication tuningにより設計ばらつきを補正
VN1 VP1
VN2 VP2
VN3 VP3
VN4 VP4
VN5 VP5
VN6 VP6
VN7 VP7
VN8 VP8
-25%
-20%
-15%
-10%
-5%
0%
0 5 10 15 20 25 30 35 40Iteration count [times]
As-fabricated
>19% power reductionat 20 iterations
N8P8SimulatedAnnealing
1V, 90nm CMOSプロセス
64bit DES CODEC機能
1V, 90nm CMOSプロセス
64bit DES CODEC機能
1つの機能ブロックを8領域に等分割
基板バイアス16変数の最適化
電力
実測
28
Outline低電力設計技術の動向(1)低電圧、(2)細粒度制御、(3)3次元
ロジック回路の電源電圧の下限(VDDmin)
細粒度基板バイアス制御による低電力化
3次元SSDのNANDフラッシュ向け昇圧回路に
よる低電力化(SSD: Solid State Drive)
29
Importance of 20V generator in NAND
Write time~800µs
Read time~50µs
Write time is dominant over read time.Write 8 to 16 chips simultaneously.
20V or higher program voltage for writeEnergy during write should be reduced.
0V0V
Program voltage:20V
0VInjection
Floating gate
High-speed low-power 20V generator is required.
Write operation of NAND flash
30
Conventional SSD with charge pump
DRAM
NANDflash
Each NAND flash has charge pump for 20V.5 to 10% area of NAND flash chip!
NAND controller
Interposer
Chargepump
31
Issues on charge pump
Large capacitance areaEnergy loss
VOUT =20V
ClkClk
VDD
Serial MOS diodes lose energy.Large number of stages for low VDD
Large capacitancefor large current
Bucket brigade
VOUT
VDD
32
Voltage scalability of charge pump
1.8V NAND(simulated)
Ener
gy d
urin
g w
rite
(a.u
.)
*K. Takeuchi, et al., ISSCC 2006Energy by charge pump increases!
3.3V NAND*(Core 2.5V)
Energy increases!
Memorycore
Chargepump
Others
Memorycore
Chargepump
☺∝CBLV2
CBL: Bit-line capacitance
33
Proposed 3D-SSD with boost converter
DRAM
Adaptive controller
NANDflash
Interposer
Spiralinductor
Low-cost High-voltage
MOSNAND controller
Smaller die size
Boost converter (shared)
Chargepump
Chargepump
☺ Realizing low power and low cost
34
Advantages of Boost converters
☺ High conversion ratio, large output current☺ High efficiency☺ Small chip area
Off-chip inductor
TON TOFF
Frequency = 1 / (TON + TOFF)Duty cycle = TON / (TON + TOFF)
Frequency, duty cycle Conversion ratio (VOUT/VDD)Inductance Output current
VOUTVDD
Clk
35
Boost converter & NAND Co-operation High voltage MOS
(0.35mm × 0.50mm)
Inductor(5mm x 5mm)
16Gb NAND flash
Adaptive controller(0.67mm × 0.28mm)
36
Comparison of energy during write
This work1.8V NAND
Conventional1.8V NAND(Simulated)
Chargepump
Memorycore
Chargepump
Memorycore
MemorycoreOthers
Total-68%
Ener
gy d
urin
g w
rite
(a.u
.)
*K. Takeuchi, et al., ISSCC 2006.
Conventional3.3V NAND*(Core 2.5V)
Boostconverter
withadaptivecontrol
37
Summary of key features
--------20V CMOS process
Technology(High voltage MOS)
3.45µs (100%)0.92µs (27%)Rising time (0 15V)
This work(Measured)
Charge Pump(Simulated)
Transient energy(0 15V) 30nJ (12%) 253nJ (100%)
Chip area (HV-MOS) 0.175mm2 (15%) 1.19mm2 (100%)
Chip area(Adaptive controller) 0.188mm2 --------
Technology(Adaptive controller)
1.8V 0.18µm standard CMOS --------
Supply voltage 1.8V 1.8V
--------20V CMOS process
Technology(High voltage MOS)
3.45µs (100%)0.92µs (27%)Rising time (0 15V)
This work(Measured)
Charge Pump(Simulated)
Transient energy(0 15V) 30nJ (12%) 253nJ (100%)
Chip area (HV-MOS) 0.175mm2 (15%) 1.19mm2 (100%)
Chip area(Adaptive controller) 0.188mm2 --------
Technology(Adaptive controller)
1.8V 0.18µm standard CMOS --------
Supply voltage 1.8V 1.8V
38
まとめ低電力設計技術の3つのキーワード
→(1)低電圧、(2)細粒度制御、(3)3次元
チップ内製造ばらつきはランダム(90nm CMOS, 4mm)→細粒度制御では対処不能→一方、設計ばらつきは細粒度制御で対処可能
リングオシレータの段数を11段から1M段にすると、VDDminは90mVから 343mVに増加
→大規模ロジックの低電圧化は困難→革新的な回路技術が必要
3次元による低電力化の例 →SSD
39
参考文献(1)[1] D. Levacq, T. Minakawa, M. Takamiya, and T. Sakurai, "A Wide Range Spatial Frequency
Analysis of Intra-Die Variations with 4-mm 4000 x 1 Transistor Arrays in 90nm CMOS," IEEE Custom Integrated Circuits Conference (CICC), San Jose, USA, pp. 257-560, Sep. 2007.
[2] T. Niiyama, P. Zhe, K. Ishida, M. Murakata, M. Takamiya, and T. Sakurai, "Dependence of Minimum Operating Voltage (VDDmin) on Block Size of 90-nm CMOS Ring Oscillators and Its Implications in Low Power DFM," IEEE International Symposium on Quality Electronic Design (ISQED), San Jose, USA, pp. 133-136, March 2008.
[3] T. Niiyama, P. Zhe, K. Ishida, M. Murakata, M. Takamiya, and T. Sakurai, "Increasing Minimum Operating Voltage (VDDmin) with Number of CMOS Logic Gates and Experimental Verification with up to 1Mega-Stage Ring Oscillators," International Symposium on Low Power Electronics and Design (ISLPED), Bangalore, India, pp. 117-122, Aug. 2008.
[4] Y. Nakamura, D. Levacq, L. Xiao, T. Minakawa, T. Niiyama, M. Takamiya, and T. Sakurai, "1/5 Power Reduction by Global Optimization Based on Fine-Grained Body Biasing," IEEE Custom Integrated Circuits Conference (CICC), San Jose, USA, pp. 547-550, Sep. 2008.
[5] T. Niiyama, K. Ishida, M. Takamiya, and T. Sakurai, "Expected Vectorless Teacher-Student Swap (TSS) Test Method with Dual Power Supply Voltages for 0.3V Homogeneous Multi-core LSI’s," IEEE Custom Integrated Circuits Conference (CICC), San Jose, USA, pp. 137-140, Sep. 2008.
[6] K.Onizuka, H. Kawaguchi, M. Takamiya and T. Sakurai, "Stacked-chip Implementation of On-chip Buck Converter for Power-Aware Distributed Power Supply Systems," IEEE Asian Solid-State Circuits Conference (A-SSCC), Hangzhou, China, pp. 127-130, Nov. 2006.
[7] K. Onizuka, K. Inagaki, H. Kawaguchi, M. Takamiya, and T. Sakurai, "Stacked-Chip Implementation of On-Chip Buck Converter for Distributed Power Supply System in SiPs," IEEE Journal of Solid-State Circuits, Vol. 42, No. 11, pp. 2404 - 2410, Nov. 2007.
40
参考文献(2)[8] K.Onizuka and T. Sakurai, "VDD-Hopping Accelerator for On-Chip Power Supplies Achieving
Nano-Second Order Transient Time," IEEE Asian Solid-State Circuits Conference (A-SSCC), Hsinchu, Taiwan, pp. 145-148, Nov. 2005.
[9] K. Onizuka, H. Kawaguchi, M. Takamiya, and T. Sakurai, "VDD-Hopping Accelerators for On-Chip Power Supply Circuit to Achieve Nanosecond-Order Transient Time," IEEE Journal of Solid-State Circuits, Vol. 41, No. 11, pp. 2382 - 2389, Nov. 2006.
[10] D. Levacq, M. Takamiya and T. Sakurai, "Backgate Bias Accelerator for 10ns-order Sleep-to-Active Modes Transition Time," IEEE Asian Solid-State Circuits Conference (A-SSCC), Jeju, Korea, pp. 296-299, Nov. 2007.
[11] D. Levacq, M. Takamiya, and T. Sakurai, "Backgate Bias Accelerator for sub-100 ns Sleep-to-Active Modes Transition Time," IEEE Journal of Solid-State Circuits, Vol. 43, No. 11, pp. 2390 - 2395, Nov. 2008.
[12] Y. Nakamura, M. Takamiya, and T. Sakurai, "An On-Chip Noise Canceller with High Voltage Supply Lines for Nanosecond-Range Power Supply Noise ," IEEE Symposium on VLSI Circuits, Kyoto, pp. 124-125, June 2007.
[13] K. Ishida, T. Yasufuku, S. Miyamoto, H. Nakai, M. Takamiya, T. Sakurai, and K. Takeuchi, "A 1.8V 30nJ Adaptive Program-Voltage (20V) Generator for 3D-Integrated NAND Flash SSD," IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, USA, pp. 238-239, Feb. 2009.