Low power AES implementations for RFID
Dina Kamel, Francesco Regazzoni, Cédric Hocquet, David Bol, Denis Flandre and François-Xavier Standaert
Outline
• Overview
– RFIDs
– Why AES?
– RFID power budget
• Design of S-box
– Technology selection
– Supply voltage
– Logic style
• Subthreshold AES core
BCRYPT 2010
RFID
[Figure: block diagram of an RFID passive tag. Blocks: Analog Front End, Non Volatile Memory (NVM), Logic / Base Band (BB), Rectifier, Regulator, Clock regenerator/Divider, Mod/DeMod]
General constraints:
1- Power – few µW
2- Area – few K gates
3- Latency – ms
Technology road map for memories
• Foundries used to provide NVM down to 0.18 µm
• IP vendors provide NVM down to 45 nm, targeting several foundries
Why AES
• Nowadays RFIDs are at 180 nm and 130 nm, mainly because of memory issues
• The technology trend is pushing for smaller technologies (also for memories)
• Smaller technologies allow complex algorithms / enhanced functionality to be implemented
• 3-D stacking enables mixed technologies, e.g. 65 nm logic + 130 nm NVM
• AES is the standard
Move to 65nm to overcome area problems…
• 65 nm will allow a compact AES implementation
• Widespread use of the Low-Power technology flavor
• Low fabrication costs for high-volume production
…low power is still an issue
• Passive RFIDs are battery-less devices
• Power constraints are still present at 65 nm (leakage)
• In advanced technologies, such as 65 nm and below, two flavors are developed:
– General purpose (GP)
– Low power (LP)
Power budget
[Figure: block diagram of an RFID passive tag (Analog Front End, Non Volatile Memory (NVM), Logic / Base Band (BB), Rectifier, Regulator, Clock regenerator/Divider, Mod/DeMod), with the Logic / Base Band (BB) block expanded into digital circuitry and a cryptographic function (AES)]
Power budget for:
– HF (13.56 MHz): 22.5 µW
– UHF (900 MHz): 4 µW
[A.S.W. Man, RFID Eurasia’07]:
– Power: 4.7 µW
– Tech.: TSMC 0.18 µm
– VDD: 1.8 V
– Simulation results using Power Compiler
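As a quick sanity check on the figures above, the following Python sketch compares the 4.7 µW reference AES power against the two quoted budgets. The names are illustrative, and treating each whole budget as available to the AES core alone is a simplifying assumption made only for this example.

```python
# Figures quoted on the slide, in microwatts.
BUDGET_UW = {"HF": 22.5, "UHF": 4.0}   # HF = 13.56 MHz, UHF = 900 MHz
AES_REF_UW = 4.7                        # [A.S.W. Man, RFID Eurasia'07]

def budget_share(aes_uw, budget_uw):
    """Fraction of a power budget that the AES core alone would consume."""
    return aes_uw / budget_uw

# Assumption for illustration: the whole budget is compared against AES only.
shares = {band: budget_share(AES_REF_UW, b) for band, b in BUDGET_UW.items()}

for band, share in shares.items():
    verdict = "fits" if share <= 1.0 else "exceeds the budget"
    print(f"{band}: AES would use {share:.0%} of the budget ({verdict})")
```

Under this reading the reference design fits the HF budget but already exceeds the UHF one on its own, which is the motivation for the lower-power designs that follow.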
How the power is distributed in an 8-bit architecture AES
[Pie chart. Categories: Memory, Clock Generation, Control, S-box, MixColumns, KeyExpansion, Remainder of datapath. Slice values: 34%, 15%, 5%, 24%, 8%, 7%, 7%]
[T. Good, TVLSI’09]
S-box Design
• The optimized S-box given by [N. Mentens,05]
– It uses the composite field GF(((2^2)^2)^2)
• Power and delay aspects in light of different parameters:
– Technology selection
– Supply voltage
– Logic style
[D. Kamel, ISCAS’09]
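For reference, the function that any AES S-box circuit must compute can be written in a few lines. The sketch below is a plain functional model: it inverts directly in GF(2^8) and then applies the AES affine transformation. It does not implement the composite-field GF(((2^2)^2)^2) decomposition of [N. Mentens,05], which computes the same mapping with less hardware; the helper names are illustrative.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:      # reduce when the degree reaches 8
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse computed as a^254; 0 maps to 0, as AES requires."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def rotl8(x, n):
    """Rotate an 8-bit value left by n positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox(x):
    """AES S-box: field inversion followed by the affine transformation."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

print(hex(sbox(0x00)), hex(sbox(0x53)))
```

A model like this is useful mainly as a golden reference when checking a gate-level S-box against all 256 inputs.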
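S-box design: Technology selection
• 0.13 µm: main properties of Standard VT (SVT) and High VT (HVT) NMOS transistors.

| Tech. flavor | Device type | VDD (V) | Tox (nm) | Vt (mV) | Ion (µA/µm) | Ioff (nA/µm) | Ig (pA/µm) |
|--------------|-------------|---------|----------|---------|-------------|--------------|------------|
| GP           | SVT         | 1.2     | 2        | 247     | 670         | 46           | 9          |
| GP           | HVT         | 1.2     | 2        | 336     | 537         | 2            | 12         |

HVT vs. SVT: Vt + 90 mV, Ioff 23× lower
[D. Kamel, ISCAS’09]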
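S-box design: Technology selection
• 65 nm: main properties of Low VT (LVT), Standard VT (SVT) and High VT (HVT) NMOS transistors in GP and LP technology flavors.

| Tech. flavor | Device type | VDD (V) | Tox (nm) | Lpoly (nm) | Vt (mV) | Ion (µA/µm) | Ioff (nA/µm) | Ig (nA/µm) |
|--------------|-------------|---------|----------|------------|---------|-------------|--------------|------------|
| GP           | SVT         | 1       | 1.3      | 45         | 475     | 896         | 62           | 8.97       |
| GP           | HVT         | 1       | 1.3      | 45         | 555     | 740         | 4.7          | 6.18       |
| LP           | LVT         | 1.2     | 1.85     | 57         | 507     | 855         | 4.2          | 0.0114     |
| LP           | SVT         | 1.2     | 1.85     | 57         | 645     | 702         | 0.52         | 0.008      |
| LP           | HVT         | 1.2     | 1.85     | 57         | 721     | 501         | 0.036        | 0.0054     |

Annotation on the table: 3 orders of magnitude (Ig, GP vs. LP)
[D. Kamel, ISCAS’09]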
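The "3 orders of magnitude" remark can be checked against the 65 nm table values. The short Python sketch below computes the GP-to-LP ratio for the SVT devices; attaching the remark to the gate leakage Ig rather than to Ioff is an interpretation based on the numbers, and the dictionary layout is illustrative.

```python
import math

# 65 nm NMOS leakage values from the table, in nA/um.
DEVICES = {
    ("GP", "SVT"): {"ioff": 62.0, "ig": 8.97},
    ("GP", "HVT"): {"ioff": 4.7, "ig": 6.18},
    ("LP", "LVT"): {"ioff": 4.2, "ig": 0.0114},
    ("LP", "SVT"): {"ioff": 0.52, "ig": 0.008},
    ("LP", "HVT"): {"ioff": 0.036, "ig": 0.0054},
}

def orders_of_magnitude(high, low):
    """Base-10 logarithm of the ratio between two positive values."""
    return math.log10(high / low)

gp, lp = DEVICES[("GP", "SVT")], DEVICES[("LP", "SVT")]
ig_orders = orders_of_magnitude(gp["ig"], lp["ig"])
ioff_orders = orders_of_magnitude(gp["ioff"], lp["ioff"])

print(f"Ig:   GP/LP = {gp['ig'] / lp['ig']:.0f}x ({ig_orders:.2f} orders)")
print(f"Ioff: GP/LP = {gp['ioff'] / lp['ioff']:.0f}x ({ioff_orders:.2f} orders)")
```

For the SVT devices the gate leakage drops by roughly three decades from GP to LP, while the subthreshold leakage drops by roughly two.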
Simulation results: Power consumption at 100 kHz
[Bar chart: Power (W) and Ioff, Igate (nA/µm), on log scales from 10^-10 to 10^-4, for 0.13 HVT, 0.13 SVT, 65 GP SVT, 65 GP HVT, 65 LP LVT, 65 LP SVT and 65 LP HVT]
Annotations:
• 3.71 µW (0.13 µm)
• 90.6 nW (65 nm LP)
– 10 times less than 870 nW (8.7 µW/MHz at 100 kHz) [P. Hamalainen, DSD’06]
– 7 times less than 630 nW reported by [Feldhofer,05] using 0.35 µm, 1.5 V
– 1.8 times less than 166 nW reported by [T. Good, TVLSI’09] using 0.13 µm, 0.75 V
[D. Kamel, ISCAS’09]
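The comparison factors quoted on this slide follow directly from the power figures. A minimal Python check, with illustrative key names, is:

```python
SBOX_65NM_LP_NW = 90.6  # lowest 65 nm LP figure quoted on the slide

# Reference points quoted on the slide, converted to nW at 100 kHz.
REFERENCES_NW = {
    "hamalainen06": 8.7 * 0.1 * 1000,  # 8.7 uW/MHz at 0.1 MHz
    "feldhofer05": 630.0,              # 0.35 um, 1.5 V
    "good09": 166.0,                   # 0.13 um, 0.75 V
    "kamel09_130nm": 3710.0,           # 3.71 uW in 0.13 um
}

ratios = {name: p / SBOX_65NM_LP_NW for name, p in REFERENCES_NW.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x higher than 90.6 nW")
```

The results (about 9.6, 7.0, 1.8 and 41) match the rounded "10 times", "7 times", "1.8 times" and factor-of-40 claims.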
Simulation results: Delay
[Scatter plot: Power (W, log scale from 10^-8 to 10^-5) versus Delay (ns, 1 to 4) for 130 nm (SVT, HVT), 65 nm GP (SVT, HVT) and 65 nm LP (LVT, SVT, HVT). Annotations: 2.2 ns, 2.35 ns, power reduced by a factor of 40]
[D. Kamel, ISCAS’09]
S-box design: Supply voltage
• Simulations are done using 65 nm LP SVT devices at 100 kHz and at nominal conditions.
[Figure, two panels: Delay (ns, 0 to 15) versus VDD (V, 0.7 to 1.3), and Power (nW, 0 to 150) versus VDD (V, 0.7 to 1.3)]
Annotations:
• Reference point: 166 nW [T. Good, TVLSI’09] at 0.75 V
• 5 times less than 166 nW reported by [T. Good, TVLSI’09] using 0.13 µm, 0.75 V
• Delay: fine for 100 kHz (large margin)
• Promising, but robustness?
[D. Kamel, ISCAS’09]
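To see why lowering VDD is attractive and why the delay penalty is tolerable at 100 kHz, a first-order estimate helps. The Python sketch below assumes that dynamic power scales as C·VDD²·f with fixed capacitance and frequency, which ignores leakage and short-circuit current, so it is only an approximation of the simulated curves. The 15 ns delay is taken from the top of the plotted delay axis as a pessimistic stand-in, not a value read off the curve.

```python
def dynamic_power_scale(vdd_new, vdd_ref):
    """Relative dynamic power under P ~ C * VDD^2 * f, with C and f fixed."""
    return (vdd_new / vdd_ref) ** 2

# Scaling the supply from the nominal 1.2 V down to 0.8 V.
scale = dynamic_power_scale(0.8, 1.2)
reduction_pct = (1.0 - scale) * 100.0

# Timing margin at the 100 kHz target clock.
CLOCK_HZ = 100e3
period_ns = 1e9 / CLOCK_HZ
assumed_delay_ns = 15.0  # assumption: upper end of the plotted delay axis
margin = period_ns / assumed_delay_ns

print(f"Dynamic power at 0.8 V: {scale:.0%} of nominal ({reduction_pct:.0f}% lower)")
print(f"Clock period {period_ns:.0f} ns vs. {assumed_delay_ns:.0f} ns delay: {margin:.0f}x margin")
```

The quadratic model alone predicts a reduction of about 56%, in the same range as the roughly 60% reported from simulation, and the clock period exceeds the S-box delay by more than two orders of magnitude.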
S-box Design: compare different logic families
• Standard logic: Static CMOS (S-CMOS)
• Dynamic differential logic: Dynamic Differential Swing Limited Logic (DDSLL)
– Protected logic
• Why?
– Security: more resilient against power analysis attacks
Static CMOS versus Dynamic Differential Swing Limited logic
[Figure: transistor-level schematics of a static CMOS (SC) XOR gate and a DDSLL XOR gate. Labeled parts of the DDSLL gate: P part, NMOS tree, feedback, completion signal, current source. Labeled signals: A, B, Clki, Clki+1, OUT, S, ENO, Tree]

XOR truth table:
A B | OUT
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0

[I. Hassoune, the VLSI Journal’07]
DDSLL – How does it work ?
[Figure: DDSLL gate schematic (signals Clki, Clki+1, OUT, S, ENO, Tree) with simulated waveforms over the pre-charge and evaluation phases, shown at three time scales. Traces: V(Clki), V(Tree), Vk0 and Vk0b, V(S), V(ENO), I(Tree) and V(Clki+1) versus time]
[I. Hassoune, the VLSI Journal’07]
DDSLL is too complex !
| # transistors for 1 XOR | SC | DDSLL |
|-------------------------|----|-------|
| P transistors           | 4  | 9     |
| N transistors           | 4  | 12    |
| Total transistors       | 8  | 21    |
DDSLL – Sharing principle
[Figure: DDSLL gate with n NMOS trees (NMOS Tree 1 … NMOS Tree n) and outputs OUT1 … OUTn. Labeled signals: Clki, Clki+1, S, ENO, Tree]
[I. Hassoune, the VLSI Journal’07]
DDSLL – Sharing principle
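• The whole DDSLL AES S-box consists of 13 stages
• The total number of transistors in the DDSLL S-box is 1.2 times smaller than in the S-CMOS S-box

| # transistors for S-box | S-CMOS | DDSLL |
|-------------------------|--------|-------|
| Total transistors       | 1530   | 1275  |

[Block diagram of the S-box: transformation GF(2^8) -> GF(((2^2)^2)^2), inversion blocks, transformation GF(((2^2)^2)^2) -> GF(2^8) + affine transformation. Stages per block: 1 1 1 3 1 1 3 1 1]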
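The effect of sharing shows up when the single-gate and full S-box transistor counts are put side by side. The Python sketch below uses only the counts quoted on the slides; reading the row of digits under the block diagram as the number of stages per block is an assumption, supported by the fact that they add up to the stated 13 stages.

```python
# Transistor counts quoted on the slides.
XOR_TRANSISTORS = {"S-CMOS": 8, "DDSLL": 21}
SBOX_TRANSISTORS = {"S-CMOS": 1530, "DDSLL": 1275}

# Assumption: the digits under the block diagram are stages per block.
STAGES_PER_BLOCK = [1, 1, 1, 3, 1, 1, 3, 1, 1]

xor_overhead = XOR_TRANSISTORS["DDSLL"] / XOR_TRANSISTORS["S-CMOS"]
sbox_saving = SBOX_TRANSISTORS["S-CMOS"] / SBOX_TRANSISTORS["DDSLL"]
total_stages = sum(STAGES_PER_BLOCK)

print(f"Single XOR: DDSLL needs {xor_overhead:.2f}x the transistors of S-CMOS")
print(f"Full S-box: S-CMOS needs {sbox_saving:.2f}x the transistors of DDSLL")
print(f"Total stages: {total_stages}")
```

A stand-alone DDSLL XOR costs about 2.6 times more transistors than its static CMOS counterpart, yet the complete DDSLL S-box ends up 1.2 times smaller.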
Measurement results of S-CMOS and DDSLL S-boxes
|       | S-CMOS S-box  | DDSLL S-box   |
|-------|---------------|---------------|
| Area  | 46 µm × 24 µm | 46 µm × 25 µm |
| Power | 127 nW        | 83 nW         |
| Delay | 3 – 3.2 ns    | 7.5 – 8.1 ns  |

[Histograms: die count (occurrence, 0 to 4) versus Total Power (nW, 70 to 140) for the S-CMOS and DDSLL S-boxes]
P(S-CMOS) / P(DDSLL) = 1.53, thanks to the lower voltage swing
Full AES core
• Base architecture [Feldhofer,05]:
– AES-128
– 8-bit data path
– S-box in GF(((2^2)^2)^2)
• Design target:
– Sub-threshold 65 nm
– 100 kHz
– Low power
Sub-threshold Design Flow
[Flow diagram: HDL -> Synthesis (Synopsys Design Compiler) -> P&R (Cadence Encounter). Inputs to the flow: library (sub-threshold library) and constraints]
[C. Hocquet, FTFC’09]
Design of 65 nm subthreshold library
• Starting point: 65 nm library with a nominal voltage of 1.2 V
• Keep in the library only gates with a maximum stack of 2 MOSFETs
• Re-characterize the library at 0.4 V (lowest VDD for 100 kHz)
• Final library: 73 cells
[C. Hocquet, FTFC’09]
Results
[Figure: results for three cases: 1.2 V standard library, 1.2 V restricted library, 0.4 V restricted library]
Comparison with state of the art
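| Implementation       | Technology     | Area [GEs] | Max. freq. [MHz] | Power @ 100 kHz [µW] |
|----------------------|----------------|------------|------------------|----------------------|
| Proposed             | 1.2 V 65 nm    | 3500       | 0.1 @ 0.4 V      | 0.12 @ 0.4 V         |
| [Feldhofer,05]       | 1.5 V 0.35 µm  | 3400       | 80               | 4.5                  |
| [Hamalainen, DSD’06] | 1.2 V 0.13 µm  | 3400       | 130              | 3                    |
| [Good,TVLSI’09]      | 1.2 V 0.13 µm  | 5500       | 12.8             | 0.692 @ 0.75 V       |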
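The power column can be turned into relative factors and into energy per clock cycle at 100 kHz. The Python sketch below uses only the table values; the key names are illustrative, and energy per cycle is simply power divided by clock frequency.

```python
CLOCK_HZ = 100e3

# Power at 100 kHz from the comparison table, in microwatts.
POWER_UW = {
    "proposed": 0.12,      # 65 nm, 0.4 V
    "feldhofer05": 4.5,    # 0.35 um
    "hamalainen06": 3.0,   # 0.13 um
    "good09": 0.692,       # 0.13 um, 0.75 V
}

def energy_per_cycle_pj(power_uw, clock_hz=CLOCK_HZ):
    """Energy per clock cycle in picojoules: E = P / f."""
    return power_uw * 1e-6 / clock_hz * 1e12

factors = {name: p / POWER_UW["proposed"] for name, p in POWER_UW.items()}
energies = {name: energy_per_cycle_pj(p) for name, p in POWER_UW.items()}

for name in POWER_UW:
    print(f"{name}: {factors[name]:.1f}x proposed, {energies[name]:.2f} pJ/cycle")
```

At 100 kHz the proposed core draws about 1.2 pJ per cycle, between roughly 6 and 38 times less than the other entries in the table.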
Conclusions
• The S-box consumes the largest percentage of power.
• By choosing the appropriate technology, the S-box power can be reduced by more than 1 order of magnitude, from 3.71 µW (0.13 µm) to 90 nW (65 nm LP), while maintaining the same delay.
• Reducing the VDD of the S-box from 1.2 V to 0.8 V decreases the power by 60%, but increases the delay by a factor of 3.
• The DDSLL logic reduces the power by a factor of 1.5 compared with S-CMOS.
• Subthreshold AES is a good candidate for ultra-low-power RFID applications: 120 nW @ 0.4 V.
Thank you