cryptographic aes
TRANSCRIPT
-
7/29/2019 Cryptographic AES
1/51
Towards FPGA Architectures Optimized for Cryptographic Algorithms
-
Table of Contents
  Antecedents
  Motivation
  General and Specific Objectives
  State of the art
  Results
  Publications
  Future Work
  Conclusions
-
Antecedents
Cryptographic algorithms can be implemented through:
  Software
  ASIC
  FPGAs
Choice of platform depends upon:
  Algorithm performance
  Cost
  Flexibility
-
Antecedents (continued)
Software: most flexible, low performance, low cost
ASIC: high performance, no flexibility at all, high cost
FPGAs: most flexible, low cost, high performance
-
Motivation
  FPGAs: Potential Features
  Cryptographic Algorithms: Basic Functions
-
FPGA: Field-Programmable Gate Array
-
Configurable Logic Block
[Figure: in Logic Mode, two 4-input combinational logic blocks, each followed by 1-bit registers; in Memory Mode, two 16x1 RAMs with 4-bit addresses, each followed by 1-bit registers.]
-
Virtex-II Pro
Feature/Product                            XC2VP2  XC2VP4  XC2VP7  XC2VP20  XC2VP30   XC2VP40   XC2VP50   XC2VP70   XC2VP100   XC2VP125
EasyPath cost reduction                    -       -       -       -        XCE2VP30  XCE2VP40  XCE2VP50  XCE2VP70  XCE2VP100  XCE2VP125
Logic Cells                                3,168   6,768   11,088  20,880   30,816    43,632    53,136    74,448    99,216     125,136
Slices                                     1,408   3,008   4,928   9,280    13,696    19,392    23,616    33,088    44,096     55,616
BRAM (Kbits)                               216     504     792     1,584    2,448     3,456     4,176     5,904     7,992      10,008
18x18 Multipliers                          12      28      44      88       136       192       232       328       444        556
Digital Clock Management Blocks            4       4       4       8        8         8         8         8         12         12
Config (Mbits)                             1.31    3.01    4.49    8.21     11.36     15.56     19.02     25.6      33.65      42.78
PowerPC Processors                         0       1       1       2        2         2         2         2         2          4
Max Available Multi-Gigabit Transceivers*  4       4       8       8        8         12*       16*       20        20*        24*
Max Available User I/O*                    204     348     396     564      644       804       852       996       1164       1200
http://www.xilinx.com/products/tables/fpga.htm#v2p
1 Logic Cell = (1) 4-input LUT + (1) FF + (1) Carry Logic
1 CLB = (4) Slices
-
Cryptographic algorithms on FPGAs
Cryptographic algorithms contain:
  Simple logical operations at the bit level
  Replicated blocks
  Long block lengths
They can benefit from FPGAs because:
  FPGAs natively handle bit-level operations
  Blocks can simply be copied
  Parallelism is possible (high number of I/Os)
  More physical security
  Flexibility
  High density
-
Objectives
General
  To achieve optimized implementations for cryptographic algorithms
Specific Objectives
  DES: Data Encryption Standard
  AES: Advanced Encryption Standard
  ECC: Elliptic Curve Cryptography
-
Background
The Advanced Encryption Standard (AES algorithm) is a computer security standard issued by NIST to replace DES; it became effective on May 26, 2002. The scheme is a symmetric block cipher that encrypts and decrypts 128-bit blocks of data.
The standard key lengths used by the AES algorithm are 128, 192, and 256 bits.
-
Comparison
                            MARS  RC6  Rijndael  Serpent  Twofish
General security            3     2    2         3        3
Implementation of security  1     1    3         3        2
Software performance        2     2    3         1        1
Smart card performance      1     1    3         3        2
Hardware performance        1     2    3         3        2
Design features             2     1    2         1        3
-
AES: Advanced Encryption Standard
[Figure: the AES block takes a 128-bit Plain Text and a 128-bit Key and produces a 128-bit Cipher Text.]
AES Processes
  Key Scheduling
  Encryption
  Decryption
-
AES: Advanced Encryption Standard
Input = 128 bits = 16 bytes: b15 b14 b13 ... b1 b0, arranged as a 4x4 matrix:
b15 b14 b13 b12
b11 b10 b9  b8
b7  b6  b5  b4
b3  b2  b1  b0
-
Key Scheduling
Round Key 0, Round Key 1, .., Round Key 10, each a 4x4 byte matrix:
Round Key 0:
k15  k14  k13  k12
k11  k10  k9   k8
k7   k6   k5   k4
k3   k2   k1   k0
Round Key 1:
k31  k30  k29  k28
k27  k26  k25  k24
k23  k22  k21  k20
k19  k18  k17  k16
..
Round Key 10:
k175 k174 k173 k172
k171 k170 k169 k168
k167 k166 k165 k164
k163 k162 k161 k160
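The 176 key bytes k0..k175 on this slide match the size of the AES-128 expanded key (11 round keys of 16 bytes). As a minimal software sketch of that key schedule, assuming the standard FIPS-197 expansion and byte order (the slide's own byte numbering inside each matrix may differ), the following Python derives all round keys from a user key. The function names `gf_mul`, `sbox_byte`, and `expand_key` are illustrative, not taken from the slides.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def sbox_byte(x):
    """S-box value: multiplicative inverse (x^254) followed by the affine map."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, x)          # x = 0 maps to 0, as AES requires
    s = inv
    for k in range(1, 5):             # XOR with the four left rotations of inv
        s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
    return s ^ 0x63


def expand_key(key):
    """Expand a 16-byte user key into 176 bytes (round keys 0..10)."""
    assert len(key) == 16
    sbox = [sbox_byte(x) for x in range(256)]
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = list(words[i - 1])
        if i % 4 == 0:
            t = t[1:] + t[:1]             # rotate the word by one byte
            t = [sbox[b] for b in t]      # substitute each byte
            t[0] ^= rcon                  # add the round constant
            rcon = gf_mul(rcon, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], t)])
    return bytes(b for w in words for b in w)


round_keys = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(round_keys[160:176].hex())  # round key 10
```

The key used here is the example key from FIPS-197 Appendix A.1. In the pipelined hardware design each round key would come from its own KGEN stage rather than a loop.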
-
AES Encryption Algorithm Flow
BS: Byte Substitution
SR: Shift Rows
MC: Mix Column
ARK: Add Round Key
[Flow diagram: IN → ARK (USER KEY) → rounds 1..9: BS → SR → MC → ARK (SUB KEY) → final round: BS → SR → ARK (SUB KEY) → OUT]
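The flow on this slide can be written out as an ordered list of steps. The sketch below is only a schedule generator, assuming AES-128 with 10 rounds; the name `aes128_encrypt_steps` is illustrative, and the transformations themselves are not implemented here.

```python
def aes128_encrypt_steps(rounds=10):
    """Return the ordered (step, round-key index) pairs for encrypting one block."""
    steps = [("ARK", 0)]                       # initial key addition with the user key
    for r in range(1, rounds):                 # rounds 1..9 are full rounds
        steps += [("BS", None), ("SR", None), ("MC", None), ("ARK", r)]
    # the last round omits MixColumn
    steps += [("BS", None), ("SR", None), ("ARK", rounds)]
    return steps


for name, key_index in aes128_encrypt_steps():
    print(name if key_index is None else f"{name} (round key {key_index})")
```

Counting the steps shows why eleven round keys are needed: one ARK before the first round and one at the end of each of the ten rounds.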
-
Byte Substitution
[Figure: each State Matrix byte a_{i,j} (i, j = 0..3) is mapped through the 16x16 S-BOX to the byte b_{i,j}.]
State Matrix (input):
a0,0 a0,1 a0,2 a0,3
a1,0 a1,1 a1,2 a1,3
a2,0 a2,1 a2,2 a2,3
a3,0 a3,1 a3,2 a3,3
After the S-BOX:
b0,0 b0,1 b0,2 b0,3
b1,0 b1,1 b1,2 b1,3
b2,0 b2,1 b2,2 b2,3
b3,0 b3,1 b3,2 b3,3
[Round diagram: BS, SR, MC, ARK (SUB KEY)]
-
ShiftRow (SR)
            Input      After SR
Offset 0    a b c d    a b c d
Offset 1    e f g h    f g h e
Offset 2    i j k l    k l i j
Offset 3    m n o p    p m n o

Inverse ShiftRow (ISR)
            Input      After ISR
Offset 0    a b c d    a b c d
Offset 1    f g h e    e f g h
Offset 2    k l i j    i j k l
Offset 3    p m n o    m n o p
[Round diagram: BS, SR, MC, ARK (SUB KEY)]
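As a small illustration of the row offsets above, the following sketch rotates row r of a 4x4 state left by r positions for SR and right by r positions for ISR. The state is modelled as a list of four rows, which is an assumption made for readability, and the function names are illustrative.

```python
def shift_rows(state):
    """SR: rotate row r of the 4x4 state left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """ISR: rotate row r of the 4x4 state right by r positions."""
    return [row[len(row) - r:] + row[:len(row) - r] for r, row in enumerate(state)]


state = [list("abcd"), list("efgh"), list("ijkl"), list("mnop")]
shifted = shift_rows(state)
print(["".join(row) for row in shifted])
print(["".join(row) for row in inv_shift_rows(shifted)])
```

In hardware this step costs no logic at all, since it is only a rewiring of byte positions.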
-
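MixColumn (MC) & Inv MixColumn (IMC)

MC:
$$\begin{bmatrix} c'_{0,i} \\ c'_{1,i} \\ c'_{2,i} \\ c'_{3,i} \end{bmatrix} = \begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix} \begin{bmatrix} c_{0,i} \\ c_{1,i} \\ c_{2,i} \\ c_{3,i} \end{bmatrix}$$

IMC:
$$\begin{bmatrix} c'_{0,i} \\ c'_{1,i} \\ c'_{2,i} \\ c'_{3,i} \end{bmatrix} = \begin{bmatrix} 0E & 0B & 0D & 09 \\ 09 & 0E & 0B & 0D \\ 0D & 09 & 0E & 0B \\ 0B & 0D & 09 & 0E \end{bmatrix} \begin{bmatrix} c_{0,i} \\ c_{1,i} \\ c_{2,i} \\ c_{3,i} \end{bmatrix}$$

i = 0, 1, 2, 3
[Round diagram: BS, SR, MC, ARK (SUB KEY)]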
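Both transformations are matrix-vector products over GF(2^8), applied to each state column. A minimal sketch, assuming the standard AES reduction polynomial x^8 + x^4 + x^3 + x + 1 and using illustrative names (`xtime`, `gf_mul`, `mat_vec`), is:

```python
def xtime(b):
    """Multiply a byte by 02 in GF(2^8) (shift left, reduce by 0x11B on overflow)."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def gf_mul(byte, coef):
    """Multiply a byte by a coefficient using repeated xtime and XOR."""
    r = 0
    while coef:
        if coef & 1:
            r ^= byte
        byte = xtime(byte)
        coef >>= 1
    return r


MC = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
IMC = [[0xE, 0xB, 0xD, 0x9], [0x9, 0xE, 0xB, 0xD],
       [0xD, 0x9, 0xE, 0xB], [0xB, 0xD, 0x9, 0xE]]


def mat_vec(matrix, column):
    """Apply a 4x4 GF(2^8) matrix to one 4-byte state column."""
    out = []
    for row in matrix:
        acc = 0
        for coef, byte in zip(row, column):
            acc ^= gf_mul(byte, coef)
        out.append(acc)
    return out


col = [0xDB, 0x13, 0x53, 0x45]
mixed = mat_vec(MC, col)
print([hex(b) for b in mixed])
print([hex(b) for b in mat_vec(IMC, mixed)])
```

Applying IMC to the output of MC returns the original column, which is a quick sanity check for any hardware or software implementation of the two matrices.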
-
AddRoundKey (ARK)
[Figure: the state bytes a_{i,j} are combined byte by byte with the round key bytes k_{i,j} to give b_{i,j}, for i, j = 0..3.]
$$b_{i,j} = a_{i,j} \oplus k_{i,j}, \quad i, j = 0, 1, 2, 3$$
[Round diagram: BS, SR, MC, ARK (SUB KEY)]
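AddRoundKey is a bytewise XOR of the state with the round key, so it is its own inverse. A minimal sketch, with the state and key both modelled as 4x4 lists of bytes (an illustrative choice) and arbitrary example values:

```python
def add_round_key(state, round_key):
    """ARK: XOR each state byte with the round-key byte at the same position."""
    return [[a ^ k for a, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


state = [[0x00, 0x11, 0x22, 0x33],
         [0x44, 0x55, 0x66, 0x77],
         [0x88, 0x99, 0xAA, 0xBB],
         [0xCC, 0xDD, 0xEE, 0xFF]]
key = [[0x0F] * 4 for _ in range(4)]   # arbitrary example key bytes

once = add_round_key(state, key)
twice = add_round_key(once, key)       # applying ARK twice restores the state
print(once[0], twice == state)
```

Because the same operation serves encryption and decryption, an encryptor/decryptor core can share one ARK block.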
-
Our Contributions
Design 1: Encryptor Core
Sequential vs. Pipelined Architecture
Design 2: Encryptor/Decryptor Core
MixColumn & Inv. MixColumn modified
Design 3: Encryptor/Decryptor Core
S-Box & Inv. S-Box
-
Design 1: Encryptor Core
Sequential vs. Pipelined Architecture
Our Contributions
-
AES Algorithm Implementation: Sequential Approach
[Block diagram: KGEN (inputs USER-KEY, RCON, select S, CLK) with a LATCH produces the ROUND-KEY; the PLAIN TEXT enters RND 0, passes through a latched RND 1-9 stage (select S, ROUND-KEY, CLK), and RND 10 outputs the CIPHERTEXT.]
-
AES Algorithm Implementation: Pipelined Approach
[Block diagram: IN → INREG → RND 0 → RND 1 → ... → RND 10 → OUT; in parallel, USER-KEY → INREG → eleven KGEN stages producing RK 0 through RK 10, one round key per round stage.]
-
Our Contributions
Design 2: Encryptor/Decryptor Core
MixColumn & Inv. MixColumn Modified
-
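BS and Inverse BS
S-BOX: MI followed by AF
INV S-BOX: IAF followed by MI
[Diagrams: with separate paths, IN feeds MI → AF (S-BOX) and IAF → MI (INV S-BOX); with a shared path, an E/D select routes IN through IAF, MI, and AF so that one MI block serves both.]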
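Reading MI as the multiplicative inverse in GF(2^8), AF as the affine transformation, and IAF as its inverse, the decomposition on this slide can be checked in software. The sketch below assumes the standard AES definitions from FIPS-197; the function names are illustrative and the brute-force inverse is for clarity, not speed.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def rotl8(x, k):
    """Rotate a byte left by k bits."""
    return ((x << k) | (x >> (8 - k))) & 0xFF


def mi(x):
    """MI: multiplicative inverse as x^254, with 0 mapped to 0."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, x)
    return r


def af(x):
    """AF: the AES affine transformation."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


def iaf(x):
    """IAF: the inverse affine transformation."""
    return rotl8(x, 1) ^ rotl8(x, 3) ^ rotl8(x, 6) ^ 0x05


def sbox(x):
    return af(mi(x))        # S-BOX = MI followed by AF


def inv_sbox(x):
    return mi(iaf(x))       # INV S-BOX = IAF followed by MI


print(hex(sbox(0x53)), hex(inv_sbox(sbox(0x53))))
```

Since MI is its own inverse, the same MI block appears in both directions, which is the sharing that the encryptor/decryptor design exploits.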
-
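MixColumn (MC) & Inv MixColumn (IMC) Revisited

MC:
$$\begin{bmatrix} c'_{0,i} \\ c'_{1,i} \\ c'_{2,i} \\ c'_{3,i} \end{bmatrix} = \begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix} \begin{bmatrix} c_{0,i} \\ c_{1,i} \\ c_{2,i} \\ c_{3,i} \end{bmatrix}$$

IMC:
$$\begin{bmatrix} c'_{0,i} \\ c'_{1,i} \\ c'_{2,i} \\ c'_{3,i} \end{bmatrix} = \begin{bmatrix} 0E & 0B & 0D & 09 \\ 09 & 0E & 0B & 0D \\ 0D & 09 & 0E & 0B \\ 0B & 0D & 09 & 0E \end{bmatrix} \begin{bmatrix} c_{0,i} \\ c_{1,i} \\ c_{2,i} \\ c_{3,i} \end{bmatrix}$$

**Every entry is represented in GF(2^8)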
-
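MixColumn (MC) & Inv MixColumn (IMC) Cont.

MC:
$$\begin{bmatrix} c'_{0,i} \\ c'_{1,i} \\ c'_{2,i} \\ c'_{3,i} \end{bmatrix} = \begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix} \begin{bmatrix} c_{0,i} \\ c_{1,i} \\ c_{2,i} \\ c_{3,i} \end{bmatrix}$$

IMC:
$$\begin{bmatrix} c'_{0,i} \\ c'_{1,i} \\ c'_{2,i} \\ c'_{3,i} \end{bmatrix} = \begin{bmatrix} 0E & 0B & 0D & 09 \\ 09 & 0E & 0B & 0D \\ 0D & 09 & 0E & 0B \\ 0B & 0D & 09 & 0E \end{bmatrix} \begin{bmatrix} c_{0,i} \\ c_{1,i} \\ c_{2,i} \\ c_{3,i} \end{bmatrix}$$

Where
$$02 \cdot x = \mathrm{xtime}(x), \qquad 03 \cdot x = \mathrm{xtime}(x) \oplus x$$
$$0D \cdot x = \mathrm{xtime}(\mathrm{xtime}(\mathrm{xtime}(x))) \oplus \mathrm{xtime}(\mathrm{xtime}(x)) \oplus x$$
For MC, the biggest coefficient is 03
For IMC, the biggest coefficient is 0D
The coefficients for IMC have a higher Hamming weight, so IMC is the costlier operation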
-
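MixColumn (MC) & Inv MixColumn (IMC) Cont.
We observe that,
$$\begin{bmatrix} 0E & 0B & 0D & 09 \\ 09 & 0E & 0B & 0D \\ 0D & 09 & 0E & 0B \\ 0B & 0D & 09 & 0E \end{bmatrix} = \underbrace{\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}}_{(1)} \times \underbrace{\begin{bmatrix} 05 & 00 & 04 & 00 \\ 00 & 05 & 00 & 04 \\ 04 & 00 & 05 & 00 \\ 00 & 04 & 00 & 05 \end{bmatrix}}_{(2)}$$
$$05 \cdot x = \mathrm{xtime}(\mathrm{xtime}(x)) \oplus x$$
The biggest coefficient for Eq. 2 is 05
Eq. 1 we already have; the Eq. 2 calculation can be made before Eq. 1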
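The factorization of the IMC matrix into the MC matrix and the sparse matrix with coefficients 05 and 04 can be verified numerically. This is a minimal check in Python, assuming the AES field polynomial; `circulant`, `mat_mul`, and `times05` are illustrative helper names.

```python
def xtime(b):
    """Multiply a byte by 02 in GF(2^8)."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r


def circulant(first_row):
    """Build the 4x4 matrix whose row r is first_row rotated right by r."""
    return [[first_row[(c - r) % 4] for c in range(4)] for r in range(4)]


def mat_mul(a, b):
    """Multiply two 4x4 matrices over GF(2^8)."""
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for k in range(4):
                out[i][j] ^= gf_mul(a[i][k], b[k][j])
    return out


def times05(x):
    """05 * x computed as xtime(xtime(x)) XOR x."""
    return xtime(xtime(x)) ^ x


MC = circulant([0x02, 0x03, 0x01, 0x01])
IMC = circulant([0x0E, 0x0B, 0x0D, 0x09])
MODM = circulant([0x05, 0x00, 0x04, 0x00])

print(mat_mul(MC, MODM) == IMC)
```

The check prints True. The practical consequence is that a decryptor can reuse the MC circuit and only add the cheaper 05/04 stage in front of it, as in the decryption data path ISR + IAF + MI + ModM + MC + ARK.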
-
Data Path for Encryption/Decryption
Encryption: MI + AF + SR + MC + ARK
Decryption: ISR + IAF + MI + ModM + MC + ARK
-
Our Contributions
Design 3: Encryptor/Decryptor Core
S-Box & Inv. S-Box
-
Byte Substitution (Revisited)
[Figure: each State Matrix byte a_{i,j} (i, j = 0..3) is mapped through the 16x16 S-BOX to b_{i,j}; a block diagram with input IN shows the S-BOX and INV S-BOX built from IAF and MI blocks.]
-
MI: 1st Approach
MI with Lookup Table
  Same S-Box (MI) for encryption/decryption
  Memory requirements are halved
  BRAMs are used for storing MI values
  No initial time to prepare them
[Data path diagram with blocks IN, ISR, IAF, MI, AF, SR, MC + ARK, IMC + IARK, OUT, and E/D select signals]
-
MI: 2nd Approach
MI with Composite Fields GF((2^2)^2) & GF((2^4)^2)
MI Three-Stage Strategy (S. Morioka and A. Satoh, CHES 2002)
1. Map the element A ∈ GF(2^8) to a composite field F
2. Compute the Multiplicative Inverse over the field F
3. Map back from field F to GF(2^8)
[Diagram: 1st Transformation (M, GF(2^8) to field F) → MI Manipulation in GF(2^4) → 2nd Transformation (M^-1, field F to GF(2^8))]
-
MI Implementation
Let $A \in F$ and $A = A_H y + A_L$; then it can be shown that:
[Equations expressing $A^{16}$ and $A^{17}$ in terms of $A_H$, $A_L$ and $y$; the exact terms are not recoverable from the extracted slide text.]
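The powers 16 and 17 on this slide point to a well-known identity behind composite-field inversion: for any nonzero A in GF(2^8), A^17 lies in the 16-element subfield GF(2^4), so A^-1 = (A^17)^-1 · A^16 needs only an inversion in the small field. The sketch below checks that identity directly in the standard AES polynomial basis. It is an assumption-laden illustration, not the authors' circuit: it skips the isomorphism M to the field F and uses generic exponentiation.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def gf_pow(a, e):
    """Raise a to the power e in GF(2^8) by square-and-multiply."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


# The subfield GF(2^4) inside GF(2^8): all x with x^16 = x.
SUBFIELD = {x for x in range(256) if gf_pow(x, 16) == x}


def composite_style_inverse(a):
    """Compute a^-1 as (a^17)^-1 * a^16, inverting only inside GF(2^4)."""
    norm = gf_pow(a, 17)             # lies in the 16-element subfield
    norm_inv = gf_pow(norm, 14)      # inverse in GF(2^4): y^14 = y^-1
    return gf_mul(norm_inv, gf_pow(a, 16))


print(len(SUBFIELD), hex(composite_style_inverse(0x53)))
```

In the three-stage hardware strategy, the expensive parts of this computation are replaced by a handful of GF(2^4) multipliers and a tiny GF(2^4) inverter, which is why no BRAM is needed.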
-
Results
AES Algorithm Implementations
-
Metrics to measure?
1. Throughput := Clock frequency x No. of bits / No. of rounds
2. FPGA resources used: CLB slices, BRAMs, etc.
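The two metrics combine into the throughput/area ratio used in the result tables. A minimal sketch of both calculations follows; the 25 MHz clock is a hypothetical value chosen only for illustration, while the 331.5 Mbits/s and 2902 slices come from the first row of the sequential-design table.

```python
def throughput_mbits(freq_mhz, bits=128, rounds=10):
    """Throughput in Mbits/s = clock frequency (MHz) x block bits / rounds."""
    return freq_mhz * bits / rounds


def throughput_per_area(throughput, slices):
    """Throughput/area figure of merit in Mbits/s per CLB slice."""
    return throughput / slices


# Hypothetical 25 MHz iterative core: one 128-bit block every 10 clock cycles.
print(throughput_mbits(25.0))
# Ratio for a design reporting 331.5 Mbits/s in 2902 CLB slices.
print(round(throughput_per_area(331.5, 2902), 2))
```

For a fully pipelined design that accepts a new block every cycle, the effective divisor becomes 1 rather than the number of rounds, which is why the pipelined rows report much higher throughput.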
-
Sequential vs Pipeline Design

Sequential Design
                    Device (XCV)  Area (CLB slices)  Throughput (Mbits/s)  Throughput/Area
Gaj et al [1]       1000          2902               331.5                 0.11
Dandalis et al [2]  1000          5673               353                   0.06
Nazar et al         812           2744               258.5                 0.09

Pipeline Design
                    Device (XCV)  Area (CLB slices)  Throughput (Mbits/s)  Throughput/Area
Elbirt et al [3]    1000          9004               1940                  0.22
Nazar et al         2600          2136               2868                  1.29
-
MixColumn vs Inv MixColumn
               Device    BRAMs  CLB Slices (S)  Throughput (Mbits/s) (T)  T/S
McLoone et al  XCV3200E  102    7576            3239                      0.43
This design    XCV2600E  80     5677            4121                      0.73
Two approaches for MC/IMC
  Fewer BRAMs
  Fewer slices
  Higher throughput than reported to date
-
S-Box vs Inv S-Box
              Device    BRAMs     CLB Slices (S)  Throughput (Mbits/s) (T)  T/S
McLoone []    XCV3200E  102       7576            3239                      0.43
E/D GF(2^8)   XCV2600E  80        6676            3840                      0.58
E/D GF(2^4)   XCV2600E  No BRAMs  13416           3136                      0.24
Two approaches for MI
  Key scheduling included
  No initial delay
The first design uses a look-up table for MI: fast, but with high memory requirements
The second design uses the composite field approach for MI: slower, with lower memory requirements
Both are efficient compared to the reported design
-
[Figure: block diagram with PC, PCI, PCI 9054, Xilinx xc2v1000 (FIFO) (DMA), and Xilinx xc2v3000.]
-
Our Contributions
Elliptic Curve Cryptography
-
Elliptic Curve Cryptography
Scalar Multiplication: Q = kP
Elliptic Curve Operations: Point doubling Q = 2P, Point addition R = P + Q
GF(2^m) Arithmetic: Multiplication, Squaring, Addition, etc.
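The three layers on this slide fit together as follows: scalar multiplication Q = kP is built from point doublings and point additions, which in turn are built from field arithmetic. A minimal double-and-add sketch is shown below. To keep it self-contained, the "points" are integers in an additive group modulo a hypothetical n = 1009, which is only a stand-in for real curve points; the function and parameter names are illustrative.

```python
def scalar_mult(k, p, add, double, identity):
    """Left-to-right double-and-add: returns k*P using only double() and add()."""
    q = identity
    for bit in bin(k)[2:]:        # scan the bits of k from most to least significant
        q = double(q)             # point doubling, Q = 2Q
        if bit == "1":
            q = add(q, p)         # point addition, Q = Q + P
    return q


# Stand-in group (assumption): integers modulo n under addition.
N = 1009


def toy_add(a, b):
    return (a + b) % N


def toy_double(a):
    return (2 * a) % N


print(scalar_mult(13, 7, toy_add, toy_double, 0))
```

With real curve arithmetic, `add` and `double` would be the Hessian-form point addition and doubling over GF(2^191), and the number of loop iterations would equal the bit length of k.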
-
GF(2^191) Arithmetic: Square
$$A = a_3 x^3 + a_2 x^2 + a_1 x + a_0$$
$$A^2 = a_3 x^6 + a_2 x^4 + a_1 x^2 + a_0$$
A = 1111
A^2 = 1010101
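Squaring a binary polynomial simply inserts a zero between consecutive coefficient bits, as the 1111 to 1010101 example shows. A minimal sketch, representing polynomials over GF(2) as Python integers (bit i is the coefficient of x^i) and leaving out the modular reduction that must follow in GF(2^191):

```python
def gf2_square(a):
    """Square a GF(2) polynomial by moving coefficient bit i to bit 2*i."""
    result = 0
    i = 0
    while a:
        if a & 1:
            result |= 1 << (2 * i)
        a >>= 1
        i += 1
    return result


print(bin(gf2_square(0b1111)))
```

In hardware this is pure wiring, so the cost of a field squaring is dominated by the reduction step.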
-
GF(2^191) Arithmetic: Reduction
-
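Karatsuba Multiplier GF(2^191)
$$A = \sum_{i=0}^{m-1} a_i x^i = \sum_{i=m/2}^{m-1} a_i x^i + \sum_{i=0}^{m/2-1} a_i x^i = x^{m/2} \sum_{i=0}^{m/2-1} a_{i+m/2}\, x^i + \sum_{i=0}^{m/2-1} a_i x^i = x^{m/2} A^H + A^L$$
$$B = \sum_{i=0}^{m-1} b_i x^i = \sum_{i=m/2}^{m-1} b_i x^i + \sum_{i=0}^{m/2-1} b_i x^i = x^{m/2} \sum_{i=0}^{m/2-1} b_{i+m/2}\, x^i + \sum_{i=0}^{m/2-1} b_i x^i = x^{m/2} B^H + B^L$$
Then the polynomial multiplication of A and B is:
$$C = x^m A^H B^H + (A^H B^L + A^L B^H)\, x^{m/2} + A^L B^L$$
The idea of the Karatsuba algorithm is that the above product can be written as:
$$C = x^m A^H B^H + \left(A^H B^H + A^L B^L + (A^H + A^L)(B^L + B^H)\right) x^{m/2} + A^L B^L = x^m C^H + C^L$$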
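The Karatsuba rewriting replaces four half-size products by three. As a software sketch of the recursion, again with GF(2) polynomials stored as integers, the code below compares a recursive Karatsuba multiplier against a plain shift-and-XOR multiplier. The 8-bit recursion cutoff and the function names are arbitrary illustrative choices, and the result is left unreduced.

```python
def clmul(a, b):
    """Schoolbook carry-less multiplication of two GF(2) polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def karatsuba(a, b, m):
    """Multiply GF(2) polynomials a, b of fewer than m coefficient bits."""
    if m <= 8:                                  # small operands: use schoolbook
        return clmul(a, b)
    h = (m + 1) // 2                            # split point, about m/2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    hh = karatsuba(a_hi, b_hi, m - h)           # A^H * B^H
    ll = karatsuba(a_lo, b_lo, h)               # A^L * B^L
    cross = karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, h)
    middle = cross ^ hh ^ ll                    # equals A^H B^L + A^L B^H
    return (hh << (2 * h)) ^ (middle << h) ^ ll


a = pow(3, 500, 1 << 191)   # arbitrary 191-bit test operands
b = pow(5, 400, 1 << 191)
print(karatsuba(a, b, 191) == clmul(a, b))
```

Additions in GF(2^m) are XORs, so the extra additions introduced by Karatsuba are cheap compared with the multiplication they save.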
-
Point addition GF(2^191)
Hessian Form
$$P_3(x_3, y_3, z_3) = P_1(x_1, y_1, z_1) + P_2(x_2, y_2, z_2)$$
[Coordinate formulas for x_3, y_3, z_3 and their intermediate products are not recoverable from the extracted slide text.]
-
Point doubling GF(2^191)
Hessian Form
$$P_2(x_2, y_2, z_2) = P_1(x_1, y_1, z_1) + P_1(x_1, y_1, z_1)$$
[Coordinate formulas for x_2, y_2, z_2 and their intermediate products are not recoverable from the extracted slide text.]
-
Performance results
                                No. of CLB slices  Timings (ns)
Karatsuba Multiplier GF(2^191)  8721               43.123
Point addition GF(2^191)        9894               863.9
Point doubling GF(2^191)        8531               422.02
Tool: Xilinx Foundation F4.1i
Device: XCV2600E
For ECC scalar multiplication
  Maximum reported timings := 170 s [Gerardo, CHES 2000]
  Estimated timings :=
-
Conclusions
A promising AES Encryptor/Decryptor Core (contributions for AES S-Box/Inv S-Box)
  Using look-up table for S-Box
  Using Composite Fields GF(2^4)
An optimized AES Encryptor/Decryptor Core (contributions for AES MC/IMC)
  Using a modified version of IMC
Efficient arithmetic for ECC (squaring, multiplication, point addition, point doubling)
Future work: completion of ECC scalar multiplication
Thesis writing and defense