AES effective software implementation
![Page 1: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/1.jpg)
Effective Software Implementation of
Advanced Encryption Standard December 2014
Roman Oliynykov
Professor at Information Technologies Security Department
Kharkov National University of Radioelectronics
Head of Scientific Research Department JSC “Institute of Information Technologies”
Ukraine
Visiting professor at Samsung Advanced Technology Training Institute
![Page 2: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/2.jpg)
Outline
• A few words about myself
• Brief history of AES/Rijndael
• AES properties
• Direct AES implementation and problems with it
• Methods for effective encryption implementation (proposed by Rijndael authors in their submission to AES competition)
• Decryption optimization
• Conclusions
![Page 3: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/3.jpg)
About myself (I)
• I’m from Ukraine (Eastern Europe), host country of the Euro 2012 football championship
• I live in Kharkov, the second biggest city in the country (population 1.5 million), in Eastern Ukraine (near Russia)
• Kharkov is the former capital of Soviet Ukraine (1918-1934)
• three Nobel prize winners worked at Kharkov University
![Page 4: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/4.jpg)
About myself (II)
• Professor at Information Technologies Security Department at Kharkov National University of Radioelectronics
  • courses on computer networks and operating system security, special mathematics for cryptographic applications
• Head of Scientific Research Department at JSC “Institute of Information Technologies”
  • scientific interests: synthesis and cryptanalysis of symmetric cryptographic primitives
• Visiting professor at Samsung Advanced Technology Training Institute
  • courses on computer networks and operating system security, software security, effective application and implementation of symmetric cryptography
![Page 5: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/5.jpg)
Modern and effective solution: Advanced Encryption Standard (AES)
• result of an international public cryptographic competition (1997-2000)
• chosen from among 15 candidate ciphers (developed in the US, Belgium, Denmark, Germany, Israel, Japan, Switzerland, Armenia, etc.)
• original name is Rijndael (developed by researchers from Belgium)
• received the most votes at the 3rd AES conference; other candidates such as Twofish (US), MARS (US, IBM), E2 (Japan, Camellia predecessor) and Serpent (Israel) also remain strong
• the most researched block cipher in the world (2014, open publications)
• basis for the development of many other symmetric primitives
![Page 6: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/6.jpg)
AES properties
• block length is 128 bits only (a subset of Rijndael, which supports 128-, 192- and 256-bit blocks)
• key length is 128, 192 or 256 bits
• uses a Substitution-Permutation Network (SPN)
• number of rounds (10, 12, 14) depends on key length
• quite transparent design, algebraic structure (theoretically may be vulnerable to algebraic analysis)
• quite effective in software (32-bit platforms) and hardware implementation
![Page 7: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/7.jpg)
AES parameters: key length, block size, number of rounds
![Page 8: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/8.jpg)
AES: representation of the processed bytes as a “cipher state”
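A minimal sketch of the state layout, assuming the FIPS-197 convention that the 16 input bytes fill a 4x4 array column by column. The function names are illustrative and do not come from the slides.

```python
def bytes_to_state(block):
    """Arrange 16 input bytes as a 4x4 state: state[r][c] = block[r + 4*c]."""
    assert len(block) == 16
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state):
    """Read the state back column by column."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


block = bytes(range(16))
state = bytes_to_state(block)
```

With this layout each column of the state is four consecutive input bytes, which is why a column fits naturally into one 32-bit word on 32-bit platforms.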
![Page 9: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/9.jpg)
AES: main steps
• running the key schedule procedure: generation of all round keys
• running the encryption or decryption procedure
• or, for compact hardware implementation, sequential operations:
  • generation of the current round key
  • one encryption round
![Page 10: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/10.jpg)
AES: high-level structure (pseudocode)
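The control flow of the cipher can be sketched as follows. This is a structural outline only: the four transformations are passed in as parameters, and the stand-in functions used here merely record the call order (they are assumptions for illustration, not real AES steps).

```python
def cipher(state, round_keys, sub_bytes, shift_rows, mix_columns, add_round_key):
    """Apply Nr rounds; round_keys holds Nr + 1 round keys."""
    nr = len(round_keys) - 1
    state = add_round_key(state, round_keys[0])
    for rnd in range(1, nr):
        state = sub_bytes(state)
        state = shift_rows(state)
        state = mix_columns(state)
        state = add_round_key(state, round_keys[rnd])
    # The final round omits MixColumns.
    state = sub_bytes(state)
    state = shift_rows(state)
    return add_round_key(state, round_keys[nr])


# Stand-in transformations that only record the order of calls.
trace = []


def _step(name):
    def run(state, *_):
        trace.append(name)
        return state
    return run


# 11 round keys correspond to a 128-bit key (10 rounds).
cipher(None, list(range(11)), _step("SubBytes"), _step("ShiftRows"),
       _step("MixColumns"), _step("AddRoundKey"))
```

For a 128-bit key the trace contains 11 AddRoundKey calls, 10 SubBytes and ShiftRows calls, and only 9 MixColumns calls.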
![Page 11: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/11.jpg)
AES: high-level structure (picture for 128 bit key)
![Page 12: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/12.jpg)
AES: SubBytes transformation
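SubBytes replaces every state byte through one fixed 256-entry table. Implementations normally store that table as a constant; the sketch below regenerates it from its FIPS-197 definition (multiplicative inverse in GF(2^8), then an affine transformation) only so that the example is self-contained. The helper names are illustrative.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11b)."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
    return r


def make_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        # Brute-force inverse; 0 has no inverse and maps to 0.
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for k in range(1, 5):
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF  # rotate left by k
        sbox.append(s ^ 0x63)
    return sbox


SBOX = make_sbox()


def sub_bytes(state):
    """One table look-up per byte: 16 look-ups for a 16-byte state."""
    return [SBOX[b] for b in state]
```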
![Page 13: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/13.jpg)
AES: ShiftRows transformation
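A short sketch of ShiftRows, assuming the state is kept as a flat list of 16 bytes in column-major order (byte at row r, column c has index r + 4*c). Row r is rotated left by r positions, so row 0 stays in place and 12 bytes move.

```python
def shift_rows(s):
    """Rotate row r of the column-major state left by r positions."""
    return [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


shifted = shift_rows(list(range(16)))
```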
![Page 14: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/14.jpg)
AES: MixColumns transformation
![Page 15: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/15.jpg)
AES: AddRoundKey transformation
![Page 16: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/16.jpg)
AES round key generation (key expansion)
NB: not all key lengths (128, 192, 256) need to be supported; for many applications a single key length is enough
![Page 17: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/17.jpg)
AES round key generation: RotWord
![Page 18: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/18.jpg)
AES round key generation: SubBytes
![Page 19: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/19.jpg)
AES round key generation: round constant application
NB: without Rcon, equal blocks in the plaintext and the key (repeats of 1, 2 or 4 bytes) would produce equal blocks in the ciphertext
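The key expansion steps (RotWord, the S-box substitution and the round constant) can be sketched for a 128-bit key as below. This is a byte-oriented illustration under the FIPS-197 definitions; the function names are assumptions, and the S-box is regenerated only to keep the block self-contained.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo 0x11b."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
    return r


def make_sbox():
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for k in range(1, 5):
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = make_sbox()


def expand_key_128(key):
    """Return the 44 round-key words (4-byte lists) for a 16-byte key."""
    assert len(key) == 16
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(w[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]        # RotWord
            temp = [SBOX[b] for b in temp]    # S-box applied to each byte
            temp[0] ^= rcon                   # round constant
            rcon = gf_mul(rcon, 2)            # next constant: 01, 02, 04, ...
        w.append([a ^ b for a, b in zip(w[i - 4], temp)])
    return w


words = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
```

The key used here is the example key from FIPS-197; words 4..7 form the first derived round key and words 40..43 the last one.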
![Page 20: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/20.jpg)
AES round key sequence
![Page 21: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/21.jpg)
AES decryption (direct presentation): reverse operations in different order
![Page 22: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/22.jpg)
AES/Rijndael design goals
• be extremely fast on 32-bit platforms (+++)
• be compact in hardware implementation, with a small number of gates (++)
• possibility to implement the cipher on 8-bit smart-card processors typical of the 1990s (++)
• cryptographic strength (+)
![Page 23: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/23.jpg)
Direct implementation of AES round function: SubBytes
16 operations (byte substitution)
![Page 24: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/24.jpg)
Direct implementation of AES round function: ShiftRows
12 operations (byte permutation)
![Page 25: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/25.jpg)
AES: MixColumns transformation
60 operations (logical and conditional):
• 3+ operations for each input byte (48+ total):
  • shift and conditional XOR (mult by 02)
  • XOR (mult by 03)
• 3 XORs for each row (12 total)
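The byte-level operations counted above can be sketched for one column as follows. Multiplication by 02 is a shift with a conditional XOR, and multiplication by 03 adds one more XOR; `xtime` and `mix_column` are illustrative names.

```python
def xtime(b):
    """Multiply by 02 in GF(2^8): shift left, then conditionally reduce."""
    b <<= 1
    if b & 0x100:          # the data-dependent condition mentioned above
        b ^= 0x11b
    return b


def mix_column(col):
    """Direct MixColumns on one 4-byte column."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


result = mix_column([0xDB, 0x13, 0x53, 0x45])
```

The branch inside `xtime` is what makes the direct form awkward for pipelined processors.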
![Page 26: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/26.jpg)
Direct implementation of AES round function
• SubBytes: 16 operations (byte substitution)
• ShiftRows: 12 operations (byte permutation)
• MixColumns: 60 or even more operations (conditions will prevent effective pipelining)
• AddRoundKey: 16 operations (logical)
TOTAL: 104 or more operations per round (16 + 12 + 60 + 16)
![Page 27: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/27.jpg)
AES effective software implementation: 32-bit platform
• three different operations can be combined into a single (!) table look-up:
  • SubBytes (non-linear)
  • ShiftRows (linear)
  • MixColumns (linear)
• the cipher then consists of table look-ups and round key additions
![Page 28: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/28.jpg)
AES effective software implementation: MixColumns
Matrix multiplication: 7 operations (4 memory look-ups + 3 XORs) instead of 60:
• 32-bit XOR of 4 columns
• each column depends on one input byte only
• all 4 bytes in each column are precomputed and stored in advance
![Page 29: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/29.jpg)
AES round function operations sequence variants:
Original: SubBytes → ShiftRows → MixColumns
Equivalent: ShiftRows → SubBytes → MixColumns
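The two orderings are equivalent because SubBytes acts on each byte independently while ShiftRows only moves bytes around. A quick check, using a stand-in byte substitution (explicitly not the AES S-box, since the property holds for any per-byte map) and a column-major 16-byte state:

```python
def shift_rows(s):
    """Rotate row r of the column-major state left by r positions."""
    return [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def sub_bytes(s, sbox):
    """Apply the same byte substitution to every state byte."""
    return [sbox[b] for b in s]


# Stand-in substitution table (NOT the AES S-box): any byte map will do.
toy_sbox = [(7 * x + 3) % 256 for x in range(256)]
state = [(i * 37 + 11) % 256 for i in range(16)]

original_order = shift_rows(sub_bytes(state, toy_sbox))
swapped_order = sub_bytes(shift_rows(state), toy_sbox)
```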
![Page 30: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/30.jpg)
AES effective software implementation: MixColumns and SubBytes in one precomputed table
SubBytes and MixColumns: 7 operations (4 memory look-ups + 3 XORs) total:
• 32-bit XOR of 4 columns
• each column depends on one input byte only (already passed through the S-box)
• all 4 bytes in each column are precomputed and stored in advance
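A sketch of how the combined tables can be built and used for one output column. It assumes the common layout in which table 0 holds the bytes (02·S[x], S[x], S[x], 03·S[x]) packed into a 32-bit word, with the other three tables being byte rotations of it; names such as `TE0` and `column_via_tables` are illustrative.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo 0x11b."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
    return r


def make_sbox():
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for k in range(1, 5):
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def ror32(w, n):
    return ((w >> n) | (w << (32 - n))) & 0xFFFFFFFF


SBOX = make_sbox()
TE0 = [(gf_mul(s, 2) << 24) | (s << 16) | (s << 8) | gf_mul(s, 3) for s in SBOX]
TE1 = [ror32(w, 8) for w in TE0]
TE2 = [ror32(w, 16) for w in TE0]
TE3 = [ror32(w, 24) for w in TE0]


def column_via_tables(a0, a1, a2, a3):
    """SubBytes + MixColumns for one column: 4 look-ups and 3 XORs."""
    return TE0[a0] ^ TE1[a1] ^ TE2[a2] ^ TE3[a3]


def column_direct(a0, a1, a2, a3):
    """Reference: S-box each byte, then multiply by the MixColumns matrix."""
    s = [SBOX[a] for a in (a0, a1, a2, a3)]
    out = 0
    for row in [(2, 3, 1, 1), (1, 2, 3, 1), (1, 1, 2, 3), (3, 1, 1, 2)]:
        byte = 0
        for coef, v in zip(row, s):
            byte ^= gf_mul(v, coef)
        out = (out << 8) | byte
    return out
```

Each table entry is one precomputed column of the matrix product, so the run-time work reduces to look-ups and 32-bit XORs with no conditions.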
![Page 31: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/31.jpg)
Fragment of OpenSSL AES source code (based on the Rijndael authors' implementation)
4 tables are needed; size of each table is 256 entries * 4 bytes = 1 kByte
![Page 32: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/32.jpg)
Fragment of OpenSSL AES source code (based on the Rijndael authors' implementation)
ShiftRows is implemented as an ordinary shift and mask of a 32-bit register; SubBytes and MixColumns are implemented as memory look-ups (8 bit → 32 bit)
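The same idea can be sketched end to end for AES-128 in Python. This is an illustrative reconstruction of the four-table technique, not the OpenSSL code itself: the table and function names are assumptions, and the tables are generated at import time rather than stored as constants.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo 0x11b."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
    return r


def make_sbox():
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for k in range(1, 5):
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def ror32(w, n):
    return ((w >> n) | (w << (32 - n))) & 0xFFFFFFFF


SBOX = make_sbox()
TE0 = [(gf_mul(s, 2) << 24) | (s << 16) | (s << 8) | gf_mul(s, 3) for s in SBOX]
TE1 = [ror32(w, 8) for w in TE0]
TE2 = [ror32(w, 16) for w in TE0]
TE3 = [ror32(w, 24) for w in TE0]


def expand_key_128(key):
    """Return the 44 round-key words as 32-bit big-endian integers."""
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = w[i - 1]
        if i % 4 == 0:
            t = ((t << 8) | (t >> 24)) & 0xFFFFFFFF                  # RotWord
            t = int.from_bytes(bytes(SBOX[b] for b in t.to_bytes(4, "big")), "big")
            t ^= rcon << 24                                          # Rcon
            rcon = gf_mul(rcon, 2)
        w.append(w[i - 4] ^ t)
    return w


def encrypt_block(block, rk):
    """Encrypt one 16-byte block with table look-ups, shifts, masks and XORs."""
    s = [int.from_bytes(block[4 * i:4 * i + 4], "big") ^ rk[i] for i in range(4)]
    for rnd in range(1, 10):
        # The shift/mask selects the byte ShiftRows would move into column c.
        s = [TE0[s[c] >> 24]
             ^ TE1[(s[(c + 1) % 4] >> 16) & 0xFF]
             ^ TE2[(s[(c + 2) % 4] >> 8) & 0xFF]
             ^ TE3[s[(c + 3) % 4] & 0xFF]
             ^ rk[4 * rnd + c] for c in range(4)]
    # Last round: plain S-box look-ups, no MixColumns.
    out = [((SBOX[s[c] >> 24] << 24)
            | (SBOX[(s[(c + 1) % 4] >> 16) & 0xFF] << 16)
            | (SBOX[(s[(c + 2) % 4] >> 8) & 0xFF] << 8)
            | SBOX[s[(c + 3) % 4] & 0xFF]) ^ rk[40 + c] for c in range(4)]
    return b"".join(w.to_bytes(4, "big") for w in out)
```

The sketch can be checked against the published FIPS-197 example vectors. It makes no attempt at constant-time behaviour, so it is suitable for study only.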
![Page 33: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/33.jpg)
AES effective software implementation: extra memory optimization
Reducing memory use: a single 1 kByte table instead of 4 tables of 1 kByte each
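A sketch of the single-table variant, assuming the other three tables are byte rotations of the first, so that each missing table is replaced by a rotation at look-up time. `TE0` and `column_single_table` are illustrative names.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo 0x11b."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
    return r


def make_sbox():
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for k in range(1, 5):
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def ror32(w, n):
    """Rotate right: one >>, one << and one | per use."""
    return ((w >> n) | (w << (32 - n))) & 0xFFFFFFFF


SBOX = make_sbox()
# The only table kept in memory: 256 entries of 4 bytes each.
TE0 = [(gf_mul(s, 2) << 24) | (s << 16) | (s << 8) | gf_mul(s, 3) for s in SBOX]


def column_single_table(a0, a1, a2, a3):
    """One output column using TE0 only, rotating instead of using TE1..TE3."""
    return (TE0[a0]
            ^ ror32(TE0[a1], 8)
            ^ ror32(TE0[a2], 16)
            ^ ror32(TE0[a3], 24))
```

The trade-off is a few extra shift and OR operations per look-up in exchange for a table set four times smaller.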
![Page 34: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/34.jpg)
Main table size for the fastest and the compact optimized 32-bit AES implementations
• fastest: (4 bytes) x (256 different entries to S-box) x (4 different positions for ShiftRows) == 4 kBytes
• compact optimized: (4 bytes) x (256 different entries to S-box) == 1 kByte
  • three additional operations in C (<<, >>, | or ^) are needed besides a table look-up
NB: to reach the highest performance, the precomputed tables and the processed data must fit into the L1 processor cache (32-64 kBytes for modern processors)
![Page 35: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/35.jpg)
Number of 32-bit operations needed for a single block encryption in the main transformation (with all round keys available)
• ((4 look-ups) + (3 XORs)) * (4 columns) == 28 operations / round
• 4 XORs with round keys == 4 operations / round
• (28 + 4) * (9 rounds) == 288 operations for high strength encryption of 9 rounds (!)
• (16 operations on SubBytes) + (24 operations on ShiftRows) + (4 XORs with round keys) == 44 operations in the last round
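The counts above can be reproduced with a few lines of arithmetic. The combined figure for nine full rounds plus the last round is derived here from the numbers on the slide and is not stated in the original; it leaves out the initial round key addition.

```python
LOOKUPS_PER_COLUMN = 4
XORS_PER_COLUMN = 3
COLUMNS = 4
KEY_XORS_PER_ROUND = 4

ops_per_full_round = (LOOKUPS_PER_COLUMN + XORS_PER_COLUMN) * COLUMNS + KEY_XORS_PER_ROUND
ops_nine_rounds = ops_per_full_round * 9

# Last round: byte-wise SubBytes, shift-and-mask ShiftRows, round key XORs.
ops_last_round = 16 + 24 + 4

# Derived total (assumption: initial AddRoundKey not included).
ops_total = ops_nine_rounds + ops_last_round
```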
![Page 36: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/36.jpg)
AES decryption: high-level structure (pseudocode)
![Page 37: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/37.jpg)
AES decryption: optimization
• SubBytes() and ShiftRows() transformations commute, so their order can be changed
• The column mixing operations, MixColumns() and InvMixColumns(), are linear with respect to the column input, which means InvMixColumns(state xor Round Key) == InvMixColumns(state) xor InvMixColumns(Round Key)
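The linearity property can be checked numerically on a single column. The sketch assumes the standard InvMixColumns matrix with coefficients 0e, 0b, 0d, 09; the column values are arbitrary test data and the function names are illustrative.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo 0x11b."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
    return r


INV_ROWS = [
    (0x0E, 0x0B, 0x0D, 0x09),
    (0x09, 0x0E, 0x0B, 0x0D),
    (0x0D, 0x09, 0x0E, 0x0B),
    (0x0B, 0x0D, 0x09, 0x0E),
]


def inv_mix_column(col):
    """InvMixColumns on one 4-byte column."""
    out = []
    for row in INV_ROWS:
        byte = 0
        for coef, v in zip(row, col):
            byte ^= gf_mul(coef, v)
        out.append(byte)
    return out


def xor_cols(a, b):
    return [x ^ y for x, y in zip(a, b)]


state_col = [0x8E, 0x4D, 0xA1, 0xBC]
key_col = [0xA0, 0xFA, 0xFE, 0x17]

lhs = inv_mix_column(xor_cols(state_col, key_col))
rhs = xor_cols(inv_mix_column(state_col), inv_mix_column(key_col))
```

Because both sides agree, InvMixColumns can be applied to the round keys once, in advance, which lets the decryption rounds follow the same look-up-then-XOR pattern as encryption.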
![Page 38: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/38.jpg)
AES optimized decryption with changed round keys
![Page 39: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/39.jpg)
Additional details on AES implementation
• two sets of tables for encryption
  • main optimized set (MixColumns, ShiftRows and SubBytes)
  • separate S-box array for the last round
• two sets of tables for decryption (complexity is the same as for encryption)
  • main optimized set (InvMixColumns, InvShiftRows and InvSubBytes)
  • separate inverse S-box array for the last round
NB: ECB decryption is not needed for most block cipher modes of operation
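Counter (CTR) mode is one example of a mode that uses only the forward block operation for both directions. The sketch below makes this visible by plugging in a stand-in keyed function built from SHA-256, which is explicitly not AES and is not even invertible; the nonce/counter layout and all names are assumptions for illustration.

```python
import hashlib


def toy_block_encrypt(key, block):
    """Stand-in 16-byte keyed function (NOT AES, and not invertible)."""
    return hashlib.sha256(key + block).digest()[:16]


def ctr_xor(key, nonce, data, block_encrypt=toy_block_encrypt):
    """Encrypt or decrypt: XOR data with E_k(nonce || counter)."""
    out = bytearray()
    for i in range(0, len(data), 16):
        counter_block = nonce + (i // 16).to_bytes(8, "big")
        keystream = block_encrypt(key, counter_block)
        out.extend(b ^ k for b, k in zip(data[i:i + 16], keystream))
    return bytes(out)


key = b"k" * 16
nonce = b"n" * 8
message = b"CTR mode needs only the forward cipher."
ciphertext = ctr_xor(key, nonce, message)
recovered = ctr_xor(key, nonce, ciphertext)
```

The same function recovers the message, so an implementation limited to such modes can omit the decryption tables entirely.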
![Page 40: AES effecitve software implementation](https://reader034.vdocuments.mx/reader034/viewer/2022051414/55cadc37bb61eb322b8b481c/html5/thumbnails/40.jpg)
Conclusions
direct AES implementation is very slow (requires many byte operations and conditions)
three different round function operations can be combined into a single table look-up
with an effective implementation, AES consists of table look-ups and round key additions
the fastest version of AES requires 4 kB of memory for tables; the fast but compact version requires 1 kB
fast AES decryption has the same speed as encryption and uses a changed order of round function operations with modified round keys