
ISSN – 1453 – 1119

RECONFIGURABLE HARDWARE (FPGA) IMPLEMENTATION OF CRYPTOGRAPHIC ALGORITHMS - AES DECRYPTION

Paul BURCIU1, Ionuţ Mihai SIMA2

University of Piteşti, Electronics Communications & Computer Faculty E-mail: [email protected], [email protected]

Keywords: hardware implementation, cryptographic algorithm, FPGA platform, cryptographic module, ASIC chip, AES, VHDL code, hardware simulation.

Abstract: The hardware implementation of cryptographic algorithms is a timely approach that provides efficient security solutions in terms of both processing speed and power consumption. Present-day FPGA platforms, the physical basis for such implementations, are not a new engineering solution, yet they stand out as the most efficient way to put cryptographic algorithms into practice, yielding cryptographic modules that can be optimized in various respects and, ultimately, ASIC chips of unmatched performance. This paper presents a new hardware implementation of the deciphering block of the AES (Advanced Encryption Standard) symmetric cryptographic algorithm, written in VHDL, together with a hardware simulation of the resulting deciphering module.

1. INTRODUCTION

Cryptographic algorithms have become the main means of protecting sensitive data. Confidentiality is the security objective addressed by their hardware implementation and by their integration into present-day communication systems.

Among the various cryptographic algorithms, the symmetric ones may be considered the best suited to hardware implementation, because their mathematical mechanisms rely on arithmetic operations that can be executed by combinational or sequential logic circuits, that is, both by ordinary logic gates and by Finite State Machines (FSM), according to the Church-Turing Thesis [1]; in other words, by VLSI digital integrated circuits.

The international contest organized by the National Institute of Standards and Technology (NIST) in 1997 sought a new symmetric cryptographic algorithm to replace the old DES, which had been successfully attacked and proved to be insecure. In 2000 the Belgian algorithm RIJNDAEL was selected as the winner, and in 2001 it was designated the American cryptographic standard, under the name Advanced Encryption Standard (AES), through FIPS PUB 197 [2].

Hardware implementation of cryptographic algorithms is a timely approach that provides efficient security solutions in terms of both processing speed and power consumption. Present-day FPGA platforms, although not a new engineering solution, stand out as the most efficient way to put cryptographic algorithms into practice, yielding cryptographic modules that can be optimized in various respects and, ultimately, ASIC chips of unmatched performance.

This paper presents a new hardware implementation of the deciphering block of the AES symmetric cryptographic algorithm, written in VHDL, together with a hardware simulation of the resulting deciphering module.


2. THE BASIC CRYPTOGRAPHIC ARCHITECTURE

Figure 1 - The basic AES-128 cryptographic architecture (1st variant).

Both architectures of the AES block deciphering algorithm, shown in Figure 1 and Figure 2, begin with an initial round whose input is a 128-bit ciphertext, organized as 4 columns of 4 bytes each or, in other words, 4 words of 32 bits each. The ciphertext is XORed with the initial round key, which is likewise a data block of 4 words of 32 bits each. The inverse algorithm, like the encryption algorithm, has 3 possible versions [2], with Nk = 4, 6 or 8 (the number of 4-byte columns, or 32-bit words, of the key), corresponding to AES-128, AES-192 and AES-256. In all versions Nb = 4 (the number of 4-byte columns, or 32-bit words, of the data block), namely 128 bits. The architectures in Figure 1 and Figure 2 refer to the AES-128 version.
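The State layout and the initial round described above can be sketched in a few lines of Python. This is only an illustration and not the authors' VHDL: the helper names `to_words` and `add_round_key` are hypothetical, the sample ciphertext and round key are taken from the AES-128 example vectors in [2], and the sketch assumes that the key used in the initial decryption round is the last one produced by the key expansion.

```python
def to_words(block: bytes) -> list[int]:
    """Split a 16-byte block into Nb = 4 big-endian 32-bit words (the State columns)."""
    if len(block) != 16:
        raise ValueError("AES operates on 128-bit (16-byte) blocks")
    return [int.from_bytes(block[4 * c:4 * c + 4], "big") for c in range(4)]


def add_round_key(state: list[int], round_key: list[int]) -> list[int]:
    """XOR every State column with the matching 32-bit round-key word."""
    return [s ^ k for s, k in zip(state, round_key)]


# Sample values from the AES-128 example in [2]: the ciphertext and the last
# round key of the expansion (assumed here to feed the initial decryption round).
ciphertext = to_words(bytes.fromhex("69c4e0d86a7b0430d8cdb78070b4c55a"))
round_key = to_words(bytes.fromhex("13111d7fe3944a17f307a78b4d2b30c5"))

state = add_round_key(ciphertext, round_key)
print(" ".join(f"{word:08x}" for word in state))
# prints: 7ad5fda7 89ef4e27 2bca100b 3d9ff59f
```

Because the operation is a plain bitwise XOR on four 32-bit words, it maps directly onto 128 two-input XOR gates in hardware.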

Figure 2 - The basic AES-128 cryptographic architecture (2nd variant).

The 3 versions use 3 different numbers of rounds, as shown in the table in Figure 3.

Figure 3 - AES versions characteristics.

Name      Key Length (Nk words)   Block Size (Nb words)   Number of Rounds (Nr)
AES-128   4                       4                       10
AES-192   6                       4                       12
AES-256   8                       4                       14

UNIVERSITY OF PITESTI – ELECTRONICS AND COMPUTERS SCIENCE, SCIENTIFIC BULLETIN, No. 9, Vol. 2, 2009

After the initial round, AES-128 performs 9 identical decryption rounds. Each round uses its own round key, supplied by the key schedule mechanism, which starts the key generation process from the initial secret key. Every round consists of 4 main operations: Inv Shift Rows (every row of the State is cyclically shifted to the right by a row-specific number of positions, undoing the shift applied during encryption), Inv Sub Bytes (every byte of the State is substituted using an inverse substitution table), Inv Mix Columns (the operation that gives rise to the two implementation variants) and Add Round Key (the round key is XORed with the State resulting from the previous operation). This sequence of operations is iterated 9 times. The column mixing operations (Mix Columns() and Inv Mix Columns()) are linear with respect to the column input [2], which means:

\[
\mathrm{InvMixColumns}(\mathrm{State} \oplus \mathrm{Round\ Key}) = \mathrm{InvMixColumns}(\mathrm{State}) \oplus \mathrm{InvMixColumns}(\mathrm{Round\ Key})
\]

This yields two variants of the decryption algorithm: in the first, Inv Mix Columns is applied separately to the State and to the Round Key and the two results are then XORed; in the second, the State is first XORed with the Round Key and Inv Mix Columns is then applied to the result.
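The equivalence of the two orderings can be checked numerically. The Python sketch below is an illustration only: `gmul`, `inv_mix_columns` and `xor_states` are hypothetical helper names, the State and Round Key are random test data, and the matrix coefficients are those specified for Inv Mix Columns in [2].

```python
import random


def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


# Inv Mix Columns matrix from [2]; each row is a rotation of the first one.
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def inv_mix_columns(state: list[list[int]]) -> list[list[int]]:
    """Apply the matrix to every column; `state` is a list of 4 columns of 4 bytes."""
    out = []
    for col in state:
        new_col = []
        for row in INV_MIX:
            acc = 0
            for coeff, byte in zip(row, col):
                acc ^= gmul(coeff, byte)
            new_col.append(acc)
        out.append(new_col)
    return out


def xor_states(a: list[list[int]], b: list[list[int]]) -> list[list[int]]:
    """Byte-wise XOR of two 4x4 byte arrays."""
    return [[x ^ y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


rng = random.Random(2009)  # fixed seed so the run is repeatable
state = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]
round_key = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]

# Mix the State and the Round Key separately, then XOR the results.
variant_1 = xor_states(inv_mix_columns(state), inv_mix_columns(round_key))
# XOR the State with the Round Key, then mix once.
variant_2 = inv_mix_columns(xor_states(state, round_key))

print(variant_1 == variant_2)  # prints: True
```

The first ordering needs two Inv Mix Columns evaluations per round (one of which depends only on the key), while the second needs a single one on the data path.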

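The Inv Mix Columns function is a multiplication over the Galois Field GF(2^8) between the operand described above, each of whose columns is treated as a polynomial with coefficients in GF(2^8), and a fixed polynomial (the product being taken modulo x^4 + 1, as specified in [2]) given by: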

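\[
a^{-1}(x) = \{0B\}x^{3} + \{0D\}x^{2} + \{09\}x + \{0E\}
\]

or

\[
\begin{bmatrix} s'_{0,c} \\ s'_{1,c} \\ s'_{2,c} \\ s'_{3,c} \end{bmatrix}
=
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
\begin{bmatrix} s_{0,c} \\ s_{1,c} \\ s_{2,c} \\ s_{3,c} \end{bmatrix},
\quad \text{for } 0 \le c < Nb.
\]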
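The polynomial form and the matrix form describe the same operation. A minimal Python sketch, assuming the reduction modulo x^4 + 1 from [2], compares the two on a single column; the function names are hypothetical and the test column is a commonly quoted Mix Columns example (db 13 53 45 maps to 8e 4d a1 bc), so applying the inverse should restore the original bytes.

```python
def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


# Coefficients a0..a3 of a^-1(x) = {0B}x^3 + {0D}x^2 + {09}x + {0E}.
A_INV = [0x0E, 0x09, 0x0D, 0x0B]

# The same coefficients arranged as the circulant matrix of the equation above.
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def inv_mix_column_poly(col: list[int]) -> list[int]:
    """Multiply s(x) by a^-1(x) modulo x^4 + 1, so exponents wrap around modulo 4."""
    out = [0, 0, 0, 0]
    for j, a in enumerate(A_INV):
        for k, s in enumerate(col):
            out[(j + k) % 4] ^= gmul(a, s)
    return out


def inv_mix_column_matrix(col: list[int]) -> list[int]:
    """Multiply the column by the 4x4 matrix over GF(2^8)."""
    out = []
    for row in INV_MIX:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gmul(coeff, byte)
        out.append(acc)
    return out


mixed_column = [0x8E, 0x4D, 0xA1, 0xBC]
restored = inv_mix_column_matrix(mixed_column)
print(bytes(restored).hex())  # prints: db135345
```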

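This implementation uses a decomposition of the Inv Mix Columns matrix into a product of two matrices:

\[
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
=
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix}
05 & 00 & 04 & 00 \\
00 & 05 & 00 & 04 \\
04 & 00 & 05 & 00 \\
00 & 04 & 00 & 05
\end{bmatrix}
\]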
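The decomposition can be verified by multiplying the two factors over GF(2^8). The Python sketch below is a check of the identity only; the names `MIX`, `PRE`, `INV_MIX` and `mat_mul` are chosen for this illustration and do not come from the paper.

```python
def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def mat_mul(a: list[list[int]], b: list[list[int]]) -> list[list[int]]:
    """4x4 matrix product where addition is XOR and multiplication is gmul."""
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for k in range(4):
                out[i][j] ^= gmul(a[i][k], b[k][j])
    return out


# Forward Mix Columns matrix.
MIX = [
    [0x02, 0x03, 0x01, 0x01],
    [0x01, 0x02, 0x03, 0x01],
    [0x01, 0x01, 0x02, 0x03],
    [0x03, 0x01, 0x01, 0x02],
]

# Sparse factor containing only the constants 04 and 05.
PRE = [
    [0x05, 0x00, 0x04, 0x00],
    [0x00, 0x05, 0x00, 0x04],
    [0x04, 0x00, 0x05, 0x00],
    [0x00, 0x04, 0x00, 0x05],
]

# Inv Mix Columns matrix.
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]

print(mat_mul(MIX, PRE) == INV_MIX)  # prints: True
```

A plausible motivation for this factorization, not stated in the paper, is that the constants 01 to 05 need fewer doubling steps in GF(2^8) than 09, 0B, 0D and 0E, and that the forward Mix Columns logic can be reused.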

The final round differs from the others in that it lacks the Inv Mix Columns operation. It therefore consists only of Inv Shift Rows, Inv Sub Bytes and Add Round Key. The result of this last round is the plaintext State, a matrix of 4 words, exactly like the input ciphertext State.
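The round structure described in this section (initial round, 9 full rounds, final round without Inv Mix Columns) can be put together in a compact software model. The sketch below is a reference model in Python, not the authors' VHDL design: all function names are hypothetical, the S-box is generated from its definition in [2] rather than stored as a table, and each round XORs the round key before applying Inv Mix Columns, which corresponds to the second ordering discussed above.

```python
def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list[int]:
    """S-box: multiplicative inverse in GF(2^8), then the affine map of [2]."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):  # XOR with the inverse rotated left by 1..4 bits
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [SBOX.index(v) for v in range(256)]
INV_MIX_ROW0 = (0x0E, 0x0B, 0x0D, 0x09)  # the other rows are rotations of this one


def expand_key(key: bytes) -> list[list[int]]:
    """Return the 11 AES-128 round keys (16 bytes each) in encryption order."""
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]  # RotWord, then SubWord
            temp[0] ^= rcon
            rcon = gmul(rcon, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [[b for w in words[r:r + 4] for b in w] for r in range(0, 44, 4)]


def inv_shift_rows(s: list[int]) -> list[int]:
    """Rotate row r right by r positions; byte (row r, column c) sits at index 4c + r."""
    return [s[4 * ((c - r) % 4) + r] for c in range(4) for r in range(4)]


def inv_mix_columns(s: list[int]) -> list[int]:
    """Multiply every column by the Inv Mix Columns matrix."""
    out = []
    for c in range(4):
        col = s[4 * c:4 * c + 4]
        for r in range(4):
            acc = 0
            for k in range(4):
                acc ^= gmul(INV_MIX_ROW0[(k - r) % 4], col[k])
            out.append(acc)
    return out


def decrypt_block(ciphertext: bytes, key: bytes) -> bytes:
    """AES-128 decryption of one 16-byte block."""
    round_keys = expand_key(key)[::-1]  # decryption consumes the keys in reverse order
    state = [c ^ k for c, k in zip(ciphertext, round_keys[0])]  # initial round
    for rk in round_keys[1:10]:  # 9 full rounds
        state = inv_shift_rows(state)
        state = [INV_SBOX[b] for b in state]
        state = [s ^ k for s, k in zip(state, rk)]
        state = inv_mix_columns(state)
    state = inv_shift_rows(state)  # final round: no Inv Mix Columns
    state = [INV_SBOX[b] for b in state]
    return bytes(s ^ k for s, k in zip(state, round_keys[10]))


# Example vector from [2]: key 00 01 ... 0f.
key = bytes(range(16))
plaintext = decrypt_block(bytes.fromhex("69c4e0d86a7b0430d8cdb78070b4c55a"), key)
print(plaintext.hex())  # prints: 00112233445566778899aabbccddeeff
```

Such a model is useful mainly as a golden reference: its output for a given key and ciphertext can be compared with the waveform produced by the hardware simulation.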

The inverse key schedule uses the same key expansion routine as encryption, which provides the AES-128 decryption algorithm with Nb (Nr + 1) words, namely 44 key words, or 11 round keys. The difference from encryption is that the round keys are used in reverse order. The other difference from the encryption key expansion is the reverse order in which the round constant (Rcon[i]) values are applied.
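The word count and the reversed key order can be illustrated with a word-oriented Python sketch of the AES-128 key expansion from [2]. The names used here are hypothetical, the S-box is computed from its definition instead of being stored, and the sketch simply expands the key in the forward direction and then reverses the list of round keys; it does not model how the authors' hardware handles the Rcon[i] values.

```python
NK, NB, NR = 4, 4, 10  # AES-128 parameters; larger keys need extra steps not shown here


def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list[int]:
    """S-box: multiplicative inverse in GF(2^8), then the affine map of [2]."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def rot_word(w: int) -> int:
    """Rotate a 32-bit word left by one byte."""
    return ((w << 8) | (w >> 24)) & 0xFFFFFFFF


def sub_word(w: int) -> int:
    """Apply the S-box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in w.to_bytes(4, "big")), "big")


def expand_key(key: bytes) -> list[int]:
    """Return the Nb * (Nr + 1) = 44 key words of AES-128."""
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(NK)]
    rcon = 1
    for i in range(NK, NB * (NR + 1)):
        temp = w[i - 1]
        if i % NK == 0:
            temp = sub_word(rot_word(temp)) ^ (rcon << 24)
            rcon = gmul(rcon, 2)  # next round constant
        w.append(w[i - NK] ^ temp)
    return w


# Example key from the key expansion appendix of [2].
words = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
round_keys = [words[r:r + NB] for r in range(0, len(words), NB)]
decryption_order = round_keys[::-1]  # the last round key is used first

print(len(words), len(round_keys))  # prints: 44 11
print(f"{words[4]:08x} {words[43]:08x}")  # prints: a0fafe17 b6630ca6
```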

3. FPGA IMPLEMENTATION OF AES DECRYPTION

Figure 4 - The occupied device resources (1st variant)

Figure 5 - The occupied device resources (2nd variant)

The main goal of this hardware implementation is not speed but fitting within the area and resource limits of a specific target FPGA device, the Xilinx Virtex-4 (model XC4VFX12 - FF668), a low-resource platform offering 5472 slices, 320 I/O buffers and 10944 4-input LUTs. A summary of the occupied resources is presented in Figure 4 and Figure 5.

The implementation is written in VHDL, a well-established hardware description language commonly used for FPGA design. The design and simulation software is Xilinx ISE 10.1. The decryption block is represented in Figure 6, where the main signals used by the implementation are shown.

Figure 6 - The AES Decryption Block (both variants)

The limitations of this device led to the use of 64-bit inputs, loaded in two consecutive steps: the first half on the LOW-HIGH transition and the second half on the HIGH-LOW transition, both for the input key and for the input ciphertext. Using 128-bit inputs would easily exceed the available I/O buffer resources. No such limitation is imposed on the output, which is a 128-bit plaintext.
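A rough pin budget shows why full-width inputs do not fit. The Python sketch below is a back-of-the-envelope estimate under stated assumptions: separate input buses for the key and the ciphertext, a 128-bit output bus, and five single-bit control signals (CLK, RESET, LOAD, BEGIN_DEC, END_DEC). The function name and the signal count are assumptions of this illustration, and the actual pin usage reported by the tools in Figures 4 and 5 may differ.

```python
IO_BUFFERS_AVAILABLE = 320  # I/O buffers of the target device, as stated in the text
CONTROL_SIGNALS = 5         # assumption: CLK, RESET, LOAD, BEGIN_DEC, END_DEC
PLAINTEXT_BUS = 128         # the output is a full 128-bit plaintext


def io_pins(key_bus: int, ciphertext_bus: int) -> int:
    """Estimate the pin count for given key and ciphertext bus widths."""
    return key_bus + ciphertext_bus + PLAINTEXT_BUS + CONTROL_SIGNALS


full_width = io_pins(128, 128)  # key and ciphertext each loaded in one step
half_width = io_pins(64, 64)    # key and ciphertext each loaded in two 64-bit halves

print(full_width, full_width <= IO_BUFFERS_AVAILABLE)  # prints: 389 False
print(half_width, half_width <= IO_BUFFERS_AVAILABLE)  # prints: 261 True
```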

From Figure 4 and Figure 5 it can be seen that the first implementation variant occupies more than 77 % (4241 slices) of the device's slices, whereas the second needs about 68 % (3771 slices). The number of occupied slice flip-flops is 2484 (22 %) in the first case and 2323 (21 %) in the second. The number of 4-input LUTs is 6166 (56 %) in the first case and 5470 (49 %) in the second. The second implementation is therefore more efficient than the first in terms of occupied device resources, but slower in terms of the achieved operating frequency (see the next section).
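The percentages quoted above follow directly from the device totals. The short Python check below recomputes them; it assumes that the device offers as many slice flip-flops as 4-input LUTs (10944) and that the reported percentages are truncated rather than rounded, both of which are inferences from the quoted figures and not statements from the paper.

```python
TOTAL_SLICES = 5472
TOTAL_LUTS = 10944
TOTAL_FLIP_FLOPS = 10944  # assumption: same count as the 4-input LUTs

# (used, available) pairs taken from the text, per implementation variant.
usage = {
    "variant 1": {"slices": (4241, TOTAL_SLICES),
                  "flip-flops": (2484, TOTAL_FLIP_FLOPS),
                  "LUTs": (6166, TOTAL_LUTS)},
    "variant 2": {"slices": (3771, TOTAL_SLICES),
                  "flip-flops": (2323, TOTAL_FLIP_FLOPS),
                  "LUTs": (5470, TOTAL_LUTS)},
}


def percent(used: int, available: int) -> int:
    """Utilization in percent, truncated to an integer."""
    return 100 * used // available


report = {name: {res: percent(*pair) for res, pair in resources.items()}
          for name, resources in usage.items()}

for name, row in report.items():
    print(name, row)

slices_saved = usage["variant 1"]["slices"][0] - usage["variant 2"]["slices"][0]
print("slices saved by variant 2:", slices_saved)  # prints: slices saved by variant 2: 470
```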

In both variants of the decryption implementation, because decryption uses the same key generation mechanism as encryption, the key expansion routine is executed before the decryption itself. This is necessary because generating the same round keys as in the encryption case synchronously with the corresponding rounds, but in reverse order, was not possible in the manner described in [2].

The main signals are: the system clock (CLK), the system reset (RESET), the LOAD signal, which loads the key and the ciphertext in the initial round, and BEGIN_DEC/END_DEC, which mark the start and the end of the decryption process. The loading process described above is immediately followed by the BEGIN_DEC signal, which starts the decryption. The entire decryption process takes exactly 12 clock periods from the HIGH-LOW transition of the LOAD signal. The signal diagrams are shown in the simulation section, in Figure 7 and Figure 8.

As in the encryption case, this implementation refers to the Electronic Code Book (ECB) mode of operation for AES decryption.
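In ECB mode every 128-bit block is decrypted independently with the same key, so a single-block decryption core is all that is needed. The Python sketch below shows only this block-by-block structure: `ecb_decrypt` is a hypothetical helper, and `toy_block` is a stand-in transformation used for demonstration, not AES.

```python
from typing import Callable

BLOCK_SIZE = 16  # bytes, i.e. one 128-bit AES block


def ecb_decrypt(ciphertext: bytes, decrypt_block: Callable[[bytes], bytes]) -> bytes:
    """Decrypt a message block by block; no state is carried between blocks."""
    if len(ciphertext) % BLOCK_SIZE != 0:
        raise ValueError("ECB input must be a whole number of 16-byte blocks")
    blocks = (ciphertext[i:i + BLOCK_SIZE]
              for i in range(0, len(ciphertext), BLOCK_SIZE))
    return b"".join(decrypt_block(block) for block in blocks)


def toy_block(block: bytes) -> bytes:
    """Placeholder for the single-block decryption core (NOT a real cipher)."""
    return bytes(b ^ 0xA5 for b in block)


message = bytes(range(16)) * 2 + bytes(16)  # blocks 1 and 2 are identical
recovered = ecb_decrypt(message, toy_block)

# Identical ciphertext blocks always yield identical plaintext blocks in ECB.
print(recovered[0:16] == recovered[16:32])  # prints: True
```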

4. SIMULATION & SECURITY ASPECTS

The detailed diagrams of the simulation of the AES implementation are presented below, in Figure 7 and Figure 8. The total duration of the decryption process is 4600 ns, or 4.6 µs, which corresponds to approximately 28 Mb/s for one 128-bit block, calculated as described in the previous section. The resulting maximum clock frequency is 172.476 MHz ≈ 172.5 MHz (a clock period of 5.798 ns) for the first variant and 159.808 MHz ≈ 160 MHz (a clock period of 6.258 ns) for the second. The simulations used a sequence of 128 ‘0’s for both the input ciphertext and the secret input key. The output sequence shown in Figure 7 and Figure 8 is ‘140F0F1011B5223D79587717FFD9EC3A’ (in hexadecimal).
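The throughput and clock period figures quoted above can be reproduced with simple arithmetic. The Python sketch below uses only numbers stated in the text; the variable and function names are chosen for this illustration.

```python
BLOCK_BITS = 128         # one AES block
DURATION_NS = 4600.0     # total duration of one decryption, from the simulation

# Throughput in Mb/s: bits per nanosecond multiplied by 1000.
throughput_mbps = BLOCK_BITS / DURATION_NS * 1e3


def period_ns(frequency_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1e3 / frequency_mhz


period_fast = period_ns(172.476)  # higher of the two reported maximum frequencies
period_slow = period_ns(159.808)  # lower of the two reported maximum frequencies

print(f"throughput: {throughput_mbps:.1f} Mb/s")  # prints: throughput: 27.8 Mb/s
print(f"periods: {period_fast:.3f} ns, {period_slow:.3f} ns")
```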

Note that the END_DEC signal announces the final result of the decryption: when END_DEC = ‘1’, the plaintext is available on the output.

The evaluation of the security provided by this implementation must take into account the security requirements for cryptographic modules stipulated in [4]. In fact, the security of the cryptographic module rests on the security provided by the cryptographic algorithm itself. In [5], a comprehensive evaluation of the 5 finalists of the NIST international contest concludes that, using S-boxes as non-linear components, ‘Rijndael appears to have an adequate security margin, but has received some criticism suggesting that its mathematical structure may lead to attacks. On the other hand, the simple structure may have facilitated its security analysis during the timeframe of the AES development process.’ Even though some criticisms were raised (e.g. ‘the key schedule does not have high diffusion’), Rijndael was highly regarded for the level of security it provides. Regarding attacks on implementations, [5] remarks that ‘the operations used by Rijndael are among the easiest to defend against power and timing attacks.’

Figure 7 - Simulation of the decryption block (1st variant)

Figure 8 - Simulation of the decryption block (2nd variant)

5. CONCLUSIONS

This paper has presented a brief description of the implementation of the AES block decryption algorithm, underlining the benefits of this modern design concept. An FPGA implementation of a decryption algorithm is a cryptographic module whose hardware structure is defined in software.

FPGA implementations increase flexibility, lower costs and shorten the time needed to release enhanced cryptographic equipment, while providing a satisfactory level of security for communication applications and other electronic data transfer processes where security is needed.

REFERENCES

[1] Wenbo Mao, ‘Modern Cryptography. Theory And Practice’, Prentice Hall PTR, ISBN: 0-13-066943-1, U.S.A., 2003.

[2] Federal Information Processing Standards, ‘FIPS PUB 197 – Announcing the Advanced Encryption Standard (AES)’, U.S.A., 2001.

[3] Francisco Rodriguez-Henriquez, Nazar Abbas Saqib, Cetin Kaya Koc, ‘Cryptographic Algorithms on Reconfigurable Hardware’, Springer Science+Business Media, ISBN: 0-387-33883-7, U.S.A., 2006.

[4] Federal Information Processing Standards, ‘FIPS PUB 140-3 - Security Requirements For Cryptographic Modules (Draft)’, U.S.A., 2007.

[5] National Institute of Standards and Technology, ‘Report on the Development of the Advanced Encryption Standard (AES)’, 2000.