Tamper proof certification system based on secure non-volatile FPGAs
Diogo Alcoforado da Gama de Oliveira Parrinha
Thesis to obtain the Master of Science Degree in
Electrical and Computer Engineering
Supervisor(s): Prof. Ricardo Jorge Fernandes Chaves, Prof. Leonel Augusto Pires Seabra de Sousa
Examination Committee
Chairperson: Prof. Gonçalo Nuno Gomes Tavares
Supervisor: Prof. Ricardo Jorge Fernandes Chaves
Member of the Committee: Prof. Fernando Manuel Duarte Gonçalves
November 2017
Acknowledgments
I would like to start by thanking the constant support from my family and everything they did for me,
which allowed me to close this chapter of my life. Without them, this would have been much harder. A
special thanks to my mother Marina and my father Ricardo.
Throughout the years I spent in IST, I have enjoyed working with a lot of people, from colleagues
to professors. I have made some great friends and I am happy to realize that we have spent amazing
moments together. However, I would like to offer a particular thanks to Diogo Prata for being a good
friend throughout the degree and for overcoming many common adversities together.
Finally, I would like to extend my sincere thanks to my supervisor Prof. Ricardo Chaves, for his con-
tinuous support and guidance throughout this project. His technical expertise and constant motivation
have helped me to conclude this thesis.
May this be the start of a new beginning.
Thank you!
Resumo
FPGA-supported embedded systems play an increasingly important role in critical and security systems.
A particular example of such systems is the Hardware Security Module (HSM), which provides
management and usage of private keys in a secure and reliable way. However, commercially available
systems are too expensive and limited in the functionality they provide. On the other hand, the
volatile FPGA-based solutions that exist to date are not suitable for building a Hardware Security
Module, as they lack the necessary security characteristics, such as anti-tampering features, secure
internal key management and the ability to prevent cloning. In this work, an open-source, low-cost,
reconfigurable and highly flexible HSM is proposed. The system is supported by a System-on-Chip that
contains a non-volatile FPGA, with several security services and characteristics. The presented
solution operates as a versatile certification system, capable of providing secure key management and
digital signatures and of issuing trustworthy digital certificates, supporting a PKCS#11 interface
with additional functions. To better illustrate the flexibility of the proposed solution, a use case
named Log-Chain is also proposed and implemented. The Log-Chain consists of a chain of logs that can
be incremented and verified, but cannot be modified or repudiated. The experimental results suggest
that the system can compute up to 2 signature/certification operations per second, with a low-cost,
adaptable and secure approach.
Keywords: Non-volatile FPGA, Hardware Security Module, Certification System, Microsemi Smartfusion2 SoC
Abstract
Embedded systems supported by FPGAs are playing an increasingly important role in safety-critical areas.
A particular example of such safety-critical systems is the Hardware Security Module (HSM), which
provides private key management and usage in a secure and reliable way. However, commercially available
systems are too expensive and limited in the provided functionality. On the other hand, existing volatile
FPGA solutions do not adequately provide the needed security characteristics, such as anti-tampering
features, secure internal key management and anti-cloning capabilities. Herein, an open-source,
low-cost and highly flexible reconfigurable HSM is proposed, supported by a System-on-Chip with a
non-volatile FPGA that contains several security characteristics and services. The presented solution
operates as a versatile certification system that provides secure key management and digital signature
services, and is able to issue trustworthy certificates, using an extended PKCS#11 interface. To further
illustrate the flexibility of the proposed solution, a Log-Chain certification use case is also presented,
which consists of a chain of logs that can be incremented and verified, but cannot be repudiated or modified.
Experimental results suggest that the system is able to compute up to 2 sign/certification operations per
second with a low-cost, adaptable, and secure approach.
Keywords: Non-volatile FPGA, Hardware Security Module, Certification System, Microsemi Smartfusion2 SoC
Contents
Acknowledgments
Resumo
Abstract
List of Tables
List of Figures
List of Acronyms
1 Introduction
1.1 Objectives and Requirements
1.2 Main contributions
1.3 Thesis Outline
2 Background
2.1 Cryptographic Services and Mechanisms
2.1.1 Symmetric Key Cryptography
2.1.2 Asymmetric Key Cryptography
2.1.3 Hashing Function
2.1.4 Secret Key Establishment
2.1.5 Digital Signatures
2.1.6 Key Certification and PKI
2.1.7 Physically Unclonable Function
2.2 Secure Computing Platforms
2.3 Implementation Technologies
2.4 Smartfusion2 SoC
2.4.1 Device Description
2.4.2 Security Features
2.5 Summary
3 State of the Art
3.1 FPGA as Secure Platform
3.2 Key Generation and Storage
3.3 Full Security Systems
3.4 Discussion
4 Proposed Solution
4.1 Users and Key Management
4.2 Communication and Session Establishment
4.3 Log-Chain
4.4 Conclusions
5 Implementation
5.1 Device Configuration and Setup
5.2 Cryptographic Operations
5.3 Key Generation and Management
5.4 Memory
5.5 Log-Chain
5.6 Communication Channel
5.7 Middleware
5.8 Simple Time Service
5.9 Conclusions
6 Results
6.1 Cryptographic Operations
6.1.1 SHA-256
6.1.2 AES-256
6.1.3 EC Scalar Multiplication
6.2 System Operations
6.3 Communication Channel
6.4 Comparison with the State of the Art
6.5 Conclusions
7 Conclusions
7.1 Future Work
Bibliography
A Communication Protocol
List of Tables
2.1 X.509v3 certificate fields.
2.2 Single-threaded performance (signatures/second) for different HSMs [12].
2.3 HSM Key Storage capacity [12].
2.4 Protection mechanisms for FPGA configuration data.
2.5 Key Features for Secure Hardware [4].
3.1 Comparison of Security Features of the different system proposals.
4.1 Key generation and storage.
4.2 Secure session establishment.
4.3 Available Device commands.
5.1 Non-volatile memory usage requirements for the implemented system.
5.2 Additional API functions.
5.3 Supported official PKCS#11 API functions.
6.1 Operation times for the three SHA-256 implementations.
6.2 Operation times for the two AES-256 implementations.
6.3 Operation times for the three versions conceived.
6.4 Operation times for the three versions conceived.
List of Figures
2.1 An example of an elliptic curve. Example equation: $y^2 = x^3 + ax + b$
2.2 SmartFusion2 SoC FPGA Block Diagram [39].
2.3 Detailed security and settings model diagram [41]. The green segments in the middle are stored in non-volatile memory. The COMBLK performs the communication between the MSS (software) and security services (System Controller).
3.1 FPGA as a Trusted Machine [7].
3.2 Setup: TA generates a public/private RSA key pair and transfers the private key privatekf into the bootstrapping binary (in blue: data is encrypted so it doesn’t need to go through a secure channel) [11].
3.3 Block diagram of the Amuet architecture [10]. The embedded application is actually the user application, protected by the proposed wrapping system.
4.1 The proposed overall system architecture. The light blue rectangle represents the secure System-on-Chip. The contents of the external Flash are encrypted.
4.2 Log file example. Each horizontal line represents a line break. Hashes and signatures are Base 64 encoded.
4.3 Structure of a Log-Chain. Each hash is computed using the previous hash of the log. The first hash is defined by the device administrator (UID=0) and set as the root hash value.
4.4 The structure of a log chain with grouped log entries.
4.5 The structure of log folders and their files. The current year and month folders are highlighted in dark grey and the current day log file is highlighted in yellow.
5.1 The first system architecture, which uses the mbedTLS algorithms to perform cryptographic operations. The unused modules are greyed out.
5.2 The second system architecture, which uses the SoC embedded cores for additional security and possible performance. The unused modules are greyed out.
5.3 The third system architecture, which uses the FPGA to accelerate the SHA-256 algorithm. The unused modules are greyed out.
5.4 SRAM-PUF core example for Key Code 2 and 3.
5.5 Development mode internal memory map.
5.6 Production mode internal memory map.
5.7 The scheme for time synchronization via STS.
6.1 SHA-256 throughput for the three tested implementations.
6.2 AES-256 throughput for the two tested implementations.
6.3 System operation times for the three tested implementations.
6.4 Throughputs for open-channel and secure-channel communications.
A.1 Flowchart describing the process of receiving a message through the created communication protocol.
Acronyms
AES Advanced Encryption Standard.
ASIC Application-Specific Integrated Circuit.
DPA Differential Power Analysis.
EC Elliptic Curve.
ECC Elliptic Curve Cryptography.
ECDH Elliptic Curve Diffie-Hellman.
ECDSA Elliptic Curve Digital Signature Algorithm.
ECIES Elliptic Curve Integrated Encryption Scheme.
eNVM embedded Non-Volatile Memory.
eSRAM embedded Static Random Access Memory.
FPGA Field-Programmable Gate Array.
HMAC Hash-based Message Authentication Code.
HSM Hardware Security Module.
IV Initialization Vector.
MSS Microcontroller Subsystem.
NTP Network Time Protocol.
PC Personal Computer.
PKI Public Key Infrastructure.
PUF Physically Unclonable Function.
RSA Rivest-Shamir-Adleman.
SoC System-on-Chip.
STS Simple Time Service.
TPM Trusted Platform Module.
TRNG True Random Number Generator.
UID User Identification.
Chapter 1
Introduction
Modern reconfigurable systems, such as Field-Programmable Gate Arrays (FPGAs), provide increasing
programming possibilities, high flexibility and growing hardware capabilities. For these reasons, these
devices are used in an expanding variety of application areas, such as data centers, medical, aerospace,
defense, security, transportation and automotive [1]. At the same time, the increasing need for data
protection and system reliability, especially in safety-critical systems, has pushed FPGA manufacturers
to develop more secure and reliable devices, rather than focusing solely on power consumption and system
performance.
FPGAs are low-cost general-purpose devices that provide high flexibility and performance. They are
composed of an array of configurable logic blocks, connected through programmable interconnections.
Their configuration is usually described using a hardware description language, such as VHDL or Verilog,
and the device can be configured for the desired application after manufacturing.
FPGAs can be divided into four categories depending on how their configuration is stored:
SRAM-based, SRAM-based with internal flash, Flash-based and Antifuse-based. SRAM-based FPGAs
store the configuration data of their logic cells in static memory cells. Since SRAM is volatile, these
FPGAs must be reprogrammed on each start, reading the configuration from an external source
(e.g. Flash memory) when the device is booted. In SRAM-based FPGAs with internal flash, the bitstream
is stored internally, which prevents unauthorized bitstream copying. Most modern volatile FPGAs come
with a secure boot process, in which the device attempts to load the binary from the configuration
memory when powered on. The binary is decrypted and authenticated using the onboard dedicated
decryption logic and the AES key programmed by the hardware manufacturer. This key can only be read
by the internal decryption logic and cannot be accessed from the outside. If the configuration bitstream
is not authenticated, the device enters an error state and will not function until provided with a valid
bitstream.
On the other hand, Flash-based FPGAs use an internal flash memory for configuration storage,
rather than static memory cells. Non-volatile FPGAs provide higher security, faster logic availability
after power-on and, of course, non-volatile storage, which is of key importance for safety-critical
applications [2, 3]. Furthermore, non-volatile FPGAs tend to consume less power and are more tolerant
to radiation effects. Because they are non-volatile, the bitstream is not at risk of being probed during
start-up [3]. Finally, Antifuse-based FPGAs are configured through “fuse-burning”, which means they can
only be programmed once.
Existing commercial security-oriented devices provide cryptographic operations and secure key
management either with adequate performance but at a high cost and with low flexibility (e.g. a Hardware
Security Module), or at a lower price and with higher flexibility but with low performance, due to limited
computational power and memory storage (e.g. a SmartCard). Unlike these, new FPGA technologies are
starting to provide great flexibility and security at a low price [4, 5, 3], allowing for the creation of
cheaper systems that resemble a Hardware Security Module (HSM) while offering much greater flexibility
and easy reconfiguration. Moreover, FPGAs are being integrated in System-on-Chip (SoC) designs, merging
embedded CPUs, memories and security modules with the FPGA fabric, allowing for the creation of more
robust and security-oriented systems.
In recent years, several authors have proposed FPGA-based architectures as secure computing
platforms [6, 7], as well as PUF-based key generation and re-keying mechanisms on FPGAs [8, 9].
Moreover, full systems have been proposed that use FPGAs to perform security operations for
safety-critical applications, such as a Secure Application Wrapper that performs secure system
authentication and data transfer with an external memory [10], and an FPGA-based architecture that
allows users to offload sensitive computations to the cloud [11].
However, the State of the Art solutions cannot be used as Hardware Security Modules, as they solve
very specific problems and do not meet certain requirements that are mandatory for an HSM [12],
such as anti-cloning mechanisms (e.g. PUF-based key generation), secure communication channels,
internal non-volatile memories for master key storage, anti-tamper mechanisms, internal clock freshness
(e.g. through a Timestamping Authority) and a common developer interface, such as PKCS#11.
Furthermore, these works use SRAM-based FPGAs, which are subject to several attacks, such as probing
when the configuration bitstream is loaded at boot time [3, 5]. Non-volatile FPGA technologies, in
contrast, provide lower power consumption and faster boot times, and do not need to be re-configured on
each power-on.
1.1 Objectives and Requirements
The main objective of this work is to create a re-configurable and flexible HSM, supported by a low-cost
non-volatile security-oriented FPGA, as opposed to the State of the Art. The low price implies certain
hardware limitations, such as reduced internal memory storage and short endurance, meaning
that it is not possible to install an Operating System that is stored and executed only inside the device.
Moreover, the expected low CPU processing speed suggests a relatively low performance, while the
lack of an internal battery implies that an internal Real Time Clock cannot be relied on, unless it is
securely initialized. Therefore, this work aims to overcome these limitations, providing a fully
working solution that tries to compete with existing commercial ones in terms of security features and
standards, while maintaining the same flexibility and low cost as the academic re-configurable proposals.
The requirements for the considered solution, in order to achieve the aforementioned objectives are
as follows:
1. Secure key management and storage should be guaranteed through the use of a device-dependent
key generation mechanism, enhancing the anti-cloning characteristics. Additionally, internal key
storage should be supported.
2. The selected device should guarantee a secure boot process.
3. As the system may be used under insecure environments, the system should be capable of estab-
lishing a secure channel with the PC, guaranteeing confidentiality, integrity, freshness and authen-
tication of the exchanged messages.
4. Internal clock freshness and synchronization should be maintained with the outside world through
the use of a reliable external time provider.
5. Externally stored data should be properly protected.
6. The selected device should have several tamper detection mechanisms and anti-tamper protection
features.
7. The developed system should be flexible, open-source and be available at a low cost.
1.2 Main contributions
In order to achieve the objectives of the work, a solution was proposed and implemented, which satisfies
the requirements highlighted above. The proposed solution, considering the Smartfusion2 SoC as the
supporting non-volatile technology, consists of creating an open-source Hardware Security Module sup-
ported by a reconfigurable technology, as opposed to existing commercial solutions. The system itself
consists of a low-cost tamper-proof and unclonable secure certification system capable of generating
and managing keys securely, while still providing high flexibility and adaptability.
The system has the ability to issue digital certificates and generate key pairs for its users, as well
as generate digital signatures upon request by an authenticated user. As the system provides high
flexibility and because of the existing need for a secure logging system, a complementary novel feature
is proposed, which consists of creating a non-repudiable and certified chain-of-logs with the secure
computation system (e.g. for Linux Syslog messages, Transaction logs or Medical Receipts [13]).
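The chaining idea behind such a chain-of-logs can be sketched as follows. This is only a minimal illustration under stated assumptions: it relies on the description that each hash is computed from the previous hash of the log, starting from a root hash value set by the device administrator. The use of SHA-256, the byte concatenation layout and the names `next_hash`, `build_chain` and `verify_chain` are hypothetical choices for this sketch, and the digital signatures that certify the entries are deliberately left out.

```python
import hashlib


def next_hash(previous_hash: bytes, entry: str) -> bytes:
    """Hash a log entry together with the hash that precedes it (assumed layout)."""
    return hashlib.sha256(previous_hash + entry.encode("utf-8")).digest()


def build_chain(root_hash: bytes, entries):
    """Return a list of (entry, hash) pairs, each hash depending on the previous one."""
    chain = []
    current = root_hash
    for entry in entries:
        current = next_hash(current, entry)
        chain.append((entry, current))
    return chain


def verify_chain(root_hash: bytes, chain) -> bool:
    """Recompute every hash from the root; any modified entry breaks the chain."""
    current = root_hash
    for entry, stored_hash in chain:
        current = next_hash(current, entry)
        if current != stored_hash:
            return False
    return True


# Hypothetical root value; in the described system it is set by the administrator.
root = hashlib.sha256(b"example root value").digest()
log_chain = build_chain(root, ["user login", "key generated", "document signed"])
chain_is_valid = verify_chain(root, log_chain)

# Changing one entry without recomputing the following hashes is detected.
tampered_chain = list(log_chain)
tampered_chain[1] = ("key deleted", log_chain[1][1])
tampered_is_valid = verify_chain(root, tampered_chain)
```

Because every hash covers its predecessor, altering or removing an entry invalidates all later hashes, which is what makes the chain verifiable as a whole.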
Regarding the existing State of the Art [7, 9, 8, 10, 11], the proposed solution contributes with im-
proved key management supported by a PUF-based mechanism, secure and authenticated external
data storage, a secure communication channel that assures confidentiality, integrity and authentication
of exchanged data, as well as the ability for developers to integrate applications with the system through
an extended PKCS#11 interface. Moreover, the proposed solution considers the use of a non-volatile
device as opposed to a volatile one, and more specifically, a security-oriented device that contains several
characteristics that make it possible to create a flexible and reconfigurable Hardware Security Module
at a low cost.
Additionally, this work also contributes with a thorough analysis of the cryptographic operations
performed by the device’s embedded cores, such as AES-256, SHA-256 and Elliptic Curve scalar
multiplication. The results show that the performance of the system is primarily influenced by the Elliptic
Curve scalar multiplication operation, with 70% to 95% of the operation time being spent on scalar
multiplications. Additionally, a SHA-256 core was deployed in the FPGA fabric to understand the
performance impact over the existing embedded cores and the software implementation. The conducted tests
suggest that the FPGA-accelerated version of SHA-256 is faster than both the device's embedded SHA-256
core and the software-based implementation, while consuming only 5% of the FPGA fabric. On the other
hand, the software-based version of AES-256 is faster than the embedded AES core provided by the
device. Overall, the system is able to perform up to 2 signature/certification operations per second, on
a non-volatile device, at a much lower cost than existing commercial HSMs, while providing the needed
security and reliability features.
An article discussing the different implementations of the proposed solution and their results has
been submitted to and accepted at the International Conference on Reconfigurable Computing and FPGAs
(ReConFig 2017).
• Diogo Parrinha and Ricardo Chaves, "Flexible and Low-Cost HSM based on Non-Volatile FPGAs",
International Conference on Reconfigurable Computing and FPGAs (ReConFig’17), September
2017.
1.3 Thesis Outline
The thesis is organized as follows. In Chapter 2, a background study on cryptographic services is
provided, along with a review of existing secure computing platforms and their implementation technologies.
Afterwards, in Chapter 3, the relevant State of the Art is presented, with a major focus on secure FPGA
computing platforms, including a comparative analysis of the various solutions. Chapters 4 and 5 detail
the proposed solution and the resulting implementation, respectively. The result analysis and
performance comparison are presented in Chapter 6. Finally, Chapter 7 concludes this document with some
final remarks and future work directions.
Chapter 2
Background
In this chapter, a brief introduction to the concepts used throughout the dissertation is provided,
which includes the presentation of cryptographic services and mechanisms, such as symmetric and
asymmetric cryptography, Physically Unclonable Function (PUF), Public Key Infrastructure (PKI), key
exchange protocols (such as ECDH) and data signing mechanisms (such as ECDSA). Furthermore,
several Secure Computing Platforms (Smart Card, TPM, HSM) and Implementation Technologies (ASIC,
FPGA) are introduced and compared, along with a thorough description of the Microsemi Smartfusion2
SoC.
2.1 Cryptographic Services and Mechanisms
Cryptographic services and mechanisms include symmetric cryptography (AES), asymmetric cryptography
(RSA, ECC-based), hashing functions (SHA-256), key exchange protocols (ECDH) and data signing
algorithms (ECDSA), as well as Public Key Infrastructures. Symmetric cryptography involves the use of
a common key shared between multiple parties, and is faster than asymmetric cryptography. The latter
involves a pair of keys (public and private) and is usually used to provide authentication or encryption
between two parties, giving the ability to ensure non-repudiation¹ if used correctly. These keys can be
used in key exchange protocols, in which two parties establish a symmetric key for secure communication,
or in data signing algorithms, which allow a user to sign a piece of data using a private
key and another user to validate it using the signer’s public key. Since public keys must be published
and verified in a trusted manner, the PKI, a framework for managing the digital certificates that bind
public keys to users, is also addressed. The PKI provides mechanisms for distributing and verifying
public keys, and for revoking them when their private counterpart is compromised.
2.1.1 Symmetric Key Cryptography
Symmetric Key Cryptography comprises algorithms that cipher and decipher data using the same
cryptographic key. This key represents a shared secret between two or more parties,
which is used to maintain a secure and private link. Although faster than Asymmetric Key Cryptography,
this approach requires that all parties share the same secret.

¹ The ability to ensure that a party to a contract cannot deny the authorship of a document.
Currently, the main symmetric encryption standard is the Advanced Encryption Standard (AES), an
iterative symmetric block cipher with a 128-bit block size that supports key sizes of 128, 192 or 256 bits,
using 10, 12 or 14 rounds respectively [14]. A round consists of multiple processing steps, including
substitution, transposition and mixing of the plaintext, which transform it into the final output, i.e. the
ciphertext [14]. AES can be used with different block cipher modes of operation, which include ECB, CBC,
CFB, OFB and CTR [15]. Since CFB, OFB and CTR allow data to be encrypted bit by bit, they can be used as
stream ciphers, in which a cryptographic key and algorithm are applied to each individual bit of the
plaintext. Usually, the cipher modes require an Initialization Vector (IV) that is mixed with
the data to achieve semantic security². The IV is a fixed-size value that should be non-repeating
and randomly generated.
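The way a counter-based mode turns a keyed primitive and an IV into a stream cipher can be sketched as follows. This is a minimal illustration and not AES: because Python's standard library has no AES primitive, HMAC-SHA-256 is swapped in as a stand-in for the block cipher. The names `keystream_block` and `ctr_xor`, the 8-byte counter encoding and the 32-byte keystream block are assumptions made for this sketch only.

```python
import hashlib
import hmac
import os

BLOCK_SIZE = 32  # bytes of keystream produced per counter value (SHA-256 output size)


def keystream_block(key: bytes, iv: bytes, counter: int) -> bytes:
    """Derive one keystream block from the key, the IV and a block counter.

    Stand-in PRF: a real CTR implementation would encrypt the counter block
    with AES here instead of using HMAC-SHA-256.
    """
    return hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()


def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    output = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        block = keystream_block(key, iv, offset // BLOCK_SIZE)
        chunk = data[offset:offset + BLOCK_SIZE]
        output.extend(byte ^ pad for byte, pad in zip(chunk, block))
    return bytes(output)


key = os.urandom(32)  # 256-bit secret key shared by both parties
iv = os.urandom(16)   # fresh random IV, never reused with the same key
message = b"The plaintext can have any length, not only a multiple of the block size."
ciphertext = ctr_xor(key, iv, message)
decrypted = ctr_xor(key, iv, ciphertext)
```

Since encryption is a plain XOR with the keystream, reusing an IV with the same key would reuse the keystream and leak the XOR of the two plaintexts, which is why the IV has to be non-repeating.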
2.1.2 Asymmetric Key Cryptography
Asymmetric Key Cryptography, also known as Public Key Cryptography, includes any system that uses
a pair of keys: a public key and a private key. The private key is only known to or usable by the owner,
while the public key can be known to everyone. This provides two possible features: authentication,
in which someone uses the public key to verify the sender of a message, and confidentiality, in which
someone uses the public key to ensure that only the owner of the private key is able to decipher
the message.
Rivest-Shamir-Adleman (RSA) is one of the oldest but most widely used public key cryptography
algorithms. It is based on the assumption that factoring the product of two large prime numbers is a
computationally hard task. This means that, for sufficiently large keys, an attacker cannot obtain the
private key from the public key with any realistic amount of computational resources and time.
To create a public and a private key, it is first necessary to generate two different random prime numbers
$p$ and $q$. Then, compute $n = p \cdot q$ and Euler's totient $\varphi(n) = (p-1)(q-1)$.
With these, choose a random integer $e$ such that $1 < e < \varphi(n)$ and $\gcd(e, \varphi(n)) = 1$
(so that $e$ is invertible modulo $\varphi(n)$), and finally compute the integer $d$ that satisfies
$e \cdot d \equiv 1 \pmod{\varphi(n)}$. The public key is $(e, n)$ and the private key is $(d, n)$.
The cipher-text $c$ of the message $m$ is computed with the public key using (2.1) and decrypted with
the private key using (2.2):
$$c \equiv m^{e} \bmod n \qquad (2.1)$$
$$m \equiv c^{d} \bmod n \qquad (2.2)$$
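The key generation steps and equations (2.1) and (2.2) can be checked with a small worked example. The sketch below uses deliberately tiny primes (an assumption made purely for readability) and no padding, so it is not secure; the helper names `encrypt` and `decrypt` are illustrative.

```python
from math import gcd

# Toy parameters: real RSA uses primes of at least 1024 bits each.
p, q = 61, 53
n = p * q                    # 3233
phi = (p - 1) * (q - 1)      # 3120

e = 17
assert 1 < e < phi and gcd(e, phi) == 1   # e must be invertible modulo phi(n)
d = pow(e, -1, phi)                       # modular inverse (Python 3.8+), d = 2753
assert (e * d) % phi == 1


def encrypt(m: int) -> int:
    """Equation (2.1): c = m^e mod n, using the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Equation (2.2): m = c^d mod n, using the private key (d, n)."""
    return pow(c, d, n)


m = 65
c = encrypt(m)
assert decrypt(c) == m
```

The message must satisfy $0 \le m < n$, which is one reason why RSA is normally used to protect short values such as symmetric keys.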
An alternative to RSA is Elliptic Curve Cryptography (ECC), based on the mathematics of elliptic
curves (see Figure 2.1) over finite fields, which can be applied to encryption, digital signatures and
pseudo-random generators. An elliptic curve is represented as a looping line intersecting two axes, and ECC
hinges on a particular type of equation created from a mathematical group derived from points where
the line intersects those axes [16] [17]. By multiplying a point on the curve by a number, another point
on the curve is obtained. However, it is computationally infeasible to find which number was used, even
if the original point and the result are known. An example equation of a possible curve is $y^2 = x^3 + ax + b$.
Figure 2.1: An example of an elliptic curve. Example equation: $y^2 = x^3 + ax + b$
² If an attacker possesses the ciphertext of a message and the message's length, it cannot determine any partial information on the message with higher probability than if it only possesses the message length.
When using elliptic curves for public key encryption, a public and private key pair must be generated:
d and Q, in which d (random integer chosen from {1, ..., n − 1} where n is the order of the subgroup)
represents the private key and Q its public counterpart, that is generated from Q = dG (where G is the
base point of the subgroup). The definition and selection of the curve parameters, such as base point G,
elliptic curve coefficients (a,b) and order of the subgroup n, goes beyond the scope of this work, but each
curve has its own parameters and, for many curves, they are defined in FIPS 186-4 [18]. Most importantly,
ECC uses two main operations: point addition and point multiplication, in which the former involves the
public key on most algorithms while the latter involves mostly the private key. These can then be used
to establish a common shared secret between two parties and to sign data using the private key.
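Point addition, point multiplication and the generation of a key pair $(d, Q)$ with $Q = dG$ can be sketched on a toy curve. The curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with base point $G = (5, 1)$ of order $n = 19$ is an assumption chosen for readability and is not one of the FIPS 186-4 curves; the function names are illustrative and the code is not constant-time.

```python
import secrets

# Toy domain parameters: y^2 = x^3 + 2x + 2 over F_17, G = (5, 1), order n = 19.
P_MOD, A, B = 17, 2, 2
G = (5, 1)
N = 19
INF = None  # point at infinity (identity element)


def on_curve(pt) -> bool:
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def point_add(p1, p2):
    """Group law: chord rule for distinct points, tangent rule for doubling."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                                   # P + (-P) = infinity
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)


def scalar_mult(k: int, pt):
    """Double-and-add point multiplication k * pt."""
    result, addend = INF, pt
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


d = secrets.randbelow(N - 1) + 1   # private key in {1, ..., n-1}
Q = scalar_mult(d, G)              # public key Q = dG
assert on_curve(Q)
```

On a real curve the subgroup order has roughly 256 bits, so recovering $d$ from $Q$ by trying every multiple of $G$, which is trivial here, becomes infeasible.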
Moreover, ECC is considered to be faster than RSA and its keys have a shorter length than its RSA
equivalent for the same security level [16] [17]. This also makes ECC more suitable for embedded
systems and systems with lower performance and memory capacity.
2.1.3 Hashing Function
A hashing function maps an arbitrarily long message to a fixed-size hash value. Ideally, it has four
main properties: hashes are quickly generated; it is infeasible to recover a message from its hash value
except by trying all possibilities; a small change in the message should change the hash extensively; it is
infeasible to find two different messages with the same hash value (collision resistance). Digital signatures
and message authentication codes (MAC) make use of hashing functions.
With SHA-256, the message is first padded so that its length is a multiple of 512 bits and then parsed
into 512-bit message blocks, $M_1, M_2, \ldots, M_N$. Then each block is processed one at a time, beginning
with a fixed initial hash value $H_0$:
$$H_i = H_{i-1} + C_{M_i}(H_{i-1}), \qquad (2.3)$$
where $C$ is the SHA-256 compression function and $+$ represents the word-wise mod $2^{32}$ addition. $H_N$
is the hash of $M$. Hashes can typically be used to provide integrity and authentication, such as in a
Hash-based Message Authentication Code (HMAC), which involves a hash function and a secret key.
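The properties above can be observed with the Python standard library, shown here as a minimal sketch. The key and messages are placeholder values and the helper names `hamming` and `verify` are illustrative.

```python
import hashlib
import hmac

# Known SHA-256 test vector for the three-byte message "abc".
digest = hashlib.sha256(b"abc").hexdigest()
assert digest == "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# A one-character change in the message alters a large share of the 256 output bits.
d1 = hashlib.sha256(b"abc").digest()
d2 = hashlib.sha256(b"abd").digest()
changed_bits = hamming(d1, d2)

# HMAC: a hash function combined with a secret key gives integrity and authentication.
key = b"hypothetical-shared-key"
tag = hmac.new(key, b"message", hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)   # constant-time comparison


assert verify(key, b"message", tag)
assert not verify(key, b"messagE", tag)
```

The comparison uses `hmac.compare_digest` rather than `==` so that the time taken does not depend on how many leading bytes of the tag match.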
2.1.4 Secret Key Establishment
A secure communication can be established with several techniques, with the goal of creating a commu-
nication channel that provides confidentiality and authentication of information, as well as authentication
of parties. The following presents the Diffie-Hellman (DH) Key Exchange, followed by an ECC-based
DH protocol (ECDH) and finally ECIES, which uses ECDH as part of its hybrid scheme.
Diffie-Hellman
One of the most common secret key establishment protocols is the Diffie-Hellman (DH) Key Exchange,
which establishes a cryptographic secret over a public channel. The idea behind this method is that the
public information exchanged between two parties is used to create a secret key without compromising
it. The algorithm relies on the difficulty of solving the discrete logarithm problem. First, Alice and Bob
agree to use a certain modulus $p$ (prime) and a base $g$ (a primitive root modulo $p$), which can be public.
Afterwards, Alice and Bob each generate a random value, $a$ and $b$, calculate $A$ and $B$ respectively, and finally
exchange $A$ and $B$ between them:
$$A = g^{a} \bmod p \qquad (2.4)$$
$$B = g^{b} \bmod p \qquad (2.5)$$
Finally, they compute:
$$s = B^{a} \bmod p = A^{b} \bmod p \qquad (2.6)$$
where $s$ is the resulting shared secret. This exchange can provide perfect forward secrecy³.
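Equations (2.4) to (2.6) can be checked numerically. The following sketch assumes the toy parameters $p = 23$ and $g = 5$, which are far too small for real use; `dh_public` and `dh_shared` are illustrative names.

```python
import secrets

# Toy public parameters: 5 is a primitive root modulo the prime 23.
P = 23
G = 5


def dh_public(private: int) -> int:
    """Equations (2.4) and (2.5): g^x mod p."""
    return pow(G, private, P)


def dh_shared(peer_public: int, private: int) -> int:
    """Equation (2.6): peer^x mod p."""
    return pow(peer_public, private, P)


# Fixed private values make the arithmetic easy to check by hand.
a, b = 6, 15
A, B = dh_public(a), dh_public(b)           # A = 8, B = 19
assert dh_shared(B, a) == dh_shared(A, b) == 2

# Ephemeral private values: fresh random numbers for every session.
a_eph = secrets.randbelow(P - 2) + 1        # in {1, ..., p-2}
b_eph = secrets.randbelow(P - 2) + 1
assert dh_shared(dh_public(b_eph), a_eph) == dh_shared(dh_public(a_eph), b_eph)
```

Both parties reach the same value because $B^{a} = g^{ab} = A^{b} \pmod p$, while an eavesdropper only sees $A$ and $B$.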
When the key pairs are ephemeral, the exchange does not authenticate either party, and
it is therefore subject to Man-in-the-Middle attacks, where an attacker intercepts the exchanged public
keys, generates its own key pair, and exchanges its public key with both parties, pretending to be the
other party. The attacker thus ends up with two shared secrets, one to communicate with each
party.
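The attack can be made concrete with a short sketch, again assuming the toy parameters $p = 23$ and $g = 5$. The party names and private values are hypothetical; the point is that the attacker (Mallory) ends up sharing one secret with each victim.

```python
P, G = 23, 5   # toy public parameters, far too small for real use


def public(x: int) -> int:
    return pow(G, x, P)


def shared(peer_public: int, x: int) -> int:
    return pow(peer_public, x, P)


alice_priv, bob_priv, mallory_priv = 6, 15, 13

alice_pub = public(alice_priv)
bob_pub = public(bob_priv)
mallory_pub = public(mallory_priv)

# Mallory intercepts both public values and forwards her own in their place.
alice_secret = shared(mallory_pub, alice_priv)   # Alice believes this came from Bob
bob_secret = shared(mallory_pub, bob_priv)       # Bob believes this came from Alice

# Mallory derives one secret per victim and can decrypt and re-encrypt all traffic.
mallory_with_alice = shared(alice_pub, mallory_priv)
mallory_with_bob = shared(bob_pub, mallory_priv)

assert alice_secret == mallory_with_alice
assert bob_secret == mallory_with_bob
```

Nothing in the exchanged values lets Alice or Bob detect the substitution, which is why some form of authentication of the public keys is required.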
In some cases, it is possible to define a previously shared password that is used to cipher the exchanged
public keys, guaranteeing that only those parties will be able to use them to generate the
shared secret, therefore providing authentication. This may not be feasible in all cases, as the password
needs to be known, and securely stored, before establishing the communication.
³ If any long-term key is compromised, it does not compromise all past session keys. In the DH key exchange, if the private values are obtained randomly for each session (thus being ephemeral), compromising one of them will not compromise any of the other previously exchanged secrets.
However, in some situations, it is only required for one of the parties, such as a server, to be authen-
ticated before starting a secure connection. In that case, the server can have a non-ephemeral key pair,
whose public key has been distributed correctly via a PKI (more details in Section 2.1.6). The connecting
party, or client, can then authenticate the other one because it knows its public key already and trusts it.
As long as the client generates a new key pair for each session, perfect forward secrecy is still ensured.
Because the server can be authenticated, Man-in-the-Middle attacks can no longer be performed, as
the client rejects any shared secrets that do not match the one calculated using the known server public
key.
ECDH
The Elliptic Curve Diffie-Hellman (ECDH) protocol, a variant of the DH protocol based on elliptic
curves, allows two parties to establish a shared secret over an insecure channel by simply exchanging
their Elliptic Curve public keys. First, both parties have to agree on the same domain parameters (i.e.
the same curve) and then generate a key pair $(d, Q)$ accordingly, in which $d$ and $Q$ represent the private
and public keys respectively.
Once both parties have generated their private and public keys, they can exchange their public elliptic
curve keys. Party A calculates $S = d_A Q_B$ and party B calculates $S = d_B Q_A$, in which $S$ is the shared
secret. Both results are equal because $d_A Q_B = d_A d_B G = d_B Q_A$, and an attacker cannot obtain $S$
because it only knows the public keys. The shared secret
is hashed using SHA-256 to obtain a 256-bit key.
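A minimal sketch of the exchange follows, assuming the toy curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with $G = (5, 1)$ of order 19 (not a standardised curve). Hashing only the x coordinate of $S$ is also an assumption, since the text does not specify how the point is encoded before hashing.

```python
import hashlib

P_MOD, A = 17, 2            # toy curve y^2 = x^3 + 2x + 2 over F_17
G, N = (5, 1), 19           # base point and its (prime) order


def point_add(p1, p2):
    """Adds two points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)


def scalar_mult(k, pt):
    """Double-and-add point multiplication."""
    result, addend = None, pt
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def derive_key(shared_point) -> bytes:
    """Hashes the x coordinate of S with SHA-256 to obtain a 256-bit key."""
    return hashlib.sha256(shared_point[0].to_bytes(32, "big")).digest()


d_a, d_b = 3, 7                                        # private keys (fixed for clarity)
q_a, q_b = scalar_mult(d_a, G), scalar_mult(d_b, G)    # exchanged public keys

s_a = scalar_mult(d_a, q_b)      # party A: S = d_A * Q_B
s_b = scalar_mult(d_b, q_a)      # party B: S = d_B * Q_A
assert s_a == s_b
key = derive_key(s_a)
```

In practice the private keys would be drawn at random from $\{1, ..., n-1\}$ for each session, as described for the ephemeral DH exchange.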
ECIES
Elliptic Curve Integrated Encryption Scheme (ECIES) is a hybrid encryption scheme that provides se-
mantic security [19], which uses the following functions: key agreement, key derivation function, hashing,
encryption and message authentication. According to [20], there are several versions of ECIES. A simple
one basically consists of a key agreement protocol, such as ECDH, followed by a key derivation function
(KDF), whose resulting key is used in a symmetric encryption scheme (AES). The key derivation function
can be PBKDF2, defined in PKCS#5⁴, which supports key expansion, as may be required when
using large keys.
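The key derivation and message authentication steps of such a scheme can be sketched with the standard library. The shared secret, salt, iteration count and the split into an encryption key and a MAC key are all assumptions made for illustration, and the AES encryption step is deliberately left out because it is not available in the Python standard library.

```python
import hashlib
import hmac


def derive_keys(shared_secret: bytes, salt: bytes, iterations: int = 10_000):
    """Expands the key agreement output into an encryption key and a MAC key."""
    material = hashlib.pbkdf2_hmac("sha256", shared_secret, salt, iterations, dklen=64)
    return material[:32], material[32:]


# Placeholder inputs: in ECIES the secret would come from the ECDH step.
enc_key, mac_key = derive_keys(b"hypothetical ECDH output", b"hypothetical salt")

# The ciphertext would be produced by AES under enc_key (not computed here).
ciphertext = b"placeholder ciphertext bytes"
tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


def authentic(mac_key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


assert authentic(mac_key, ciphertext, tag)
```

Requesting 64 bytes from the KDF is an example of the key expansion mentioned above: a single shared secret yields two independent 256-bit keys.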
2.1.5 Digital Signatures
A Digital Signature is a mathematical technique used to validate the authenticity and integrity
of a digital message or document, assuring the recipient that the sender cannot deny its authorship
(non-repudiation) and that the message was not altered while in transit (integrity).
⁴ PKCS#5 is a password-based cryptography specification [21] which covers key derivation functions, encryption schemes and message authentication schemes.
An example of a Digital Signature algorithm is the Elliptic Curve Digital Signature Algorithm (ECDSA),
which allows an entity to digitally sign a piece of data using its private elliptic curve key. The data is first
hashed using a hashing function, such as SHA-256. The resulting message digest must be truncated,
so that the length of the hash is the same as the bit length of n (the order of the EC subgroup). The
truncated digest value is an integer denoted as z. The algorithm works as follows:
1. Choose a random integer $k$ from $\{1, \ldots, n-1\}$.
2. Calculate the point $P = kG$, where $G$ is the base point.
3. Calculate the number $r = x_P \bmod n$ (where $x_P$ is the $x$ coordinate of the point $P$).
4. If $r = 0$, choose another $k$ and try again.
5. Calculate $s = k^{-1}(z + r d_A) \bmod n$, where $d_A$ is the private key of the signer (A).
6. If $s = 0$, choose another $k$ and try again.
If successful, the pair $(r, s)$ is the signature. Other parties can verify the signature by using the
signer's public key ($Q_A$) to:
1. Calculate integer $u_1 = s^{-1} z \bmod n$.
2. Calculate integer $u_2 = s^{-1} r \bmod n$.
3. Calculate point $P = u_1 G + u_2 Q_A$, and obtain $x_P$.
4. The signature is valid if $r = x_P \bmod n$.
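The signing and verification steps above can be sketched as follows, assuming the toy curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with $G = (5, 1)$ and $n = 19$. With such a small group the scheme offers no security; the sketch only shows that the two procedures are consistent. The function names are illustrative.

```python
import hashlib
import secrets

P_MOD, A = 17, 2            # toy curve y^2 = x^3 + 2x + 2 over F_17
G, N = (5, 1), 19           # base point and its (prime) order


def point_add(p1, p2):
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)


def scalar_mult(k, pt):
    result, addend = None, pt
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def digest_to_z(message: bytes) -> int:
    """SHA-256 digest truncated to the bit length of n."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - N.bit_length())


def sign(z: int, d_a: int):
    while True:
        k = secrets.randbelow(N - 1) + 1              # step 1
        r = scalar_mult(k, G)[0] % N                  # steps 2 and 3
        if r == 0:
            continue                                  # step 4
        s = pow(k, -1, N) * (z + r * d_a) % N         # step 5
        if s != 0:                                    # step 6
            return r, s


def verify(z: int, signature, q_a) -> bool:
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    s_inv = pow(s, -1, N)
    u1, u2 = s_inv * z % N, s_inv * r % N
    point = point_add(scalar_mult(u1, G), scalar_mult(u2, q_a))
    return point is not None and point[0] % N == r


d_a = 7                                # signer's private key (fixed for clarity)
q_a = scalar_mult(d_a, G)              # signer's public key
z = digest_to_z(b"data to be signed")
signature = sign(z, d_a)
assert verify(z, signature, q_a)
```

Verification works because $u_1 G + u_2 Q_A = s^{-1}(z + r d_A)G = kG$, so the recomputed point has the same x coordinate as the one used when signing.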
2.1.6 Key Certification and PKI
Public Key Cryptography provides non-repudiation to secure communications. When Alice sends data
signed with her private key, Bob knows that only Alice was able to sign it, because only Alice has her private
key. However, Bob must trust that the received public key is in fact the public key of Alice. To solve
this problem, Digital Certificates are used. Digital certificates are electronic credentials used to verify
the identities of individuals and machines, by guaranteeing that the identification of a user and data is bound
to a certain public key. In general, digital certificates consist of three main parts: user/device information;
public key; digital signature. A frequently used certificate format is the X.509v3 profile. Some of the fields
present in an X.509v3 certificate include [22]:
Table 2.1: X.509v3 certificate fields.
Version Number                        | Serial Number
Signature Algorithm ID                | Issuer Name
Validity period                       | Subject name
Subject Public Key Info               | Issuer Unique Identifier (optional)
Subject Unique Identifier (optional)  | Extensions (optional)
Certificate Signature Algorithm       | Certificate Signature
Because services are needed to issue, validate and revoke these certificates, the Public Key Infrastructure
(PKI) was created. A PKI is a framework of roles, policies and procedures that allow the generation, management,
distribution, revocation and storage of digital certificates [23, 24, 25].
A PKI normally has the following components: Security Policy, Certification Authority, Registration
Authority and Certificate Repository. The Security Policy is essential to state how the organisation
handles keys and valuable information. The Certification Authority is the entity which issues (binds
the identity of a user to a public key with a digital signature) and revokes certificates. To ensure a
certain level of trust, users that wish to receive a certificate for a public key, must first register with
a Registration Authority, which is the interface between the user and the CA, that authenticates the
user and submits the certificate request to the CA. Finally, a Certificate Repository is required in order
to store the certificates issued and Certificate Revocation Lists (CRLs), which contain certificates that
have been revoked. Certificates can be revoked for several reasons, e.g. validity date expired, private
key compromised, failure to comply with policy requirements and misrepresentation.
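The checks that a relying party performs against the Certificate Repository can be sketched as below. This is a simplified model under stated assumptions: the `Certificate` class carries only a few of the X.509v3 fields, all names and values are hypothetical, and the verification of the CA's digital signature over the certificate is omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Certificate:
    serial: int
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime


def is_acceptable(cert: Certificate, trusted_issuers: set, revoked_serials: set,
                  now: datetime) -> bool:
    """Accepts a certificate issued by a trusted CA, inside its validity period
    and absent from the Certificate Revocation List (signature check omitted)."""
    if cert.issuer not in trusted_issuers:
        return False
    if not (cert.not_before <= now <= cert.not_after):
        return False
    return cert.serial not in revoked_serials


cert = Certificate(
    serial=1001,
    subject="device-42",                 # hypothetical subject name
    issuer="Example CA",                 # hypothetical Certification Authority
    not_before=datetime(2017, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2019, 1, 1, tzinfo=timezone.utc),
)
now = datetime(2018, 6, 1, tzinfo=timezone.utc)

assert is_acceptable(cert, {"Example CA"}, set(), now)
assert not is_acceptable(cert, {"Example CA"}, {1001}, now)   # revoked via the CRL
```

A complete validation would additionally verify the Certificate Signature field with the CA's public key before trusting any of the other fields.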
2.1.7 Physically Unclonable Function
A Physically Unclonable Function (PUF) is a challenge-response mechanism in which the response to a
given challenge depends on a variable physical material [8]. A PUF receives an input challenge (or
stimulus) $C_i \in C$, where $C$ is the set of all possible challenges, and outputs a response $R_i \in R$, where
$R$ is the set of all possible responses. PUFs are based on the natural randomness that exists in the IC
(integrated circuit) used to generate the response, which cannot be controlled. This randomness is due to the
random variations of the IC fabrication process, i.e. two PUFs with the same layout result in two
different functions, so it is impossible to make two PUFs behave equally.
A PUF has four main characteristics. The same input produces approximately the same response (error
correction codes are used to remove noise). Given a response, it must be difficult to find its challenge
(input). Two different challenges must produce two different responses. Two different PUFs must
produce two different responses for the same challenge.
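These characteristics can be illustrated with a software simulation. This is only a model under stated assumptions: a real PUF derives its behaviour from silicon variation, which is replaced here by a per-device seed, and majority voting over repeated reads is used as a simple stand-in for the error correction codes mentioned above. The class and function names are hypothetical.

```python
import hashlib
import random

RESPONSE_BITS = 64


class SimulatedPUF:
    """Software model of a noisy PUF; the seed stands in for fabrication variation."""

    def __init__(self, device_seed: int, flip_prob: float = 0.02, noise_seed: int = 0):
        self._secret = device_seed.to_bytes(16, "big")
        self._flip_prob = flip_prob
        self._rng = random.Random(noise_seed)

    def ideal(self, challenge: bytes) -> int:
        """Noise-free response (not observable on a real device)."""
        digest = hashlib.sha256(self._secret + challenge).digest()
        return int.from_bytes(digest[:8], "big")

    def read(self, challenge: bytes) -> int:
        """One noisy measurement: each bit flips with probability flip_prob."""
        value = self.ideal(challenge)
        for bit in range(RESPONSE_BITS):
            if self._rng.random() < self._flip_prob:
                value ^= 1 << bit
        return value


def majority_vote(reads) -> int:
    """Bitwise majority over an odd number of reads (a repetition code)."""
    result = 0
    for bit in range(RESPONSE_BITS):
        ones = sum((r >> bit) & 1 for r in reads)
        if ones * 2 > len(reads):
            result |= 1 << bit
    return result


puf_a, puf_b = SimulatedPUF(device_seed=1), SimulatedPUF(device_seed=2)
challenge = b"challenge-0"

stable = majority_vote([puf_a.read(challenge) for _ in range(15)])
assert puf_a.ideal(challenge) != puf_b.ideal(challenge)   # two devices, two responses
```

Practical designs use stronger codes together with helper data, but the principle is the same: redundancy turns an approximately repeatable response into a stable one.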
2.2 Secure Computing Platforms
Several platforms have been developed over the years to perform secure computing, which include
Hardware Security Modules, Trusted Platform Modules and Smart Cards. A brief introduction to some
of these platforms is given in this section.
HSMs are application-specific devices which provide secure cryptographic key management and
accelerated cryptographic operations with those keys. They have the following main characteristics [12]:
1. Secure key management.
2. Secure internal and external data storage.
3. Support cryptographic operations with internal keys (such as ciphering/deciphering data and gen-
erating digital signatures).
4. Include anti-tampering and anti-cloning features at the physical level.
5. May include side-channel analysis protection mechanisms.
6. Contain True Random Number Generators.
7. Guarantee internal clock freshness.
8. Support secure communication with the outside world.
9. Standard developer API that allows for an easy integration with software applications.
However, their cost is usually higher than that of general-purpose devices (the price of a regular HSM can
go up to 35,000 € [26, 27]) and they do not provide the same flexibility for the programmer. HSMs are
usually connected to a network through TCP/IP or to a computer via USB, which makes it easy to remove
them or add them back.
Smart Cards are security tokens that have an embedded chip. They are designed to be tamper resis-
tant and provide security services. Although considered secure, smart cards possess slow input/output
communication and low computational processing power and memory storage. Some advantages of
using Smart Cards as Secure Computing Platforms include their low price when compared to other al-
ternatives, their portability and their flexibility (being often used as credit cards, ID cards or repositories
for personal information).
A Trusted Platform Module (TPM) is a secure cryptographic chip that integrates a secure micro-
processor with cryptographic keys and functionalities, which is normally embedded on a computer’s
motherboard. It includes capabilities like Binding, Sealing and Attestation. Binding allows data encryp-
tion using the TPM’s unique RSA key. Sealing works similarly to Binding, except that it requires the
TPM to be in a certain state in order to decrypt the data. Attestation allows a third party to verify that
the software has not been changed, by comparing the unforgeable authenticated digest of the hardware
and software configuration. The most recent version of the TPM specification (2.0) provides more cryp-
tography algorithms, such as ECC, AES, SHA-256 and HMAC. However, this version is still relatively
new (approved in 2015) and therefore not many vendors support it.
Recently, ARM and Intel have developed two different technologies that provide system-wide hard-
ware isolation for trusted software. The ARM TrustZone creates an isolated area that can be used to
guarantee confidentiality and integrity to the system, by providing code and data isolation [28]. Most
ARM platforms today have this security technology implemented. On the other hand, the Intel SGX is
a technology which allows developers to protect certain code and data from disclosure or modification
[29]. Both technologies are susceptible to several attacks as described in [30, 31].
Comparing the advantages and disadvantages of each platform described above with the objectives
and requirements of this work (Section 1.1) shows that, ideally, the desired system
would consist of a low-cost and flexible Hardware Security Module.
There are several Hardware Security Modules on the market for a variety of goals, with different
prices and characteristics. The criteria for choosing the right HSM for a given task include
performance, scalability, redundancy, API support, security, supported algorithms, authentication options and
cost.
A review of HSMs [12] shows that models like the AEP Keyper v2, SafeNet Luna SA 4.4, Thales
nShield Connect 6000 and Utimaco CryptoServer Se1000 support several algorithms, such as AES,
RSA, ECDSA, ECDH and SHA-2, which are available through a PKCS#11 interface. While only two
of them support elliptic curve operations, all support RSA. The single-threaded performance results for
RSA signature generation can be found in Table 2.2.
Table 2.2: Single-threaded performance (signatures/second) for different HSMs [12].
Key Size (bits) | Keyper v2 | Luna SA 4.4 | nShield | CryptoServer
1,024           | 310       | 800         | 950     | 1160
2,048           | 110       | 420         | 570     | 710
4,096           | 13        | 35          | 150     | 230
Table 2.3: HSM Key Storage capacity [12].
HSM                  | Key storage capacity
AEP Keyper v2        | 8000 1024-bit RSA keys
SafeNet Luna SA 4.4  | 1200 2048-bit RSA keys
Thales nShield       | Limited on board NVRAM storage.
Utimaco CryptoServer | 5000 1024-bit key pairs
Concerning key storage capacity, Table 2.3 depicts the capacity of each HSM listed previously.
Regarding backups, the listed HSMs provide either backups to dedicated external cards or remote backup
functionality. Furthermore, they all support time synchronization and administrator authentication
via the PKCS#11 interface.
2.3 Implementation Technologies
There are three major implementation technologies that can be used to create HSM-like systems: CPUs,
ASICs and FPGAs. A Central Processing Unit (CPU) is a general-purpose electronic chip
(such as the one inside Smart Cards) that performs basic arithmetic, logical, control and input/output
operations specified by program instructions. However, CPUs do not contain internal memory (apart from
possible cache memories) and are not closed systems, providing no security to the application.
An Application-Specific Integrated Circuit (ASIC) is an integrated circuit built for a specific application
(in this case, secure computing). ASICs usually include microprocessors, memory blocks (ROM, RAM,
Flash Memory) and other building blocks. Because of their specific use, their manufacturing cost is quite
high when compared to general-purpose devices. They strive to meet the EAL 4+ assurance
level regarding Common Criteria / FIPS 140-2 certification [32] [33] [34].
On the other hand, as described in Section 1, FPGAs are low-cost general-purpose devices that
provide high flexibility and performance. They can be divided into four categories depending
on their configuration storage: SRAM-based, SRAM-based with internal flash, Flash-based and
Antifuse-based.
Many FPGA vendors (e.g. Achronix, Altera, Lattice, Microsemi and Xilinx) compete with each other
to provide the best FPGAs to the market. Each vendor provides different FPGA devices with distinct
technologies and equips them with its own security mechanisms against attacks such as Bitstream Probing,
Decryption Key Stealing, Readback attacks, Side-channel attacks, as well as FPGA counterfeiting and
cloning [35] [36]. Table 2.4 lists the protection mechanisms available for the configuration data of the
major FPGA models on the market.
Table 2.4: Protection mechanisms for FPGA configuration data.
Manufacturer  | Device                   | Bitstream Encryption | Authentication | Technology | Key Storage
Microsemi [4] | IGLOO2, Smartfusion2 SoC | AES-256              | SHA-256        | Flash      | PUF
Xilinx [37]   | Spartan-6                | AES-256              | Device-DNA⁵    | SRAM       | eFUSE⁶, volatile
              | Virtex-6                 | AES-256              | SHA-256 HMAC   | SRAM       | eFUSE, volatile
              | Virtex-7                 | AES-256              | SHA-256        | SRAM       | eFUSE, volatile
              | Zynq-7000                | AES-256              | SHA-256        | SRAM       | eFUSE, volatile
Altera [38]   | Stratix II/II GX         | AES-256 CBC          | -              | SRAM       | NVM
              | Stratix III/IV/V         | AES-256 CBC          | -              | SRAM       | volatile, NVM
              | Cyclone III LS           | AES-256 CBC          | -              | SRAM       | volatile
As detailed previously, non-volatile FPGAs are better suited for safety-critical applications. In par-
ticular, the IGLOO2 and Smartfusion2 SoC (which integrates a processor and FPGA logic) provide the
best variety of design and data security features when compared to other FPGA models from different
vendors [3, 39]. In fact, among the devices highlighted in Table 2.4, the IGLOO2 and the Smartfusion2
are the only devices capable of generating and storing keys through a PUF-based mechanism, allowing
for greater anti-cloning capabilities. Additionally, they are the only ones which include embedded
cryptographic cores that are protected against Differential Power Analysis (DPA)⁷. Regarding the volatile
FPGAs, while several ad-hoc solutions have been proposed [5] to strengthen the native technology,
these devices are still vulnerable to several attacks, particularly when loading the initial configuration.
On the other hand, the IGLOO2 and the Smartfusion2 SoC non-volatile bitstream storage makes them
less susceptible to probing attacks on device boot [3].
2.4 Smartfusion2 SoC
Given that the Smartfusion2 SoC from Microsemi is considered to be the best option available, the following
focuses on this device. Although the IGLOO2 provides similar security, it does
not contain a microprocessor, which is necessary to run the software that controls the system. In this
section, a brief description of the device is given, followed by an introduction to its main hardware
components and its design and data security features, which together form a Root-of-Trust that is essential to
ensure a Secure Boot.
⁷ A type of side-channel attack, in which the attacker studies the power consumption of a cryptographic hardware device in order to extract cryptographic secrets. A side-channel attack is any attack based on information retrieved from the physical level of a cryptographic system.
2.4.1 Device Description
The Microsemi Smartfusion2 SoC contains a non-volatile FPGA, integrated with an ARM Cortex-M3 in
a Microcontroller Subsystem (MSS), including an embedded Non-Volatile Memory (eNVM) and an embedded
Static Random Access Memory (eSRAM), along with several cryptographic services [39], as depicted in
Figure 2.2. The board on which the SoC is mounted provides an external SPI flash, a 10/100 Ethernet PHY
and one external RAM memory.
Figure 2.2: SmartFusion2 SoC FPGA Block Diagram [39].
There are two available power modes: full-power (normal), in which the MSS and FPGA fabric are
fully operational and the Cortex-M3 is running application code with all memory controllers enabled;
low-power, in which the Smartfusion2 is considered to be in an idle state but ready to respond to an
interrupt sourced from the MSS and the FPGA - this mode disables the majority of the Cortex-M3 logic.
2.4.2 Security Features
Microsemi defines three different abstraction layers: Secure Hardware, Design Security and Data Secu-
rity. Data Security builds on top of Design Security which is built on top of Secure Hardware. While a full
list of secure hardware features is described in [4], the most important features for secure hardware are
listed in Table 2.5, namely key management and the validation of digital signatures, which include the
encrypted loading of user and factory keys, as well as device certificates (which are bound to a specific
device) and a revocation list of stolen or scrapped devices.
Table 2.5: Key Features for Secure Hardware [4].
Services                     | Features
Key Management               | Encrypted loading of user secret key material (both Symmetric and Asymmetric encryption supported).
                             | Authenticated/encrypted loading of all factory keys.
                             | Factory keys and passcodes generated and loaded by Hardware Security Modules (HSMs).
Digital Signature Validation | X.509 certificate bound to device serial number, device grading information, and device secret keys.
                             | Certificates digitally signed by factory HSMs.
                             | Certificate revocation list for scrapped or stolen devices.
Design Security is defined as protecting the intent of the design owner, i.e. keeping the design
and bitstream keys confidential and protecting against design changes. It can also be referred to as
Intellectual Property (IP) protection. The list of available Design Security features can be found in [4].
The Smartfusion2 SoC provides a True Random Number Generator (TRNG) for nonces and private ECC
key generation, and all bitstreams are encrypted with AES-256 and fully authenticated
with a 256-bit tag. To prevent back-tracking attacks, versioning is provided, disallowing the loading of
obsolete bitstreams.
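The combination of authentication and versioning can be sketched as follows. This is not Microsemi's actual bitstream format: the image layout (4-byte version, payload, 256-bit HMAC-SHA-256 tag), the key and the function names are assumptions made for illustration, and encryption of the payload is omitted.

```python
import hashlib
import hmac

TAG_LEN = 32       # 256-bit authentication tag
VERSION_LEN = 4    # hypothetical 4-byte version field


def make_image(key: bytes, version: int, payload: bytes) -> bytes:
    """Builds a hypothetical image: version || payload || tag."""
    body = version.to_bytes(VERSION_LEN, "big") + payload
    return body + hmac.new(key, body, hashlib.sha256).digest()


def accept_image(key: bytes, image: bytes, installed_version: int) -> bool:
    """Accepts an image only if it is authentic and not older than the installed one."""
    if len(image) < VERSION_LEN + TAG_LEN:
        return False
    body, tag = image[:-TAG_LEN], image[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False                                   # tampered or wrong key
    version = int.from_bytes(body[:VERSION_LEN], "big")
    return version >= installed_version                # reject obsolete versions


key = b"hypothetical-device-key-32-bytes"
old_image = make_image(key, 3, b"old design")
new_image = make_image(key, 5, b"new design")

assert accept_image(key, new_image, installed_version=4)
assert not accept_image(key, old_image, installed_version=4)   # back-tracking blocked
```

Because the version field is covered by the tag, an attacker cannot relabel an obsolete image with a newer version number without invalidating its authentication.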
For DPA protection, all security keys, protocols and ECC point multiplication services have
countermeasures based on technology from Cryptography Research Inc. Microsemi states that the remaining cryptographic
services are not protected against DPA (although they are safe from timing analysis and simple power
analysis), and therefore does not recommend the use of repeated keys when the adversary can choose
the ciphertext [40].
The Smartfusion2 SoC has several security and access control policies which can be configured by
setting flash-lock bits, which are control bits that enable or disable certain features (see the circles with
crosses in Figure 2.3). The security segments in green, depicted in the middle section of Figure 2.3,
are stored on the internal non-volatile memory and are the heart of the system security as they contain
the set of keys (e.g. UEK1, UPK1, DPK) used to encrypt the bitstream configuration, unlock operations
such as read, write, and verify eNVM, enter Factory Test Mode, erase, write and verify fabric, enable
versioning updates, and restrict JTAG and SPI access.
Additionally, readback of the bitstream is always disabled. For tamper protection, the device comes
with configurable zeroization options to clear and verify volatile and non-volatile memories. It also pro-
vides redundancy in the security flash array to allow detection and reporting of faults.
As for Data Security, which Microsemi defines as protecting the information that is stored, processed or
communicated in the application executing on the FPGA, a list of features is available in [4], which
includes SHA-256, HMAC-SHA-256, SRAM-PUF, ECC and AES operations.
Figure 2.3: Detailed security and settings model diagram [41]. The green segments in the middle are stored in non-volatile memory. The COMBLK performs the communication between the MSS (software) and security services (System Controller).
Another security feature of these FPGAs is the Root-of-Trust. It is described as an entity that can be
trusted to always behave in the expected manner. It provides the verification of the system, software and
data integrity and confidentiality, as well as the extension of trust to internal and external entities. It is
the foundation upon which all security layers are built. In an embedded system, the Root-of-Trust works
with other system elements to ensure the main processor boots securely using only authorized code,
which extends the trusted zone to the processor and its applications. By providing the aforementioned
hardware, design and data security features, Microsemi considers the Smartfusion2 SoC to provide a
Root-of-Trust, which is essential to the Secure Boot.
A Secure Boot process (controlled by the System Controller displayed on the left of Figure 2.3)
initializes an embedded system from rest by executing trusted code, free from tampering
by an attacker. If this level of trust does not exist, another boot image could replace the original one and
allow an attacker to hijack the whole system. The validation of each stage must be performed by the
previous successfully validated phase to ensure a chain-of-trust up to the application layer. The first phase (Phase
0), or Immutable Boot Loader, is embedded within the Smartfusion2 SoC and validated by the
Root-of-Trust, which ensures the integrity and authenticity of the code. Then, each phase is validated by the previously
trusted system before execution is transferred to it.
The first pages of the eNVM are reserved for the System Controller and cannot be accessed by
the user. They store, among other things, the Device Certificate and the digest of the User portion of
the eNVM. This digest allows the Immutable Boot Loader, in Phase 0, to know whether or not Phase 1
contents have been modified. If they have not been modified, the booting process proceeds to the next phase;
otherwise, it is halted. All the aforementioned features, when configured and used properly, allow us to deploy
the Smartfusion2 SoC as a Secure Computing Platform.
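The chain-of-trust described for the Secure Boot can be sketched as a sequence of digest checks. This is a simplified model under stated assumptions: each phase is represented by a `Stage` object holding the SHA-256 digest it expects for its successor, the phase names and contents are hypothetical, and Phase 0 is treated as trusted by construction.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stage:
    name: str
    code: bytes
    next_digest: Optional[bytes]   # digest this phase expects for the next one


def build_chain(named_codes) -> List[Stage]:
    """Builds a chain in which every phase stores the digest of its successor."""
    stages = []
    for index, (name, code) in enumerate(named_codes):
        if index + 1 < len(named_codes):
            expected = hashlib.sha256(named_codes[index + 1][1]).digest()
        else:
            expected = None
        stages.append(Stage(name, code, expected))
    return stages


def secure_boot(chain: List[Stage]) -> List[str]:
    """Returns the phases that run; the boot halts at the first digest mismatch."""
    executed = [chain[0].name]                 # Phase 0: Immutable Boot Loader
    for current, following in zip(chain, chain[1:]):
        actual = hashlib.sha256(following.code).digest()
        if current.next_digest is None or not hmac.compare_digest(current.next_digest, actual):
            break                              # contents modified: halt
        executed.append(following.name)
    return executed


chain = build_chain([
    ("phase0", b"immutable boot loader"),
    ("phase1", b"user boot code in eNVM"),
    ("application", b"application firmware"),
])
assert secure_boot(chain) == ["phase0", "phase1", "application"]

chain[1].code = b"modified boot code"          # simulate tampering with Phase 1
assert secure_boot(chain) == ["phase0"]
```

Execution is only transferred to a phase after the previously trusted phase has validated it, so a single modified image stops the boot before any untrusted code runs.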
2.5 Summary
In this chapter, a range of concepts used throughout the dissertation were introduced. More specifically,
several cryptographic services and mechanisms were discussed, such as Symmetric and Asymmetric
Key Cryptography, which include AES, RSA and ECC. Additionally, an overview of the SHA-256 hash-
ing function was given, followed by a description of secret key establishment protocols (DH, ECDH and
ECIES). Afterwards, the concepts of digital signatures, Public Key Infrastructures and Physically
Unclonable Functions were introduced.
Moreover, the existing types of Secure Computing Platforms were discussed, including Hardware
Security Modules, Trusted Platform Modules and Smart Cards. After analysing the characteristics of
each platform, it was concluded that the platform that fulfils most of the requirements of this dissertation
is the HSM, although HSMs are usually expensive and have very fixed designs. To understand how an HSM
can be built, a variety of implementation technologies were introduced, including CPUs, ASICs and
FPGAs, with more detail being given to non-volatile FPGAs due to their more suitable characteristics
for security applications and low price.
Finally, the Smartfusion2 SoC (which integrates a microprocessor, internal memories and non-
volatile FPGA fabric) was thoroughly described, with a major highlight being given to its design and data
security features, which differentiate it from its competitors. The next chapter presents the State of the
Art proposals which attempt to create secure computation systems supported by FPGA technologies.
Chapter 3
State of the Art
In this chapter, the State of the Art is discussed in regard to secure systems based on FPGAs by
analysing how the different solutions configure their devices to act as secure modules, how their
systems establish communication channels with outside parties and how key management and data storage
are performed. This chapter is divided into four sections: FPGA as Secure Platform, Key Generation
and Storage, Related Full Systems and Comparison Analysis. The first section presents the
systems which consider FPGAs to be secure platforms, discussing the works that propose schemes for
dynamic reconfiguration of volatile FPGAs and the implementation of the Cipherbase secure hardware
on a volatile FPGA. Secondly, we present works that perform key generation and storage with FPGAs,
such as a PUF-based approach and a rekeying management scheme for Storage Area Networks which
uses a master enveloping key that never changes. The third section discusses two architectures
whose requirements are very similar to ours: the first one uses volatile FPGAs to perform trusted cloud
computing and the second one creates a secure wrapper that embeds an FPGA application. Finally, a
comparison analysis of the major works is performed, summarising the discussed architectures.
3.1 FPGA as Secure Platform
Gaj et al. [6] consider the use of embedded microprocessor cores within the FPGA to achieve bitstream
security, specifically for reconfiguration of a Xilinx Virtex-II Pro on a Xilinx ML310 board. Nonetheless,
they consider the entire board as a secure device and not just the FPGA, meaning the path between the
FPGA and the external memory is susceptible to tampering.
Arasu et al. [7] present the design of the Cipherbase secure hardware and its implementation using
FPGAs. The Cipherbase system incorporates customized trusted hardware, extending Microsoft’s SQL
Server for efficient execution of queries using both secure hardware and commodity servers, allowing
for the secure storage of data.
They choose a volatile FPGA to implement the Trusted Machine (TM). Since the logic is built from
volatile configuration memories and the binary which defines the computation is loaded at power-on
from external non-volatile memories, the bitstream configuration can be intercepted.
Figure 3.1: FPGA as a Trusted Machine [7].
Figure 3.1 depicts how the setup of an FPGA as a TM is performed in [7]. A Trusted Authority
(TA), trusted by both the clients and the cloud operator, is responsible for generating and maintaining
FPGA binary encryption keys. Additionally, it vets and compiles the hardware code associated with the
TM and creates the encrypted and signed binaries for each device.
They assume that the FPGA is secure and that an adversary does not have access to its internal
state, since they consider the TM to be vulnerable only to side-channel analysis. The session establishment
algorithm is not mentioned, and the secure bootstrapping and operation of FPGAs are not
described in their work.
3.2 Key Generation and Storage
Arasu et al. [7] present two scenarios for the generation of encryption keys for database operations in
the Cipherbase system. The simplest one consists of embedding a master key into a Trusted Machine
(TM) binary (the TM being the programmable region of the FPGA) and distributing this key to the database client.
The TM would use this key to encrypt the client’s data with AES. The drawbacks of this approach include the need for the Trusted
Authority to generate separate binaries (each loaded onto the device) for every potential database client
and the fact that the client is locked into using this key for all their databases.
A more sophisticated approach is suggested by the authors, in which the TM embeds an RSA pub-
lic/private master key pair. The public key is published via standard public key infrastructure techniques,
allowing clients to uniquely identify a certain FPGA. This would allow clients to negotiate AES session
keys for different database fields or different database applications. In case the keys are cached by the
TM, in a key vault, they would be encrypted using the master key (or another key defined by the TA), to
guarantee that only the TM can recover the contents of the vault.
Nabeel et al. [8] propose an approach based on Physically Unclonable Function (PUF) technology to provide
strong hardware authentication of smart meters and efficient key management for Advanced Metering
Infrastructures. They use the PUF on the devices to generate and re-generate the symmetric
keys and access level passwords for smart meters. The PUF-based secret generation provides strong
protection against key leakage since the master key is never stored in memory. They implemented the
PUF feedback loop on a Xilinx Spartan-6 FPGA board, which is connected to a PC through a
serial port. The error correction of the PUF mechanism and the cryptographic operations are performed on the
PC.
Wang et al. [9] propose an FPGA-based, flexible and low-cost rekeying management scheme to
improve the security and reduce the processing time of rekeying processes. They claim that their system
must not only provide a secure, high-performance, flexible, open and standards-based storage infrastructure
but also protect the data against all kinds of attacks, such as side-channel attacks, eavesdropping and
man-in-the-middle attacks.
To protect the system against attacks, they perform key management at the hardware level (FPGA).
The software simply sends commands, which include key backup, key recovery, key revocation and key
generation, to the key management module. This hardware module, implemented on an FPGA, communicates with the
software through a PCIe interface. The internal memory stores the currently active key pairs, while the
external flash memory backs up the keys. All keys involved are encrypted before being stored. The
required keys are generated by an RNG and hashed with SHA-256 so that they have a length of 256 bits.
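As a rough illustration of the key generation step described above, the following Python sketch draws random bytes and digests them with SHA-256, so that the result is always 256 bits long. The function name `generate_key` and the 64-byte seed length are assumptions made for this example, and `secrets.token_bytes` merely stands in for the hardware RNG used in [9].

```python
import hashlib
import secrets


def generate_key(seed_len: int = 64) -> bytes:
    """Return a 256-bit key: the SHA-256 digest of seed_len random bytes.

    secrets.token_bytes is a software stand-in for the hardware RNG (assumption).
    """
    seed = secrets.token_bytes(seed_len)
    return hashlib.sha256(seed).digest()


key = generate_key()
print(len(key) * 8)  # key length in bits
```

Whatever the length of the RNG output, the digest step fixes the key length, which is the alignment property mentioned in the text.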
In a typical scheme, when re-keying takes place, all the encrypted data must be decrypted using the
old key and encrypted using the new key. They propose a flexible and low-cost FPGA-based re-keying
process that avoids this decryption and re-encryption of the stored data. Their
scheme uses a long-term enveloping key that encrypts the user’s access key, which in turn is
used to encrypt a data encryption/decryption key, known as the LUN key. When the re-keying process occurs,
a new access key is generated, the LUN key is decrypted using the old access key and encrypted
again with the new access key. Finally, the new access key is encrypted using the enveloping key.
This ensures that the stored data does not need to be decrypted and encrypted again, because the LUN
key remains the same. The following is stored in memory: $E_{K_{access}}(k_{LUN})$ and $E_{K_{env}}(k_{access})$.
The software only stores 32-bit indexes for maintenance, which are extracted at the hardware level from
the encrypted private key and the user’s access keys, and are sent to the software by the FPGA.
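The re-keying procedure of [9] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_cipher` is a hypothetical XOR keystream built from SHA-256 that stands in for AES (it is not secure), and the dictionary `store` with its key names is an invented model of the memory contents.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (stand-in for AES, not secure).

    The same call both encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def rekey(k_env: bytes, store: dict) -> None:
    """Replace the access key without touching the data encrypted under the LUN key."""
    old_access = toy_cipher(k_env, store["E_env(k_access)"])
    k_lun = toy_cipher(old_access, store["E_access(k_LUN)"])
    new_access = secrets.token_bytes(32)
    store["E_access(k_LUN)"] = toy_cipher(new_access, k_lun)
    store["E_env(k_access)"] = toy_cipher(k_env, new_access)


# Initial state: enveloping key, access key, LUN key and some stored data.
k_env, k_access, k_lun = (secrets.token_bytes(32) for _ in range(3))
store = {
    "E_access(k_LUN)": toy_cipher(k_access, k_lun),
    "E_env(k_access)": toy_cipher(k_env, k_access),
    "data": toy_cipher(k_lun, b"stored block"),
}
data_before = store["data"]
rekey(k_env, store)

# The data ciphertext is unchanged and still decrypts with the same LUN key.
recovered_access = toy_cipher(k_env, store["E_env(k_access)"])
recovered_lun = toy_cipher(recovered_access, store["E_access(k_LUN)"])
assert store["data"] == data_before
assert toy_cipher(recovered_lun, store["data"]) == b"stored block"
```

Only the two small wrapped keys are rewritten during re-keying, which is why the cost is independent of the amount of stored data.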
Even though the authors claim that their system prevents physical attacks, their explanation is quite
vague. They state that, because only a 32-bit index is used at the software level (instead of the key), their
design is protected against physical attacks, since the attacker would need to obtain the contents of the
internal memories. However, if an attacker has access to the devices, it is still possible to perform side-channel
analysis unless the devices are truly protected against it.
3.3 Full Security Systems
This section presents two works that fulfil requirements similar to the ones proposed by this thesis. The
first one [11] consists of using volatile FPGAs to perform trusted cloud computing, in which protected
bitstreams are used to create a Root-of-Trust for cloud computing clients. The second one [10] proposes
a system architecture that wraps an embedded application on a volatile FPGA. The wrapper includes
a secure user authentication interface and cryptographic services which secure all of the embedded
application’s data transfer interfaces.
Eguro et al. describe [11] how protected bitstreams can be used to create a Root-of-Trust for the
clients of cloud computing servers. Their hardware-based approach solves the following problem: how
to secure client data and computation from both potential external attackers and an untrusted system
administrator. The system which addresses this problem uses volatile FPGAs. They are programmed
to form a flexible, independent trusted third party computing platform within the cloud infrastructure.
Their proposed system allows clients to upload their configuration data to the cloud and since cloud
administrators do not have low-level access to computation within the FPGA, it allows clients to offload
sensitive parts of their applications to these devices, avoiding potential vulnerabilities in the software
stack.
The deployment of the trusted computing nodes begins with a trusted authority (TA), which is trusted
by all clients and by the cloud operator. The TA generates a random symmetric encryption key symkfb and
copies it into the onboard key memory of the FPGA before the platform is delivered to the cloud operator.
After the key has been written, the FPGA can be delivered to the cloud operator and installed. Since
the FPGA comes with a secure boot process (which uses the symmetric key symkfb to decrypt and
authenticate the bitstream configuration), the authors believe the FPGA can be used as a “virtual” HSM.
The authors, however, strive to support a more sophisticated operational model that does not require
direct TA involvement for each and every bitstream. The idea is that the TA provides a single generic
bootstrapping binary for each FPGA that acts as an onboard infrastructure which receives and loads
client applications. Figure 3.2 depicts how the TA generates a private/public RSA key pair and places
the private key into the bootstrapping bitstream. The public key is published via a standard PKI. Once
the TA encrypts the bootstrapping bitstream with AES using symkfb, it transfers the configuration into
the flash memory on the FPGA.
Once the bootstrap configuration is running on an FPGA in the cloud, the client can create an application
for the FPGA to handle sensitive data. The client connects to this device to load their application
securely using standard PKI techniques, as in an SSH session, in which the client uses the public key of the device
to exchange a symmetric session key sessionkf.
The attack model proposed by the authors assumes that the following operations are sufficiently
difficult and that they are effectively impossible in practice: breaking the cryptography used; loading a
binary that cannot be decrypted and authenticated properly; retrieving binary or state information on the
device from outside; altering the behaviour of the loaded binary; altering data currently on the device.
Furthermore, the authors also explain the problems that would arise from keys being compromised or
lost. They also emphasize that the immutable bootstrapping logic forms the initial Root-of-Trust for the
clients of cloud computing servers.
Figure 3.2: Setup: TA generates a public/private RSA key pair and transfers the private key privatekf into the bootstrapping binary (in blue: data is encrypted so it does not need to go through a secure channel) [11].
Graf proposes a system [10] that acts as a secure wrapper around an embedded application on an
FPGA (depicted in Figure 3.3). This wrapper (known as Amuet) creates a secure user authentication
interface and cryptographically secures the data interfaces accessible to the embedded application,
effectively rendering the FPGA as a black box capable of performing the task for which it was designed.
The architecture introduces a secure token-based authentication scheme (using a Java-powered iButton [42]) and
an FPGA-based encrypted memory controller. It is important to note that the user application which runs
within the FPGA (protected by the wrapper application proposed) can only be re-programmed at the
factory as there is no interface in the proposed system to perform reconfiguration of user applications.
For this thesis, the relevant part of this work is the way Amuet performs the secure embedding of the
application, which includes the retrieval of the user identification information (UID) from the
iButton through a secure user interface, the modification of the UID to form a DES key and, finally, the use of
that DES key as the key source for the encrypted memory controller (EMC).
The Authentication Control Unit (ACU) is responsible for establishing a secure communication channel
for the UID transfer from the iButton. It has at its disposal an $x^e \bmod n$ calculator, a set of RSA
secret keys, a SHA-1 unit and a table of authorized certificates. The protocol used to negotiate a secure
channel between the device and the iButton is similar to an RSA key exchange. However, the public key is
never made public; instead, it is used as the encryption key (e), while the private RSA key is used as
the decryption key (d). The two key pairs and the two moduli (n) used in the RSA-based authentication
scheme are generated prior to the programming of the iButton and the creation of the FPGA bitstream.
The FPGA stores $n_i$, $n_f$, $e_i$, $d_f$ and the iButton stores $n_i$, $n_f$, $e_f$, $d_i$.
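The key distribution described above can be illustrated with textbook RSA arithmetic. The sketch below is an assumption-laden toy: the primes are far too small for real use, `make_keypair` and `modexp` are hypothetical helper names, and the messages exchanged (a toy UID and a challenge value) are invented for the example rather than taken from [10].

```python
def make_keypair(p: int, q: int, e: int):
    """Textbook RSA key pair from two toy primes: returns (n, e, d)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse (Python 3.8+)
    return n, e, d


def modexp(x: int, exponent: int, n: int) -> int:
    """The x^e mod n operation that the ACU has at its disposal."""
    return pow(x, exponent, n)


n_f, e_f, d_f = make_keypair(61, 53, 17)  # FPGA key pair (toy primes)
n_i, e_i, d_i = make_keypair(89, 97, 5)   # iButton key pair (toy primes)

# Each side holds both moduli, the other side's e and its own d.
fpga = {"n_i": n_i, "n_f": n_f, "e_i": e_i, "d_f": d_f}
ibutton = {"n_i": n_i, "n_f": n_f, "e_f": e_f, "d_i": d_i}

uid = 1234  # toy UID, must be smaller than both moduli

# iButton -> FPGA: encrypted with e_f, recovered with d_f.
c = modexp(uid, ibutton["e_f"], ibutton["n_f"])
assert modexp(c, fpga["d_f"], fpga["n_f"]) == uid

# FPGA -> iButton: encrypted with e_i, recovered with d_i.
challenge = 4321
c = modexp(challenge, fpga["e_i"], fpga["n_i"])
assert modexp(c, ibutton["d_i"], ibutton["n_i"]) == challenge
```

Because neither exponent is ever published, only a party holding the matching stored values can produce or read a message in either direction.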
Additionally, the certificates stored are hashes of the iButtons’ UIDs, which makes the original UIDs
computationally difficult to recover. Since the table resides inside the FPGA, every time a certificate is added
or revoked, the bitstream must be modified. To prevent man-in-the-middle attacks, the iButton authenticates
itself as an authorized user to the FPGA and the FPGA also authenticates itself as an authorized
host to the iButton. After a successful authentication, the FPGA sends the UID to the EMC to create the
final key for the DES engine.
Figure 3.3: Block diagram of the Amuet architecture [10]. The embedded application is actually the user application, protected by the proposed wrapping system.
The Encrypted Memory Controller (EMC) encrypts and decrypts every transaction between the embedded
application within the FPGA and the external memory. On startup, the EMC uses a
secret DES key, which is unique to and known only to the FPGA, to create the final DES key (the secret key
itself is never used to cipher or decipher memory transactions). This final key is formed by passing the UID from the iButton
through the DES engine using the secret DES key. From the 64-bit result, the first 56 bits are the final
DES key.
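The derivation of the final key can be sketched as follows. Since DES is not available in the Python standard library, the 64-bit block operation is replaced here by a clearly labelled stand-in (HMAC-SHA256 truncated to 8 bytes); the function names, the sample key and the sample UID are all assumptions, and "first 56 bits" is interpreted as the 56 most significant bits.

```python
import hashlib
import hmac


def block_encrypt_standin(secret_key: bytes, block: bytes) -> bytes:
    """Stand-in for a 64-bit DES encryption: HMAC-SHA256 cut to 8 bytes (assumption)."""
    return hmac.new(secret_key, block, hashlib.sha256).digest()[:8]


def derive_final_key(secret_key: bytes, uid: bytes) -> int:
    """Pass the UID through the block function and keep the first 56 of the 64 bits."""
    out = block_encrypt_standin(secret_key, uid)
    return int.from_bytes(out, "big") >> 8  # drop the last 8 bits


final_key = derive_final_key(b"\x01" * 8, b"UID-0001")
assert final_key < 2 ** 56
```

The final key depends on both the device secret and the token UID, so the external memory can only be read with the right FPGA and the right iButton together.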
3.4 Discussion
In this chapter, the State of the Art was presented in regard to existing solutions that strive to create
secure systems on FPGAs, each with different motivations, as seen in the previous sections. The main
focus of these works is the ability to perform secure key management, establish secure
communication channels and store data securely (internally and externally). Due to the lack of non-volatile
solutions, the works herein presented focus on volatile FPGAs, which are subject to several
attacks [3, 5].
To create a Hardware Security Module, a system must have a series of characteristics [12], such as
anti-cloning mechanisms (e.g. PUF-based key generation), secure communication channels, internal
non-volatile memories for master key storage, anti-tamper mechanisms, internal clock freshness (e.g.
through a Timestamping Authority) and a common developer interface, such as PKCS#11.
However, none of the works provides a robust solution that is able to address all of the above characteristics.
In fact, the proposed solutions do not consider security-oriented devices, which contain
anti-cloning, anti-tamper and side-channel analysis protection mechanisms, nor do they consider
the necessity of internal clock freshness. Moreover, they do not consider the performance and security
of the encryption software/hardware used. Our study finds that only Nabeel et al. [8] consider the use
of a PUF-based mechanism to increase the security and authentication of the overall system.
Additionally, the freshness of the data exchanged with external parties is overlooked by all solutions,
which makes them susceptible to replay attacks. They also lack any kind of developer interface (e.g.
PKCS#11) to allow for an easy integration of their systems with external applications.
Table 3.1 summarises the key characteristics of each solution and the main features they provide, with
regard to device security, the ability to establish a secure communication channel and how they perform
key management.
Table 3.1: Comparison of Security Features of the different system proposals.
Category                   Arasu et al. [7]  Wang et al. [9]     Nabeel et al. [8]  Graf [10]                 Eguro et al. [11]
Device Security
  Device                   N/D               Xilinx Virtex-6     Xilinx Spartan-6   Xilinx Virtex-E, iButton  Xilinx Virtex-6
  Bitstream Encryption     AES               AES-256             AES-256            3DES                      AES-256
  Bitstream Storage        External          External            External           External                  External
Secure Channel
  Algorithm                RSA               N/A                 N/A                RSA-based                 RSA
Key Management
  Master Keys Generation   Factory           Factory             PUF                Factory                   Factory
  Master Keys              RSA               AES                 N/D                DES                       RSA, AES
  Master Keys Encryption   No                No                  N/D                No                        No
  Master Keys Storage      External          Internal            N/D                Internal                  Internal
  Session Keys Generation  RSA-KE            RNG                 PUF                RSA-KE                    RSA-KE
  Session Keys             AES               AES                 AES                DES                       AES
  Session Keys Storage     Internal          Internal, External  N/D                Internal                  Internal
N/D: Not Disclosed. N/A: Not Available. RSA-KE: RSA-based Key Exchange protocol.
Considering the above discussion, it can be concluded that the State of the Art works cannot be
used to create a Hardware Security Module or a system that resembles one. Considering that HSMs are
expensive and non-reconfigurable, as mentioned in Section 2.2, the next chapter proposes a solution,
supported by the Smartfusion2 SoC, which creates a multi-user, flexible HSM, capable of performing
secure key management, communicating securely with external parties, guaranteeing internal clock
freshness, signing data and issuing digital certificates. To demonstrate the flexibility
of the HSM, a use-case called Log-Chain is presented and integrated in the HSM.
Chapter 4
Proposed Solution
The main goal of this work is to create a re-configurable and flexible HSM, supported by a low-cost
non-volatile security-oriented FPGA, in contrast to the State of the Art. As stated previously, existing
commercial HSMs are expensive and fixed in terms of design. On the other hand, the State of the Art
proposals that attempt to create secure systems on FPGAs focus purely on volatile devices without
security-oriented characteristics, such as anti-tampering, anti-cloning and side-channel analysis protection.
Additionally, they lack most of the requirements of an HSM, which were described in Section 2.2.
The solution herein proposed, supported by the Smartfusion2 SoC, creates a multi-user, low-cost
and highly flexible secure computation system that performs secure key management, stores data securely
both internally and externally, can establish a secure communication channel with outside parties, can maintain
internal clock freshness and is capable of computing digital signatures as well as issuing digital certificates.
To demonstrate the flexibility of the proposed architecture, the implementation of a novel log certification
scheme is also proposed and developed. This scheme consists in creating a signed Log-Chain
from log sources such as Linux Syslog messages, transaction logs or even a medical receipts log. Each
message/command signature guarantees the authenticity of that message and of all previously logged
messages/commands, therefore creating a chain of logs that can be verified and cannot be repudiated
or modified.
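The chaining idea can be sketched as follows, purely as an illustration and not as the scheme developed in this thesis. In this hypothetical Python example each entry is signed together with the signature of the previous entry; HMAC-SHA256 with a made-up key stands in for the digital signature that an HSM would compute, and the function names are invented for the example.

```python
import hashlib
import hmac

SIGNING_KEY = b"demo-key-held-inside-the-hsm"  # hypothetical key, for illustration only


def sign(data: bytes) -> bytes:
    """Stand-in for a digital signature (assumption: HMAC-SHA256)."""
    return hmac.new(SIGNING_KEY, data, hashlib.sha256).digest()


def append(chain: list, message: bytes) -> None:
    """Sign the message together with the previous signature and store both."""
    prev_sig = chain[-1][1] if chain else b"\x00" * 32
    chain.append((message, sign(prev_sig + message)))


def verify(chain: list) -> bool:
    """Recompute every link; any modified or removed entry breaks the chain."""
    prev_sig = b"\x00" * 32
    for message, sig in chain:
        if not hmac.compare_digest(sig, sign(prev_sig + message)):
            return False
        prev_sig = sig
    return True


chain = []
for line in (b"boot", b"user login", b"config changed"):
    append(chain, line)
assert verify(chain)

# Tampering with an earlier entry invalidates the chain.
chain[0] = (b"b00t", chain[0][1])
assert not verify(chain)
```

Because every signature covers the previous one, verifying the last entry implicitly covers the whole history. Non-repudiation, however, requires an asymmetric signature rather than the HMAC stand-in used here.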
The integration of the provided features with applications is done through an extended PKCS#11
middleware (device driver), abstracting the users from the inner workings of the system. The generated