softwarebenchmarkingofthe2nd round...

33
Software Benchmarking of the 2 nd round CAESAR Candidates Ralph Ankele 1 and Robin Ankele 2 1 Information Security Group, Royal Holloway, University of London McCrea Building, Egham TW20 0EX, UK ralph.ankele.2015@rhul.ac.uk 2 Department of Computer Science, University of Oxford Wolfson Building, Oxford OX1 3QD, UK robin.ankele@cs.ox.ac.uk Abstract. The software performance of cryptographic schemes is an important factor in the decision to include such a scheme in real-world protocols like TLS, SSH or IPsec. In this paper, we develop a benchmarking framework to perform software performance measurements on authenticated encryption schemes. In particular, we apply our framework to independently benchmark the 29 remaining 2 nd round candidates of the CAESAR competition. Many of these candidates have multiple parameter choices, or deploy software optimised versions raising our total number of benchmarked im- plementations to 232. We illustrate our results in various diagrams and hope that our contribution helps developers to find an appropriate cipher in their selection choice. Keywords: Software benchmarks · CAESAR · 2 nd round candidates · real-worlds usecases · tls · ssh · optimisations · aes-ni · sse · avx · neon 1 Introduction An authenticated encryption (AE) scheme provides confidentiality, integrity and authentic- ity simultaneously. These goals are typically achieved by combining separate cryptographic primitives, an encryption scheme to ensure confidentiality and a message authentication code for integrity and authenticity. However, the generic composition, the combination of a confidentiality mode with an authentication mode, may lead to implementation errors and is often not efficient (e.g. the input stream has to be processed twice, for the encryption scheme and the message authentication code). Recently, some practical attacks [CHVV03], [Vau02] have confirmed the vulnerability of these AE schemes. In 2013, the CAESAR [Ber14] competition was announced in order to mitigate such attacks. Moreover, the aim of the newly proposed candidates was to achieve fast and efficient execution and to be suitable for widespread adoption. In March 2014, 57 first round candidates were submitted to the CAESAR competition. Abed et al. [AFL14] provide a classification and a general overview of the first round candidates. In July 2015, the second round candidates were announced. The deadline for the announcement of the third round candidates is expected in end July 2016 that requires the CAESAR committee to make a selection decision which candidates will be included in the third round. Beside of the security guarantees of the authenticated encryption mode, the software and hardware performance of the candidate becomes more and more important. One of the requirements imposed on all CAESAR submissions was that they should demonstrate the potential to be superior to AES-GCM in at least one significant aspect. One of those aspects that is particularly significant for the usage in widely deployed protocols

Upload: others

Post on 26-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Software Benchmarking of the 2nd roundCAESAR Candidates

Ralph Ankele1 and Robin Ankele2

1 Information Security Group, Royal Holloway, University of LondonMcCrea Building, Egham TW20 0EX, UK

[email protected] Department of Computer Science, University of Oxford

Wolfson Building, Oxford OX1 3QD, [email protected]

Abstract. The software performance of cryptographic schemes is an important factorin the decision to include such a scheme in real-world protocols like TLS, SSH or IPsec.In this paper, we develop a benchmarking framework to perform software performancemeasurements on authenticated encryption schemes. In particular, we apply ourframework to independently benchmark the 29 remaining 2nd round candidates of theCAESAR competition. Many of these candidates have multiple parameter choices,or deploy software optimised versions raising our total number of benchmarked im-plementations to 232. We illustrate our results in various diagrams and hope thatour contribution helps developers to find an appropriate cipher in their selection choice.

Keywords: Software benchmarks · CAESAR · 2nd round candidates · real-worldsusecases · tls · ssh · optimisations · aes-ni · sse · avx · neon

1 IntroductionAn authenticated encryption (AE) scheme provides confidentiality, integrity and authentic-ity simultaneously. These goals are typically achieved by combining separate cryptographicprimitives, an encryption scheme to ensure confidentiality and a message authenticationcode for integrity and authenticity. However, the generic composition, the combinationof a confidentiality mode with an authentication mode, may lead to implementationerrors and is often not efficient (e.g. the input stream has to be processed twice, forthe encryption scheme and the message authentication code). Recently, some practicalattacks [CHVV03], [Vau02] have confirmed the vulnerability of these AE schemes. In2013, the CAESAR [Ber14] competition was announced in order to mitigate such attacks.Moreover, the aim of the newly proposed candidates was to achieve fast and efficientexecution and to be suitable for widespread adoption. In March 2014, 57 first roundcandidates were submitted to the CAESAR competition. Abed et al. [AFL14] providea classification and a general overview of the first round candidates. In July 2015, thesecond round candidates were announced. The deadline for the announcement of the thirdround candidates is expected in end July 2016 that requires the CAESAR committee tomake a selection decision which candidates will be included in the third round.

Beside of the security guarantees of the authenticated encryption mode, the softwareand hardware performance of the candidate becomes more and more important. One ofthe requirements imposed on all CAESAR submissions was that they should demonstratethe potential to be superior to AES-GCM in at least one significant aspect. One ofthose aspects that is particularly significant for the usage in widely deployed protocols

Page 2: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

2 Software Benchmarking of the 2nd round CAESAR Candidates

such as TLS, SSH and IPsec is software performance. Therefore, we have developed abenchmarking framework for automated and accurate software performance measurementsand applied it to benchmark the remaining 29 second round candidates. Many of thesecandidates have multiple parameter choices, extending the number to 125 implementations.Overall, we have considered several software performance optimised implementations,raising the total number of benchmarked implementations to 232.

During our benchmarks, we perform and illustrate our measurements on a continuousfunction (with message and associated data sizes from 0 to 2048 bytes, in 128 byte steps)and on some fixed length sizes, relating to real-world usecases — 1 byte (one keystroke ,SSH), 16 bytes (small payload), 557 bytes (average IP packet size), 1.5 kB (ethernet MTUsize, TLS), 16 kB (max. TCP packet size) and 1 MB (e.g. file upload).

Related Work. In order to evaluate the software performance of the CAESAR candidates,Bernstein provides the SUPERCOP∗ [Ber16] framework, which is a general purposeframework for the evaluation of the software performance of cryptographic software.Saarinen presented the BRUTUS† [Saa16] framework, in order to test the CAESARcandidates for cryptographic weaknesses. Additionally to the software performance ofthe candidates, the hardware performance is also an important factor in the selection ofthe final portfolio. Homsirikamol et al. [HDF+15] present a hardware API to performhardware benchmarks of the CAESAR candidates.

Contribution and Outline. In this paper, we present a complementary benchmarkingframework to perform automatic and accurate measurements on authenticated encryptionfunctions. We apply this framework to perform independent software benchmarks on the2nd round CAESAR candidates and illustrate the results in various diagrams.

This paper is organised as follows. In Section 2, we introduce the 2nd round candidatesof the CAESAR competition and classify them regarding their underlying cryptographicprimitive. Section 3, presents our benchmarking framework and the measurement settings.In Section 4, we give an overview and discuss the software optimisation implemented inthe 2nd round candidates. Finally, in Section 5 we present the results of the benchmarkingand Section 6 concludes the paper.

2 Classification of the 2nd round CAESAR CandidatesIn the CAESAR [Ber14] call for submissions the choice of underlying primitive and type ofcipher was open for the designers. After 57 submissions for the first round, 29 submissionshave been selected for the second round. Within these remaining candidates there are 15based on block ciphers, 8 based on sponge constructions, 4 based on stream ciphers, 2based on permutations and 1 based on a compression function. Table 1 gives an overviewof the 2nd round candidates of the CAESAR competition sorted by type.

Some of the candidates are online. An online cipher is defined such that the encryptionof message block Mi is only depending on already processed message blocks M1, . . . ,Mi−1.Furthermore, an inverse-free scheme indicates if the inverse of an underlying primitive isneeded (i.e. for the purpose of decryption of the underlying block cipher). An authenticatedencryption scheme is nonce-misuse resistant, if the cipher is still secure under the repetitionof nonces. Moreover, we can distinguish between nonce misuse resistance for online andoffline ciphers. On the one hand, offline ciphers security can be claimed if the repetitionof the ciphers leak only the ability to see a repeated message. On the other hand, foronline ciphers security can be claimed, if nonces are repeated only the common prefix ofthe messages can be observed.∗https://bench.cr.yp.to/supercop.html†https://github.com/mjosaarinen/brutus

Page 3: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 3

Table 1: Overview of the 2nd round CAESAR candidates.‡ Possible types of schemes are: BC: blockcipher based schemes, SC: stream cipher based schemes, Sponge: scheme based on Sponge construction(duplex or otherwise), P: permutation based scheme and CF: scheme based on compression function.Underlying primitives are: AES: when the full AES-128 is used, AES[r]: when a round reduced version ofAES is used, other dedicated primitive or constructions: e.g. SHA2, SPN, LFSR. Entries for nonce misuseresistance are: None: when no security is claimed by the designers if nonces are repeated, Max: when foroffline ciphers the repetition of nonces only leak the ability to see a repeated message, LCP: for onlineciphers, all an adversary can observe is the longest common prefix of messages for repeated nonces

Name Type Primitive Parallel Online Inverse- Security Nonce mis- Reference SoftwareEnc. / Dec. Enc. / Dec. free proof use resistant Version

AEGIS BC AES[1] X/ - X/X - - None [WP14] v1∗AES-COPA BC AES partly/partly X/X - X LCP [ABL+15] v2∗∗AES-JAMBU BC AES, Simon - / - X/X X - LCP [WH15] v2∗AES-OTR BC AES X/X X/X X X None [Min16] v3∗AEZ BC AES[4] X/X - / - X X Max [HKR15] v4∗CLOC BC AES, TWINE - / - X/X - X None [IMGM15] v2♦Deoxys BC AES X/X X/X - X LCP [JNP15a] v1.3∗ELmD BC AES partly/partly X/X - X LCP [DN15] v2∗Joltik BC Joltik-BC X/X X/X - X LCP [JNP15b] v1.3∗OCB BC AES X/X X/X - X None [KR14] v1∗POET BC AES partly/partly X/X X X LCP [AFF+15] v2∗∗SCREAM BC SPN X/X X/X - - None [GLS+15] v3∗SHELL BC AES partly/partly X/X - X None [Wan15] v2∗SILC BC AES, Present, LED - /- X/X X X None [IMG+15] v2♦Tiaoxin BC AES[1] X/X X/X X - None [Nik15] v2∗

OMD CF SHA2 - / - X/X - X None [CMN+15] v2∗

Minalpher P SPN X/X X/X X [STA+15] v1.1♠PAEQ P AESQ X/X X/X X X [BK14] v1∗∗

ACORN SC LFSR X/X X/X X - None [Wu15] v2∗HS1-SIV SC ChaCha/Poly1305 - / - - / - X X Max [Kro15] v1∗MORUS SC LRX - / - X/X X - None [HW15] v1∗TriviA-ck SC Trivium partly/partly - / - X X None [CM15] v2∗

Ascon Sponge SPN - / - X/X X X [DEMS15] v1.1∗∗ICEPOLE Sponge Keccack-like X/ X X/X X X [MGH+15] v2∗Ketje Sponge Keccack-f - / - X/X X X None [BDP+14] v1♣Keyak Sponge Keccack-f X/ X X/X X X None [BDP+15] v2♣NORX Sponge LRX X/X X/X X X None [AJN15] v1♥π-cipher Sponge ARX X/X X/X X X Intermediate [GMS+15] v2�PRIMATEs Sponge SPN - / - X/X X X LCP [ABB+14] v1∗∗STRIBOB Sponge Streebog - / - X/X X X None [OSB15] v2∗∗

‡ Data obtained from: https://aezoo.compute.dtu.dk/doku.php∗ Source code obtained via email from designer.∗∗ Source code from SUPERCOP v20140622 obtained from https://bench.cr.yp.to/supercop.html.♥ Source code received from https://github.com/norx/norx.♦ Source code received from http://www.nuee.nagoya-u.ac.jp/labs/tiwata/AE.♣ Source code received from https://github.com/gvanas/KeccakCodePackage.♠ Source code received from http://info.isl.ntt.co.jp/crypt/minalpher/web/downloads.html.� Source code received from http://pi-cipher.org.

3 Benchmarking FrameworkSimilar to the SUPERCOP [Ber16] and BRUTUS [Saa16] frameworks, we have developeda benchmarking framework to facilitate automatic software performance evaluations forauthenticated encryption functions. Using our toolkit, one can perform independent andaccurate software performance measurements (i.e. timing measurements), review theimplementation of the cipher under test, generate and illustrate statistical information

Page 4: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

4 Software Benchmarking of the 2nd round CAESAR Candidates

Table 2: Comparison between prior benchmarking frameworks for authenticated encryptionschemes and our approach.

Name TimingMethod

NoiseReduction Accuracy Simplicity Applicability Visualisation Use Cases

SUPERCOP RDTSC+CPUID none∗ high complex wide range‡ none♥ 0 - 2KB (1bit)and add. settings♠

BRUTUS clock() none average simple AE ciphers none 16, 64, 256, 1k, 4k,16k, and 65kB

This paper RDTSCPaveraging andmedian† high complex§ AE ciphers 2d, 3d and

bar plots0 - 2KB (128bit)and see Table 3

∗ Release of multiple measurements of the same function for post processing (median and quartiles) on web page.Recommendation to turn off hyper-threading and over-clocking as well as recommentation to perform measure-ment when the computer is in an idle state.

† Measurements performed in single user mode.§ An updated version of our framework aims to provide automated benchmarking and more user-friendliness.‡ Hash functions, stream ciphers, signature schemes, secret sharing, encryption and AE schemes♥ Unclear and crowded 2d plots available at https://bench.cr.yp.to/results-aead.html♠ Additionally settings include (all in the same ranges): forgery of message, only message, and only ad processing.

about the software performance, and compare different parameters and optimisations of acipher among its different optimisations or to other ciphers under test. More specifically,the benchmarking functionality is implemented in C programming language and can becompiled on various platforms. Moreover, in order allow for semi-automatic testing, wesupply cmake and other script based functionalities (i.e. bash scripts for compiling andbenchmarking of the software candidates under test). In order to illustrate and visualise theresults of the benchmarking and to promote reasonable comparision between the softwarecandidates, we supply some visualisation scripts runnable in a Matlab environment. Ourbenchmarking suite is currently optimised to perform measurements of authenticatedencryption functions. Nevertheless, it remains open to be extended to perform bench-marking of other functionality, and include other types of software tests. Generally, ourbenchmarking framework provides the following benefits compared to the SUPERCOPand BRUTUS framework.

� Very simple with only focus on comparison of AE schemes.� Due to its simplicity open for extension of other software performance tests.� Optimised TSC instruction (i.e. rdtscp) for accurate timing measurements.� Reduction of noise in the measurement using single user mode, averaging and median.� Benchmarking, review, illustration and comparison of results.

Using our framework, we built and benchmarked the 2nd round candidates (see Table 1)of the ongoing CAESAR competition. By July 2016, this includes the implementations of29 candidates. Many of these candidates have multiple parameter choices, extending thenumber to 125 implementations. Overall, we have considered several software performanceoptimised implementations, raising the total number of benchmarked implementations to232‡.

3.1 Measurement SetupFor our measurements we had two different measurement setups. The first measurementswere taken on a 64-bit Sandy Bridge 2.3GHz dual-core Intel Core i5-2415M processor.The underlying hardware was a Mac Book Pro Early 2011 running OS X 10.11 (ElCapitan) in single user mode. The second measurements were taken on a 64-bit Skylake‡Number of implementations supported by our benchmarking hardware.

Page 5: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 5

Table 3: Real-world use case settings for our benchmarking.Message Size Associated Data Size Comments

1 byte 5 byte one keystroke (e.g. SSH)16 bytes 5 byte small payload557 byte 5 byte average IP packet size∗1.5 kB 5 byte ethernet MTU, TLS16 kB 5 byte max TCP packet size1 MB 5 byte file upload∗ http://slaptijack.com/networking/average-ip-packet-size

2.4GHz dual-core Intel Core i5-6300U processor. The underlying hardware was a DELLLatitude E7470 running Ubuntu 16.04 in single user mode. The implementations undertest were built using the clang compiler version 6.1.0 (clang-602.0.53) and the gcc compilerversion 5.4.0 (5.4.0-6ubuntu1-16.04.2). Additionally, we enabled the following compilerflags for compile time optimisation of the underlying source code: -Ofast -fno-stack-protector -march=native. In order to get rid of the noise(e.g. context switches) duringthe measurement and to get a clean and stable result in an uninterrupted environment,we choose to boot in the single user mode, which just starts the core operating systemand provides a sh command interface, without starting other core elements of the OS Xoperating system (e.g. network services, multi-user support, graphical user interface) thatmay interfere with our benchmarking. Furthermore, we used the same method as describedin the paper from Krovetz and Rogaway [KR11]. In their approach, they determine theperformance of the underlying function as the median of 91 averaged timings of 200measurements each.

3.2 High Resolution Methods for CPU Timing Information

In general, there are several ways to determine the software performance of a functionunder test. Each operating system offers several methods of accurate timing sources to beused to determine the runtime (e.g. latency) or throughput of a function. Using multi-coreprocessors, one must be aware of handling hyper-threading and over-clocking of the CPUduring their benchmarking, to avoid distorted or falsified measurements.

For the performance of high precision timing measurements on Windows platforms, onecould use the HPET (i.e. High Precision Event Timer) or the QueryPerformanceCounter.On Unix based platforms one can use the posix conform time() and clock() functions,as used in the BRUTUS framework. However, a much more accurate timing source onx86 processors is the Timer Stamp Counter (TSC) as used in the SUPERCOP framework.The TSC is a 64 bit register and counts the number of cycles since the last reset of themachine. Using the RDTSC assembler instruction, one can read out the current TSC valueand use it for accurate timing measurements.

In newer processor hardware, Intel processors practice out-of-order execution, whereinstructions are not always performed in the order as they appear in the source code. Thiscould lead to a skewed measurement result. To avoid such errors, the CPUID instruction,a serialisation instruction, can be inserted, which forces every preceding instruction tocomplete before the execution of the RDTSC instruction. In our framework, we make use ofthe RDTSCP [Pao10] instruction, which is an optimised versions of the combination CPUID+ RDTSC.

Page 6: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

6 Software Benchmarking of the 2nd round CAESAR Candidates

Table 4: Most common optimisations for the CAESAR candidatesInstruction Set Vendor Microarchitecture CAESAR Candidate

AES-NI Intel Westmere OCB, OTR, AEGIS, CLOC, SILC, POET,JAMBU, AEZ, Deoxys, PAEQ, Tiaoxin, TriviA-ck

SSE Intel Pentium III OCB, CLOC, Keyak, Morus, OMD,Scream, Stribob, Tiaoxin, TriviA-ck

AVX Intel Sandy Bridge NORX, OMDAVX2 Intel Haswell Keyak, Minalpher, Morus, NORXNEON ARM ARMv7-A OCB, NORX, Scream, Stribob

3.3 Authenticated Encryption Benchmarking Settings and Real-WorldUsecases

In our benchmarking, we implemented measurements on a continuous function (i.e. wherewe vary the message size and the associated data size) and over some fixed message andassociated data sizes (i.e. representing some real-world usecases).

Overall, we consider message size from 0 to 1 MB and associated data size from 0 to 2KB. The sizes of the key and nonce are fixed-size values, as determined by the CAESARspecification. In the case of the continuous function we consider message and associateddata size from 0 to 2048 bytes in 128 byte steps. On the other hand, for the timingmeasurements with fixed message and associated data size, we consider the values inTable 3.

The associated data is fixed to 5 bytes§ for all measurements, which represents thetypical size of the header of a TLS packet. Additionally, for better comparability, werepresent the performance results of the authenticated encryption candidates in cycle perbyte (cbp), which is a function of the throughput of the ciphers, instead of releasing thetimings (i.e. latency) of the candidates.

4 Software OptimisationsFor some ciphers software-optimised implementations are provided. These optimisationsare achieved by the usage of architecture-specific instructions and intrinsics. An intrinsicfunction is handled in a special manner by the compiler where the intrinsic function mapsto some specific assembler instructions. Table 4 gives an overview of the mainly usedinstruction set extensions in the 2nd round candidates. Furthermore, in the followingwe give a short overview of the most common used instruction set extensions to achievesoftware optimised algorithms.

4.1 AES-NI.As many CAESAR candidates are AES based, they can obtain an increase in softwareperformance by using the AES New Instructions [Gue12]. These set of instructions areavailable beginning with the 2010 Intel processors based on the 32nm microarchitectureWestmere. The instruction set consists of six instructions that offer full hardware supportfor AES enabling fast and secure encryption and decryption. The AES New Instructionsincrease the performance by more than an order of magnitude for parallel modes (e.g. CTRmode), and provide roughly 2-3 fold gains for non-parallelisable modes (e.g. CBC mode).On a Westmere microarchitecture the AES-NI instructions can run with approximately1.3 cycles per byte in parallel mode of operation.

§http://netsekure.org/2010/03/tls-overhead

Page 7: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 7

4.2 SSE.The Streaming SIMD Extensions [Sie09] contain around 70 instructions for high perfor-mance optimised implementations. The SSE instructions are available since Pentium IIIand the latest update SSE4.2 is available with the Nehalem microarchitecture. The exten-sions provide up to 16 128-bit registers (xmm0-15) and provide vector-mode operations,enabling parallel execution of one operation on multiple data. The speedup that can beachieved with the SSE instructions depends on whether the data can be parallelised, asoperations are used in a vector-mode where the same operation is executed on multipledata.

4.3 AVX.The Advanced Vector Extensions [Lom11] were introduced with the Sandy Bridge mi-croarchitecture and extend the SSE instructions with a new set of instructions and a newcoding scheme. The AVX1 instructions extend the previous xmm0-15 128-bit registers to16 256-bit registers (ymm0-15). Furthermore, memory alignments have been relaxed andthree-operand non-destructive operations have been added. Previously, with two-operandcommands the source operand was always overwritten by the instruction (e.g. A = A+ B). With the new three-operand instructions (e.g. A = B + C) an additional movinstruction to secure the source-value is not necessary. Additionally, the AVX2 [Bux11]instructions are an advancement of the AVX instructions supporting integer datatypes andadding vector shift operations. The new instructions were introduced with the Haswellmicroarchitecture.

4.4 NEON.The implementation of the Advanced SIMD extensions used in ARM processors is calledNEON [neo] and is available with the Cortex-A microarchitecture. ARM processors areused in many mobile devices such as tablets and mobile phones. The NEON instructionsare 128-bit SIMD extensions providing a flexible and powerful acceleration as they performpackaged SIMD processing. Registers are therefore considered as vectors of elements withthe same data type and perform the same operation on all lanes.

5 ResultsIn the following, we present the results of the benchmarking of the 2nd round CAESARcandidates using our previously described framework. In this section, we illustrate ageneral overview of the best currently available implementations (i.e. software optimisedversions, where applicable). In the appendices, we give a more detailed overview for eachcandidate. These illustrations comprise a 2d plot of any software implementation (i.e.several software optimised versions, where applicable), a 3d plot and a bar plot supportingour measurement settings for the best software optimised version of said candidate.

Figure 1 and Figure 2 gives a comparison of the best implementations for each CEASARcandidate on SandyBridge and Skylake processor, respectively. These graphs serve as anoverview of the current best implementation with regards to latency. Moreover, Figure 7illustrates a comparison between the best block cipher based candidates; Figure 8 acomparison of stream cipher based; Figure 9 a comparison of sponge based; Figure 10 acomparison of permutation based; and Figure 11 a comparison of compression functionbased candidates. In all graphs, we include a measurement of GCM as a baseline —which software optimised versions of the candidates should aim to reach or surpass. Todemonstrate the impact of a CAESAR candidate in a real-world setting, we also madebenchmarks in the following settings: Figure 3 compares the best implementation per

Page 8: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

8 Software Benchmarking of the 2nd round CAESAR Candidates

CAESAR candidate for a TLS setting on the Skylake architecture, while Figure 4 showsthe same setting on the SandyBridge architecture. We chose the TLS setting, becausewe believe it is the most common usecase for authenticated encryption. Therefore, weset the associated data length to 5 bytes¶ which represents the typical length of a TLSheader. The message length is set to 1500 bytes, which corresponds to the MTU sizeof an ethernet frame. Moreover, we also made benchmarks for the SSH setting. In thissetting, we fix the associated data length to 5 bytes and the message length to 1 byte (i.e.one key stroke). Figure 5 compares the best implementation per CAESAR candidate forthe SSH setting on the Skylake architecture, and Figure 6 shows the same setting on theSandyBridge architecture. Furthermore, we created benchmarks for all candidates withincreasing message/associated data lengths given in the appendices. Additionally, 3D plotsare provided for the best implementation of each cipher, in order to illustrate the evolutionin performance when the message length and the associated data length increase. We givedetailed illustrations of these benchmarks in the appendices.

The benchmarking framework and our scripts to visualise the results are availableonline‖ for reviewing purposes and to encourage other designers and researchers to use ourframework for benchmarking their cryptographic schemes.

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 20480.510.710.92

1.361.75

2.47

4.21

5.91

9

13.14

20.53

31.11

47.47

82.4

116.3

272.37

1059.18

1590.87

6611.66

Message length (bytes)

Pe

rfo

rma

nce

(cp

b)

acorn128v2_opt

aeadaes256ocbtaglen128v1_opt

aegis128l_aesnic

aes128n8t8clocv2_aesni

aes128n8t8silcv2_aesni

aes128otrpv3_nip7m1

aes128otrsv2_ref

aes128poetv2aes4ls0lt0_ni

aescopav2_ref

aesjambuv2_aesni

aezv4_aesni

ascon128av11_opt64

deoxysneq256128v13_aesni

elmd600v2_ref

hs1sivlov1_ref

icepole128av2_ref

joltikneq6464v13_ref

ketjesrv1_reference

lakekeyakv2_generic64lc

minalpherv11_ref

morus640128v1_sse2

norx6441_xmm

omdsha512k512n256tau256v2_avx1

paeq64_aesni

pi64cipher128v2_ref

primatesv1gibbon80_ref

scream10v3_sse

shellaes128v2d4n80_ref

stribob192r2_ssse3

tiaoxinv2_nim

trivia0v2_ref

Figure 1: Comparison of the best implementation (on the SandyBridge processor)for each CAESAR candidate. The associated data is fixed to 0 bytes.

¶http://netsekure.org/2010/03/tls-overhead‖https://github.com/TheBananaMan/caesar_benchmarks_secondround

Page 9: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 9

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 20480.29

0.470.63

0.94

1.5

2.53

3.69

5.09

8.25

11.42

17.61

25.16

36.93

61.14

128.46

482.85609.55

1567.98

Message length (bytes)

Pe

rfo

rma

nce

(cp

b)

acorn128v2_optaeadaes256ocbtaglen128v1_optaegis128l_aesnibaes128gcmv1_openssl

aes128n8t8clocv2_aesniaes128n8t8silcv2_aesniaes128otrpv3_nip7m1aescopav2_refaesjambuv2_aesni

aezv4_aesniascon128av11_opt64deoxysneq256128v13_aesnielmd600v2_ref

hs1sivlov1_reficepole128av2_refjoltikneq6464v13_refketjesrv1_referenceminalpherv11_ref

morus1280256v1_avx2norx6441_ymmomdsha512k512n256tau256v2_avx1paeq64_aesni

pi64cipher256v2_goptvpoetv2aes4_niprimatesv1gibbon80_refscream10v3_sseseakeyakv2_SandyBridge

shellaes128v2d4n80_refstribob192r2_ssse3tiaoxinv2_nimtrivia0v2_sse4

Figure 2: Comparison of the best implementation (on the SkyLake processor) foreach CAESAR candidate. The associated data is fixed to 0 bytes.

100

101

102

103

104

aegis128l_aesnic

tiaoxinv2_nim

aezv4_aesni

morus640128v1_sse2

aeadaes128ocbtaglen128v1_opt

aes128poetv2aes4ls0lt0_ni

deoxysneq256128v13_aesni

norx3241_xmm

aes128n8t8clocv2_aesni

aes128n8t8silcv2_aesni

paeq64_aesni

lakekeyakv2_generic64lc

aesjambuv2_aesni

hs1sivlov1_ref

acorn128v2_opt

ascon128av11_opt64

scream10v3_sse

omdsha512k128n128tau128v2_avx1

aes128otrsv2_ref

stribob192r2_ssse3

icepole128av2_ref

pi64cipher128v2_ref

trivia128v2_ref

elmd600v2_ref

shellaes128v2d4n64_ref

ketjejrv1_compact

aescopav2_ref

minalpherv11_ref

joltikneq6464v13_ref

primatesv1gibbon80_ref

Performance (cpb)

0.62

0.74

1.41

1.53

2.45

3.58

3.59

4.93

5.57

5.7

6.69

7.42

10.72

11.79

13.74

14.68

19.67

21.28

23.35

26.56

26.93

27.6

46.83

59.19

79.6

175.51

215.16

1115.12

1185.21

5894.83

Figure 3: Comparison of 2nd round CAESAR candidates in the TLS setting (onthe SandyBridge processor).

Page 10: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

10 Software Benchmarking of the 2nd round CAESAR Candidates

100

101

102

103

tiaoxinv2_nim

aegis128l_aesnic

aezv4_aesni

aes128otrpv3_nip7m2

morus1280256v1_avx2

aeadaes128ocbtaglen128v1_opt

deoxysneq128128v13_aesni

poetv2aes4_ni

aes128gcmv1_openssl

norx6441_ymm

aes128n12t8clocv2_aesni

aes128n8t8silcv2_aesni

lakekeyakv2_generic64

paeq80_aesni

ascon128av11_opt64

aesjambuv2_aesni

hs1sivlov1_ref

pi64cipher128v2_goptv

trivia0v2_sse4

acorn128v2_opt

scream10v3_sse

icepole128av2_ref

omdsha512k128n128tau128v2_sse4

stribob192r2_ssse3

ketjesrv1_reference

shellaes128v2d8n80_ref

aescopav2_ref

minalpherv11_ref

primatesv1gibbon80_ref

joltikeq12864v13_ref

Performance (cpb)

0.38

0.4

0.68

0.82

1.07

1.26

1.37

1.6

1.8

2.43

2.57

2.62

4.12

4.56

6.25

6.31

7.11

7.14

9.01

9.05

9.26

9.45

18.05

22.47

36.76

37.31

105.68

543.38

1309.54

1757.25

Figure 4: Comparison of 2nd round CAESAR candidates in the TLS setting (onthe SkyLake processor).

100

101

102

103

104

aegis128_aesni

tiaoxinv2_nim

aezv4_aesni

aes128n8t8clocv2_aesni

aesjambuv2_aesni

morus640128v1_sse2

aes128n8t8silcv2_aesni

ascon128av11_opt64

paeq128_aesni

norx3241_xmm

aes128poetv2aes4ls0lt0_ni

aeadaes128ocbtaglen128v1_opt

deoxysneq256128v13_aesni

lakekeyakv2_generic64lc

aes128otrsv2_ref

hs1sivlov1_ref

stribob192r2_ssse3

trivia128v2_ref

scream10v3_ref

elmd600v2_ref

acorn128v2_opt

omdsha256k192n104tau128v2_avx1

pi32cipher256v2_ref

ketjejrv1_compact

icepole128av2_ref

shellaes128v2d4n64_ref

aescopav2_ref

joltikneq6464v13_ref

minalpherv11_ref

primatesv1hanuman80_ref

Performance (cpb)

47.98

79.05

89.42

91.44

94.82

96.11

126.73

166.5

175.2

182.46

186.21

320.42

381.65

425.78

433.06

478.97

516.84

596.4

652.64

743.39

812.69

850.87

925.52

1396.83

1575.73

2033.89

2097.23

4717.89

7219.62

18828.45

Figure 5: Comparison of 2nd round CAESAR candidates in the SSH setting (onthe SandyBridge processor).

Page 11: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 11

100

101

102

103

104

aegis128l_aesnic

aezv4_aesni

tiaoxinv2_nim

aes128otrpv3_nip7m2

aesjambuv2_aesni

aes128n12t8clocv2_aesni

ascon128av11_opt64

aes128n8t8silcv2_aesni

morus1280256v1_avx2

deoxysneq128128v13_aesni

poetv2aes4_ni

paeq80_aesni

norx6441_ymm

aeadaes128ocbtaglen128v1_opt

aes128gcmv1_openssl

lakekeyakv2_generic64

hs1sivlov1_ref

trivia0v2_sse4

acorn128v2_opt

stribob192r2_ssse3

ketjesrv1_reference

pi64cipher128v2_goptv

icepole128av2_ref

scream10v3_sse

aescopav2_ref

omdsha512k128n128tau128v2_sse4

shellaes128v2d8n80_ref

minalpherv11_ref

joltikeq12864v13_ref

primatesv1gibbon80_ref

Performance (cpb)

44.82

49.46

50.06

54.47

61.53

67.45

71.4

80.5

100.4

101.41

130

144.74

158.96

191.06

213.41

235.17

256.33

257.01

391.85

440.1

475.58

489.86

646.63

712.97

954.52

1121.2

1267.94

3879.76

6969.07

7487.21

Figure 6: Comparison of 2nd round CAESAR candidates in the SSH setting (onthe SkyLake processor).

6 ConclusionIn this paper, we present a new benchmarking framework to perform software performancemeasurements of authenticated encryption schemes. We apply our framework to benchmarkthe 2nd round CAESAR candidates. Software performance is an important factor inreal-world applications. Our benchmarks show that candidates supported by a software-optimised implementation achieve a huge speedup with regards to others, without anoptimised implementation. The aim of the CAESAR competition is to provide a portfolioof authenticated encryption algorithms, including implementations optimised for software(and hardware), as in the eStream portfolio [est08]. Currently, nearly two thirds of thecandidates are supported by one or more software optimisations. Nevertheless, withour benchmarks we want to encourage designers to increase the number of optimisedimplementations.

AcknowledgementsWe would like to thank the designers of the CAESAR candidates, which supported us withtheir software implementations. The first author is supported by the Marie Sklodowska-Curie ITN ECRYPT-NET grant (Project Reference 643161).

References[ABB+14] Elena Andreeva, Begül Bilgin, Andrey Bogdanov, Atul Luykx, Florian Mendel,

Bart Mennink, Nicky Mouha, Quingju Wang, and Kan Yasuda. PRIMATEsv1.02. https://competitions.cr.yp.to/round2/primatesv102.pdf, 2014.

Page 12: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

12 Software Benchmarking of the 2nd round CAESAR Candidates

[ABL+15] Elena Andreeva, Andrey Bogdanov, Atul Luykx, Bart Mennink, El-mar Tischhauser, and Kan Yasuda. AES-COPA v.2. https://competitions.cr.yp.to/round2/aescopav2.pdf, 2015.

[AFF+15] Farzaneh Abed, Scott R. Fluhrer, John Foley, Christian Forler, Eik List,Stefan Lucks, David McGrew, and Jakob Wenzel. The POET Family of On-Line Authenticated Encryption Schemes. https://competitions.cr.yp.to/round2/poetv20.pdf, 2015.

[AFL14] Farzaneh Abed, Christian Forler, and Stefan Lucks. General Overview of theAuthenticated Schemes for the First Round of the CAESAR Competition.Cryptology ePrint Archive, Report 2014/792, 2014.

[AJN15] Jean-Philippe Aumasson, Philipp Jovanovic, and Samuel Neves. NORX v2.0.https://competitions.cr.yp.to/round2/norxv20.pdf, 2015.

[BDP+14] Guido Bertoni, Joan Daemen, Michaël Peeters, Gilles Van Assche, andRonny Van Keer. CAESAR submission: KETJE v1. https://competitions.cr.yp.to/round1/ketjev1.pdf, 2014.

[BDP+15] Guido Bertoni, Joan Daemen, Michaël Peeters, Gilles Van Assche, andRonny Van Keer. CAESAR submission: KEYAK v2. https://competitions.cr.yp.to/round2/keyakv21.pdf, 2015.

[Ber14] Daniel J. Bernstein. Caesar: Competition for authenticated encryption:Security, applicability, and robustness. https://competitions.cr.yp.to/caesar.html, 2014.

[Ber16] Daniel J. Bernstein. Supercop. https://bench.cr.yp.to/supercop.html,2016.

[BK14] Alex Biryukov and Dimitry Khovratovich. PAEQ v1. https://competitions.cr.yp.to/round1/paeqv1.pdf, 2014.

[Bux11] Mark Buxton. Intel R© advanced vector extensions programming ref-erence. https://software.intel.com/en-us/blogs/2011/06/13/haswell-new-instruction-descriptions-now-available, 2011.

[CHVV03] Brice Canvel, Alain Hiltgen, Serge Vaudenay, and Martin Vuagnoux. PasswordInterception in a SSL/TLS Channel, pages 583–599. Springer, 2003.

[CM15] Avik Chakraborti and Nandi Mridul. TriviA-ck-v2. https://competitions.cr.yp.to/round2/triviackv2.pdf, 2015.

[CMN+15] Simon Cogliani, Diana Maimu�k, David Naccache, Rodrigo Portella, RezaReyhanitabar, Serge Vaudenay, and Damian Vizár. Offset Merkle-Damgard(OMD) version 2.0. https://competitions.cr.yp.to/round2/omdv20.pdf,2015.

[DEMS15] Christoph Dobraunig, Maria Eichlseder, Florian Mendel, and Martin Schläffer.ASCON v1.1. https://competitions.cr.yp.to/round2/asconv11.pdf, 2015.

[DN15] Nilanjan Datta and Mridul Nandi. ELmD v2.0. https://competitions.cr.yp.to/round2/elmdv20.pdf, 2015.

[est08] estream: the ecrypt stream cipher project. http://www.ecrypt.eu.org/stream/, 2008.

Page 13: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 13

[GLS+15] Vincent Grosso, Gaëtan Leurent, François-Xavier Standaert, Kerem Varici,Anthony Journault, François Durvaux, Lubos Gaspar, and Stephanie Kerckhof.SCREAM: Side-Channel Resistant Authenticated Encryption with Masking.https://competitions.cr.yp.to/round2/screamv3.pdf, 2015.

[GMS+15] Danilo Gligoroski, Hristina Mihajloska, Simona Samardjiska, Håkon Jacobsen,Mohamed El-Hadedy, Rune Erlend Jensen, and Daniel Otte. π-cipher v2.0.https://competitions.cr.yp.to/round2/picipherv20.pdf, 2015.

[Gue12] Shay Gueron. Intel R© advanced encryption standard (aes) new instructions set.Technical report, Intel Corporation, 2012.

[HDF+15] Ekawat Homsirikamol, William Diehl, Ahmed Ferozpuri, Farnoud Farahmand,Malik Umar Sharif, and Kris Gaj. GMU Hardware API for AuthenticatedCiphers. Cryptology ePrint Archive, Report 2015/669, 2015.

[HKR15] Viet Tung Hoang, Ted Krovetz, and Phillip Rogaway. AEZ v4.1: Authenti-cated Encryption by Enciphering. https://competitions.cr.yp.to/round2/aezv41.pdf, 2015.

[HW15] Tao Huang and Hongjun Wu. The Authenticated Cipher MORUS (v1). https://competitions.cr.yp.to/round2/morusv11.pdf, 2015.

[IMG+15] Tetsu Iwata, Kazuhiko Minematsu, Jian Guo, Sumio Morioka, and EitaKobayashi. SILC: SImple Lightweight CFB. https://competitions.cr.yp.to/round2/silcv2.pdf, 2015.

[IMGM15] Tetsu Iwata, Kazuhiko Minematsu, Jian Guo, and Sumio Morioka. CLOC:Compact Low-Overhead CFB. https://competitions.cr.yp.to/round2/clocv2.pdf, 2015.

[JNP15a] Jérémy Jean, Ivica Nikolić, and Thomas Peyrin. Deoxys v1.3. https://competitions.cr.yp.to/round2/deoxysv13.pdf, 2015.

[JNP15b] Jérémy Jean, Ivica Nikolić, and Thomas Peyrin. Joltik v1.3. https://competitions.cr.yp.to/round2/joltikv13.pdf, 2015.

[KR11] Ted Krovetz and Phillip Rogaway. The Software Performance of Authenticated-Encryption Modes, pages 306–327. Springer, 2011.

[KR14] Ted Krovetz and Phillip Rogaway. OCB (v1). https://competitions.cr.yp.to/round1/ocbv1.pdf, 2014.

[Kro15] Ted Krovetz. HS1-SIV (v2). https://competitions.cr.yp.to/round2/hs1sivv2.pdf, 2015.

[Lom11] Chris Lomont. Introduction to intel R© advanced vector exten-sions. https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions, 2011.

[MGH+15] Paweł Morawiecki, Kris Gaj, Ekawat Homsirikamol, Krystian Matusiewicz,Josef Pieprzyk, Marcin Rogawski, Marian Srebrny, and Marcin Wójcik. ICE-POLE v2. https://competitions.cr.yp.to/round2/icepolev2.pdf, 2015.

[Min16] Kazuhiko Minematsu. AES-OTR v3. https://competitions.cr.yp.to/round2/aesotrv3.pdf, 2016.

[neo] Neon.

Page 14: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

14 Software Benchmarking of the 2nd round CAESAR Candidates

[Nik15] Ivica Nikolić. Tiaoxin-346. https://competitions.cr.yp.to/round2/tiaoxinv2.pdf, 2015.

[OSB15] Markku-Juhani O. Sarrinen and Billy B. Brumley. STRIBOBr2: "WHIRLBOB".https://competitions.cr.yp.to/round2/stribobr2.pdf, 2015.

[Pao10] Gabriele Paoloni. How to benchmark code execution times on intel R© ia-32 andia-64 instruction set architectures. Technical report, Intel Corporation, 2010.

[Saa16] Markku-Juhani Saarinen. The BRUTUS automatic cryptanalytic framework.Journal of Cryptographic Engineering, 6(1):75–82, 2016.

[Sie09] Sam Siewert. Using Intel R© Streaming SIMD Extensions andIntel R© Integrated Performance Primitives to Accelerate Algorithms.https://software.intel.com/en-us/articles/using-intel-streaming-simd-extensions-and-intel-integrated-performance-primitives-to-accelerate-algorithms, 2009.

[STA+15] Yu Sasaki, Yosuke Todo, Kazumaro Aoki, Yusuke Naito, Takeshi Sugawara,Yumiko Murakami, Mitsuru Matsui, and Shoichi Hirose. Minalpher v1.1.https://competitions.cr.yp.to/round2/minalpherv11.pdf, 2015.

[Vau02] Serge Vaudenay. Security Flaws Induced by CBC Padding — Applications toSSL, IPSEC, WTLS..., pages 534–545. Springer, 2002.

[Wan15] Lei Wang. SHELL v2.0. https://competitions.cr.yp.to/round2/shellv20.pdf, 2015.

[WH15] Hongjun Wu and Tao Huang. The JAMBU Lightweight Authentication Encryp-tion Mode (v2). https://competitions.cr.yp.to/round2/aesjambuv2.pdf,2015.

[WP14] Hongjun Wu and Bart Preneel. AEGIS: A Fast Authenticated Encryption Al-gorithm (v1). https://competitions.cr.yp.to/round1/aegisv1.pdf, 2014.

[Wu15] Hongjun Wu. ACORN: a Lightweight Authenticated Cipher (vs). https://competitions.cr.yp.to/round2/acornv2.pdf, 2015.

Page 15: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 15

A Comparison of Candidates per Type

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 20480.29

0.47

0.63

0.94

1.5

2.01

2.8

7.25

10.07

36.93

61.14

104.93

510.99

Message length (bytes)

Perf

orm

ance (

cpb)

aeadaes256ocbtaglen128v1_opt

aegis128l_aesnib

aes128n8t8clocv2_aesni

aes128n8t8silcv2_aesni

aes128otrpv3_nip7m1

aescopav2_ref

aesjambuv2_aesni

aezv4_aesni

deoxysneq256128v13_aesni

elmd600v2_ref

joltikneq6464v13_ref

poetv2aes4_ni

scream10v3_sse

shellaes128v2d4n80_ref

tiaoxinv2_nim

Figure 7: Comparison of Block cipher based 2nd round CAESAR candidates

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 20480.93

1.35

6.95

8.64

10.48

25.16

Message length (bytes)

Perf

orm

ance (

cpb)

acorn128v2_opt

hs1sivlov1_ref

morus1280256v1_avx2

trivia0v2_sse4

Figure 8: Comparison of Stream cipher based 2nd round CAESAR candidates

Page 16: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

16 Software Benchmarking of the 2nd round CAESAR Candidates

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 20482.23

2.87

3.69

5.83

8.3

11.42

22.38

32.4638.5

56.84

1309.31567.98

Message length (bytes)

Perf

orm

ance (

cpb)

ascon128av11_opt64

icepole128av2_ref

ketjesrv1_reference

norx6441_ymmpi64cipher256v2_goptv

primatesv1gibbon80_ref

seakeyakv2_SandyBridge

stribob192r2_ssse3

Figure 9: Comparison of Sponge based 2nd round CAESAR candidates

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 20484.255.09

531.83609.55

Message length (bytes)

Perf

orm

ance (

cpb)

minalpherv11_ref

paeq64_aesni

Figure 10: Comparison of Permutation based 2nd round CAESAR candidates

Page 17: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 17

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

16.9

21.65

60.24

Message length (bytes)

Perf

orm

ance (

cpb)

omdsha512k512n256tau256v2_avx1

Figure 11: Comparison of Compression function based 2nd round CAESARcandidates

Page 18: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

18 Software Benchmarking of the 2nd round CAESAR Candidates

B Comparison of all Implementations per Candidate

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

8.6410.48

25.16

400.75489.48

1313.5

Perf

orm

ance (

cpb)

Figure 12: acorn128v2_opt (blue), acorn128v2_ref (orange)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

0.31

0.410.5

0.64

1.45

5.92

7.68

9.88

17.86

25.84

Perf

orm

ance (

cpb)

Figure 13: aegis128_aesni(blue), aegis128_ref (orange), aegis128l_aesnia (yel-low), aegis128l_aesnib (brown), aegis128l_aesnic (green), aegis128l_ref (lightblue), aegis256_aesni (purple), aegis256_ref (darkblue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

0.62

0.79

2.01

49.32

66.33

Perf

orm

ance (

cpb)

Figure 14: aezv4_aesni (blue), aezv4_ref (orange)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

6.21

8.61

9.32

10.3710.86

14.04

17.06

20.27

Perf

orm

ance (

cpb)

Figure 15: ascon128av11_opt64 (blue), ascon128av11_ref (orange), as-con128v11_opt64 (yellow), ascon128v11_ref (purple)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

2.52.8

5.18

14.3

24.92

232.82

Perf

orm

ance (

cpb)

Figure 16: aes128n12t8clocv2_aesni (blue), aes128n12t8clocv2_ref (or-ange), aes128n8t8clocv2_aesni (yellow), aes128n8t8clocv2_ref (purple),twine80n6t4clocv2_ref (green), twine80n6t4clocv2_vperm (light blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

104.48107.35

128.46

Perf

orm

ance (

cpb)

Figure 17: aescopav2_ref(blue)

Page 19: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 19

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

0.91

1.31

124.1

191.63

287.61379

444.16

Perf

orm

ance (

cpb)

Figure 18: deoxyseq128128v13_ref (blue), deoxyseq256128v13_ref (orange),deoxysneq128128v13_aesni (yellow), deoxysneq128128v13_ref (purple), de-oxysneq256128v13_aesni (light blue), deoxysneq256128v13_ref (cyan)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

58.159.76

64.02

75.09

82.67

110.38

Perf

orm

ance (

cpb)

Figure 19: elmd1000v2_ref (blue), elmd1001v2_ref (green),elmd101270v2_ref(yellow), elmd101271v2_ref (purple), elmd600v2_ref(green), elmd601v2_ref (cyan doted), elmd61279v2_ref (magneta),elmd61271v2_ref (purple dotted)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

6.95

8.258.86

10.32

12.18

15.01

41.76

Perf

orm

ance (

cpb)

Figure 20: hs1sivhiv1_ref (blue), hs1sivlov1_ref (orange), hs1sivv1_ref (yel-low)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

8.79

9.57

12.31

32.46

37.67

Perf

orm

ance (

cpb)

Figure 21: icepole128av2_ref (blue), icepole128v2_ref (orange), ice-pole256av2_ref (yellow)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

7.21

9.99

17.74

22.95

48.26

78.82

99.3

Perf

orm

ance (

cpb)

Figure 22: aesjambuv2_aesni (blue), aesjambuv2_ref (orange), si-monjambu128v2_ref (yellow), simonjambu64v2_ref (purple), simon-jambu96v2_ref (green)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

482.39

872.78

976.93

1749.99

1904.17

Perf

orm

ance (

cpb)

Figure 23: joltikeq12864v13_ref (light green), joltikeq6464v13_ref(orange), joltikeq80112v13_ref (yellow), joltikeq9696v13_ref (pur-ple), joltikneq12864v13_ref (green), joltikneq6464v13_ref (light blue),joltikneq80112v13_ref (magenta), joltikneq9696v13_ref (blue)

Page 20: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

20 Software Benchmarking of the 2nd round CAESAR Candidates

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

36.2238.89

80.12

91.32

114.57

Perf

orm

ance (

cpb)

Figure 24: ketjejrv1_compact (blue), ketjejrv1_reference (orange), ket-jesrv1_compact (yellow), ketjesrv1_reference (purple)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

4.04

5.055.98

11.49

14.27

29.29

146.8

320.04

Perf

orm

ance (

cpb)

Figure 25: lakekeyakv2_reference32bit (magenta), lakekeyakv2_compact(red), lakekeyakv2_generic32 (yellow), lakekeyakv2_generic32lc (purple),lakekayakv2_generic64 (green), lakekeyakv2_generic64lc (light blue),lakekeyakv2_SandyBridge (dark blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

8.51

9.4310.08

11.14

13.08

14.46

20.9922.86

27.0429.5

42.64

Perf

orm

ance (

cpb)

Figure 26: riverkeyakv2_compact (orange), riverkeyakv2_generic32 (yel-low), riverkeyak_generic32lc (purple), riverkeyakv2_generic64 (purple),riverkeyakv2_generic64lc (light blue), riverkeyakv2_reference (brown),riverkeyakv2_reference32bits (dark blue), riverkeyakv2_SandyBridge (blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

3.58

5.29

8.27

14.54

23.57

86.74

178.1

254.38

836.19

Perf

orm

ance (

cpb)

Figure 27: seakeyakv2_reference32bits (magenta), seakeyakv2_compact(orange), seakeyakv2_generic32 (yellow), seakeyakv2_generic32lc (pur-ple), seakeyakv2_generic64 (green), seakeyakv2_generic64lc (light blue),seakeyakv2_SandyBridge (blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

9.1311.1214.8619.8826.07

38.94

67.43

118.69

450.79

772.37

4371.63

Perf

orm

ance (

cpb)

Figure 28: lunarkeyakv2_reference32bits (magenta), lunarkeyakv2_compact(orange), lunarkeyakv2_generic32 (yellow), lunarkeyakv2_generic32lc (pur-ple), lunarkeyav2_generic64 (green), lunarkeyakv2_generic64lc (light blue),lunarkeyakv2_SandyBridge (blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

4.92

7.599.84

13.22

20.64

34.92

47.25

175.09

243.92

395.88

1682.93

Perf

orm

ance (

cpb)

Figure 29: oceankeyakv2_reference32bits (magenta), oceankeyakv2_compact(orange), oceankeyakv2_generic32 (yellow), oceankeyakv2_generic32lc (pur-ple), oceankeyakv2_generic64 (green), oceankeyakv2_generic64lc (light blue),oceankeyakv2_SandyBridge (blue)

Page 21: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 21

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

531.83

555.65

609.55

Perf

orm

ance (

cpb)

Figure 30: minalpherv11_ref (blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

0.93

1.26

1.46

1.68

1.992.24

2.55

3.51

4.04

4.76

7.69

9.02

Perf

orm

ance (

cpb)

Figure 31: morus640128v1_ref (orange), morus1280256v1_ref(light blue), morus1280128v1_avx2 (red), morus1280128v1_ref64(brown), morus1280256v1_ref64 (brown), morus1280128v1_ref(brown), morus1280256v1_avx2 (green dotted), morus1280256v1_sse2(blue),morus6400128v1_sse2 (purple),morus1280128v1_sse2 (yellow)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

2.23

2.87

3.744.28

5.56

6.777.96

9.8511.29

20.79

60.8472

Perf

orm

ance (

cpb)

Figure 32: norx0841_ref (blue), norx1641_ref (orange), norx3241_ref(black), norx3241_xmm (purple), norx3261_ref (green), norx3261_xmm(pink), norx6441_ref (magenta), norx6441_ymm (brown), norx6441_xmm(light green), norx6444_ref (yellow), norx6461_ref (purple), norx6461_xmm(cyan dotted), norx6461_ymm (light blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

1

1.22

1.772.09

7.85

28.4832.2936.15

40.445.8851.82

Perf

orm

ance (

cpb)

Figure 33: aes128ocbtaglen128v1_opt (black), aes128ocbtaglen128v1_ref(purple), aes128ocbtaglen64v1_ref (purple), aes128ocbtaglen96v1_ref(purple), aes192ocbtaglen128v1_opt (yellow), aes192ocbtaglen128v1_ref(blue), aes192ocbtaglen64v1_ref (blue), aes192ocbtaglen96v1_ref (blue),aes256ocbtaglen128v1_opt (red), aes256ocbtaglen128v1_ref (green),aes256ocbtaglen64v1_ref (green), aes256ocbtaglen96v1_ref (green)

128 256 384 512 640 768 896 1024 1152

Message length (bytes)

0.680.79

1

2.56

13.34

20.47

Perf

orm

ance (

cpb)

Figure 34: aes128otrpv3_ref (green), aes128otrsv3_ref (yellow),aes128otrpv3_nip7m1 (blue), aes128otrpv3_nip7m2 (magenta dotted),aes128otrpv3_nip8m1 (purple dotted), aes128otrpv3_nip8m2 (light green),aes128otrsv3_nip7m1 (light red dotted), aes128otrsv3_nip7m2 (magentadotted), aes128otrsv3_nip8m1 (dark blue), aes128otrsv3_nip8m2 (light blue)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

0.9111.05

3.39

17.46

25.77

Perf

orm

ance (

cpb)

Figure 35: aes128otrpv3_ref (green), aes256otrsv3_ref (yellow),aes256otrpv3_nip7m1 (blue), aes256otrpv3_nip7m2 (magenta dotted),aes256otrpv3_nip8m1 (purple dotted), aes256otrpv3_nip8m2 (light green),aes256otrsv3_nip7m1 (light red dotted), aes256otrsv3_nip7m2 (magentadotted), aes256otrsv3_nip8m1 (dark blue), aes256otrsv3_nip8m2 (light blue)

Page 22: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

22 Software Benchmarking of the 2nd round CAESAR Candidates

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

4.254.976.12

9.46

329.12397.07510.58

1556.83

Perf

orm

ance (

cpb)

Figure 36: paeq128_aesni (pink), paeq128_ref (purple), paeq128t_aesni(black), paeq128t_ref (purple), paeq128tnm_aesni(cyan dotted),paeq128tnm_ref (light blue dotted), paeq160_aesni (magenta), paeq160_ref(blue), paeq64_aesni (brown), paeq64_ref (yellow), paeq80_aesni (lightgreen), paeq80_ref (green dotted)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

21.43

23

24.5125.63

31.88

34.65

48.91

59.91

Perf

orm

ance (

cpb)

Figure 37: omdsha256k128n96tau128v2_avx1 (red),omdsha256k128n96tau128v2_ref (green dotted), omd-sha256k128n96tau128v2_sse4 (yellow), omdsha256k128n96tau64v2_avx1(purple), omdsha256k128n96tau64v2_ref (green dotted), omd-sha256k128n96tau64v2_sse4 (red), omdsha256k128n96tau96v2_avx1(purple), omdsha256k128n96tau96v2_ref (blue dotted), omd-sha256k128n96tau96v2_sse4 (red)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

21.45

24.52

31.88

34.62

49.03

59.72

Perf

orm

ance (

cpb)

Figure 38: omdsha256k192n104tau128v2_avx1 (blue), omd-sha256k192n104tau128v2_ref (orange), omdsha256k192n104tau128v2_sse4(yellow dotted)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

21.43

23

24.5125.63

31.88

34.65

48.91

59.91

Perf

orm

ance (

cpb)

Figure 39: omdsha256k256n104tau160v2_avx1 (red dotted), omd-sha256k256n104tau160v2_ref (blue), omdsha256k256n104tau160v2_sse4(red dotted), omdsha256k256n248tau256v2_avx1 (yellow), omd-sha256k256n248tau256v2_ref (green), omdsha256k256n248tau256v2_sse4(purple)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

21.4522.12

24.5

31.85

34.58

48.98

59.68

Perf

orm

ance (

cpb)

Figure 40: omdsha512k128n128tau128v2_avx1 (light blue dotted), omd-sha512k128n128tau128v2_ref (red), omdsha512k128n128tau128v2_sse4(purple dotted), omdsha512k256n256tau256v2_avx1 (yellow), omd-sha512k256n256tau256v2_ref (green), omdsha512k256n256tau256v2_sse4(purple), omdsha512k512n256tau256v2_avx1 (yellow), omd-sha512k512n256tau256v2_ref (green), omdsha512k512n256tau256v2_sse4(light blue dotted)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

5.846.73

8.39.51

17.1219.89

29.3331.81

56.5365.5

72.47

128.19

Perf

orm

ance (

cpb)

Figure 41: pi16cipher096v2_ref (red), pi16cipher128v2_ref (light greendotted), pi32cipher128v2_ref (pink dotted) pi32cipher256v2_ref (lightgblue), pi64cipher128v2_ref (yellow), pi64cipher256v2_ref (cyan dot-ted), pi16cipher096v2_gotpv (black dotted), pi16cipher128v2_goptv(blue), pi32cipher128v2_goptv (green) pi32cipher256v2_goptv (magenta),pi64cipher128v2_goptv (purple), pi64cipher256v2_goptv (purple)

Page 23: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 23

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

1.42

2.01

2.63

3.19

8.4

20.26

29.9

46.77

Perf

orm

ance (

cpb)

Figure 42: poetv2aes128_ni (blue), poetv2aes128_ref (orange), poetv2aes4_ni(yellow) poetv2aes4_ref (purple)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

1306.19

2344.4

2549.63

4628.23

5228.8

Perf

orm

ance (

cpb)

Figure 43: primatesv1ape120_ref (blue), primatesv1ape80_ref (orange),primatesv1gibbon120_ref (yellow), primatesv1gibbon80_ref (purple), pri-matesv1hanuman120_ref (green), primatesv1hanuman80_ref (cyan dotted)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

8

9.61

11.91

38.59

43.6145.8

50.92

Perf

orm

ance (

cpb)

Figure 44: scream10v3_ref (blue), scream10v3_sse (orange), scream12v3_ref(yellow), scream12v3_sse (purple)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

35.87

40.8

76.69

85.06

94.25

Perf

orm

ance (

cpb)

Figure 45: shellaes128v2d4n64_ref (red), shellaes128v2d4n80_ref(blue), shellaes128v2d5n64_ref (green), shellaes128v2d5n80_ref (pur-ple), shellaes128v2d6n64_ref (yellow), shellaes128v2d6n80_ref (cyan),shellaes128v2d7n64_ref (light green), shellaes128v2d7n80_ref (purple),shellaes128v2d8n64_ref (orange), shellaes128v2d8n80_ref (black)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

2.52.82

5.27

23.8328.56

4980.83

Perf

orm

ance (

cpb)

Figure 46: present80n6t4silcv2_ref(green), aes128n12t8silcv2_aesni(blue), aes128n12t8silcv2_ref (red), aes128n8t8silcv2_aesni (yellow),aes128n8t8silcv2_ref (purple)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

22.14

31.25

65.771.43

428.68

716.9

Perf

orm

ance (

cpb)

Figure 47: stribob192r2_8bit (blue), stribob192r2_bitslice (orange), stri-bob192r2_ref (yellow), stribob192r2_smaller (purple), stribob192r2_ssse3(green)

Page 24: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

24 Software Benchmarking of the 2nd round CAESAR Candidates

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

0.29

0.47

1.95

4.73

7.01

27.17

Perf

orm

ance (

cpb)

Figure 48: tiaoxinv2_nim (blue), tiaoxinv2_ref (orange)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

8.74

9.98

30.7133.03

52.66

Perf

orm

ance (

cpb)

Figure 49: trivia0v2_ref (blue), trivia128v2_ref (yellow), trivia128v2_sse4 (or-ange)

0 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048

Message length (bytes)

1.51

2.75

11.22

978311969.25

Perf

orm

ance (

cpb)

Figure 50: aesgcm128_ref (orange), aesgcm128_openssl (blue), aes-gcm256_openssl (yellow)

Page 25: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 25

C 3D Plots for Increasing Message and Associated DataLengths

Message (bytes)Associated Data (bytes)

15

00

20

128 128

25

256 256

30

384384

Perf

orm

ance (

cycle

s/b

yte

)

35

512 512

40

640 640

45

768 768

50

8968961024 1024

11521152 128012801408 1408

153615361664 1664

179217921920 1920

2048 2048

cpb: 48.77

cpb: 23.51

cpb: 16.22

cpb: 13.35

cpb: 51.01

cpb: 30.88

cpb: 20.88

cpb: 15.82

cpb: 13.32

cpb: 25.73

cpb: 21.97

cpb: 18.28

cpb: 15.31

cpb: 13.29

cpb: 18.47cpb: 17.48

cpb: 16.16

cpb: 14.66

cpb: 13.24

cpb: 15.55cpb: 15.23

cpb: 14.75

cpb: 14.02

cpb: 13.15

Figure 51: acorn128v2_opt

Message (bytes)Associated Data (bytes)

00

20

128128

40

256256 384384

Perf

orm

ance (

cycle

s/b

yte

)

60

512 512640 640

80

768768

100

8968961024 1024

1152 11521280 1280

1408 140815361536 16641664 17921792 19201920 20482048

cpb: 95.01

cpb: 29.27

cpb: 3.92

cpb: 5.59

cpb: 101.9

cpb: 50.29

cpb: 19.7

cpb: 11

cpb: 2.94

cpb: 28.49cpb: 25.11

cpb: 16.34

cpb: 9.63

cpb: 5.06

cpb: 11.46cpb: 12.19

cpb: 8.29

cpb: 6.74

cpb: 2.57

cpb: 6.37cpb: 5.92cpb: 4.79

cpb: 4.12

cpb: 2.83

Figure 52: aeadaes256ocbtaglen128v1_opt

Message (bytes)Associated Data (bytes)

0.6

0 0

0.8

128128

1

256 256

1.2

384384

1.4

Perf

orm

ance (

cycle

s/b

yte

)

512 512

1.6

640640

1.8

768768

2

2.2

896 8961024 1024

115211521280 1280

1408 140815361536

1664 16641792 1792

192019202048 2048

cpb: 2.22

cpb: 1

cpb: 0.66

cpb: 0.52

cpb: 2.12

cpb: 1.27

cpb: 0.83

cpb: 0.61

cpb: 0.5

cpb: 0.92

cpb: 0.79

cpb: 0.65

cpb: 0.55

cpb: 0.49

cpb: 0.58cpb: 0.55

cpb: 0.52

cpb: 0.48

cpb: 0.45

cpb: 0.43cpb: 0.43cpb: 0.42

cpb: 0.42

cpb: 0.42

Figure 53: aegis128l_aesnic

Message (bytes)Associated Data (bytes)

5.5

0 0

6

128128

6.5

256 256

7

384 384

Perf

orm

ance (

cycle

s/b

yte

)

7.5

512 512640640

8

768768

8.5

896 8961024 1024

1152 115212801280 14081408 15361536

1664 166417921792 19201920

2048 2048

cpb: 8.83

cpb: 6.53

cpb: 5.77

cpb: 5.5

cpb: 7.89

cpb: 6.72

cpb: 5.99

cpb: 5.65

cpb: 5.44

cpb: 6.05cpb: 5.92

cpb: 5.7

cpb: 5.52

cpb: 5.41

cpb: 5.52cpb: 5.52cpb: 5.47

cpb: 5.41

cpb: 5.35

cpb: 5.31cpb: 5.32cpb: 5.33

cpb: 5.31

cpb: 5.3

Figure 54: aes128n8t8clocv2_aesni

Message (bytes)Associated Data (bytes)

5.5

00

6

128128

6.5

256 256

7

384 384

7.5

Perf

orm

ance (

cycle

s/b

yte

)

512 512

8

640 640

8.5

768 768

9

896896 10241024 115211521280 1280

140814081536 1536

166416641792 1792

1920 19202048 2048

cpb: 9.16

cpb: 6.56

cpb: 5.81

cpb: 5.53

cpb: 8.78

cpb: 7.15

cpb: 6.2

cpb: 5.73

cpb: 5.51

cpb: 6.34

cpb: 6.14

cpb: 5.84

cpb: 5.61

cpb: 5.47

cpb: 5.65cpb: 5.63

cpb: 5.56

cpb: 5.47

cpb: 5.4

cpb: 5.37cpb: 5.38cpb: 5.36

cpb: 5.34

cpb: 5.33

Figure 55: aes128n8t8silcv2_aesni

Message (bytes)Associated Data (bytes)

120

0 0

140

128128

160

256 256

180

384384

200

Perf

orm

ance (

cycle

s/b

yte

)

512512

220

640 640

240

768768

260

896896 102410241152 1152

128012801408 1408

15361536 166416641792 1792

1920 192020482048

cpb: 272.2

cpb: 234.1

cpb: 223.1

cpb: 218.7

cpb: 191.7cpb: 189.6

cpb: 202.2cpb: 208.6

cpb: 211.7

cpb: 135.6

cpb: 148.7

cpb: 170.8cpb: 188.4

cpb: 200.2

cpb: 119.2cpb: 127.7

cpb: 145.2

cpb: 165.2cpb: 183.2

cpb: 112.5cpb: 117.2cpb: 128.1

cpb: 144cpb: 162.8

Figure 56: aescopav2_ref

Page 26: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

26 Software Benchmarking of the 2nd round CAESAR Candidates

Message (bytes)Associated Data (bytes)

11

00

11.5

128128

12

256256

12.5

384 384

13

Perf

orm

ance (

cycle

s/b

yte

)

512512

13.5

640640

14

768768

14.5

896 8961024 1024

1152 115212801280 14081408 15361536 16641664

1792 17921920 1920

20482048

cpb: 14.82

cpb: 11.88

cpb: 11.05

cpb: 10.71

cpb: 14.95

cpb: 12.73

cpb: 11.61

cpb: 11.06

cpb: 10.78

cpb: 12.01

cpb: 11.64

cpb: 11.27

cpb: 10.95

cpb: 10.74

cpb: 11.18cpb: 11.08

cpb: 10.94

cpb: 10.79

cpb: 10.65

cpb: 10.85cpb: 10.82

cpb: 10.77

cpb: 10.71

cpb: 10.63

Figure 57: aesjambuv2_aesni

Associated Data (bytes)Message (bytes)

1

00

1.5

128 128

2

256 256

2.5

384384

Perf

orm

ance (

cycle

s/b

yte

)

3

512512

3.5

640640

4

768 768

4.5

896896 10241024 11521152 128012801408 1408

1536 153616641664 17921792 19201920

2048 2048

cpb: 4.62

cpb: 2.38

cpb: 1.63

cpb: 1.33

cpb: 2.05

cpb: 2.58

cpb: 1.9

cpb: 1.49

cpb: 1.28

cpb: 0.96

cpb: 1.5

cpb: 1.41

cpb: 1.28

cpb: 1.18cpb: 0.66

cpb: 0.97cpb: 1.02

cpb: 1.04

cpb: 1.05 cpb: 0.54cpb: 0.7cpb: 0.76

cpb: 0.82

cpb: 0.89

Figure 58: aezv4_aesni

Message (bytes)Associated Data (bytes)

15

00

16

128128

17

256256

18

384 384

Perf

orm

ance (

cycle

s/b

yte

)

19

512512 640640

20

768 768

21

896 8961024 1024

1152 11521280 1280

140814081536 1536

1664 166417921792 19201920 20482048

cpb: 20.41

cpb: 16.54

cpb: 15.52

cpb: 15.1

cpb: 21.71

cpb: 18.2

cpb: 16.47

cpb: 15.61

cpb: 15.18

cpb: 16.75

cpb: 16.18

cpb: 15.72

cpb: 15.33

cpb: 15.1

cpb: 15.38cpb: 15.26

cpb: 15.17

cpb: 15.08

cpb: 14.9

cpb: 14.81cpb: 14.77

cpb: 14.78

cpb: 14.75

cpb: 14.75

Figure 59: ascon128av11_opt64

Message (bytes)Associated Data (bytes)

4

0 0

6

128128

8

256256

10

384384

Perf

orm

ance (

cycle

s/b

yte

)

12

512 512

14

640640

16

768768

18

8968961024 1024

115211521280 1280

140814081536 1536

1664 166417921792

1920 19202048 2048

cpb: 18.07

cpb: 7.41

cpb: 4.18

cpb: 2.81

cpb: 17.93

cpb: 10.3

cpb: 6.21

cpb: 4.03

cpb: 2.81

cpb: 7.32

cpb: 6.15

cpb: 4.81

cpb: 3.66

cpb: 2.78

cpb: 4.11cpb: 3.94

cpb: 3.55

cpb: 3.17

cpb: 2.65

cpb: 2.77cpb: 2.79cpb: 2.68

cpb: 2.59

cpb: 2.41

Figure 60: deoxysneq256128v13_aesni

Associated Data (bytes) Message (bytes)

40

0 0128128

50

256 256

60

384 384

Perf

orm

ance (

cycle

s/b

yte

)

512 512

70

640640768 768

80

8968961024 1024

11521152 12801280 14081408 153615361664 1664

1792 17921920 1920

20482048

cpb: 85.89

cpb: 66.21

cpb: 60.5

cpb: 58.25

cpb: 64.66

cpb: 60.42

cpb: 58.33

cpb: 57.31

cpb: 56.8cpb: 44.84

cpb: 47.64cpb: 50.61cpb: 52.85

cpb: 54.39

cpb: 39.1cpb: 41.3

cpb: 44.3cpb: 47.71

cpb: 50.86

cpb: 36.97cpb: 38.16cpb: 40.17cpb: 43.11

cpb: 46.61

Figure 61: elmd600v2_ref

Message (bytes)Associated Data (bytes)

5

0 0128 128

10

256 256

15

384 384

Perf

orm

ance (

cycle

s/b

yte

)

512 512

20

640 640

25

768768

30

8968961024 1024

1152 115212801280

1408 14081536 1536

16641664 17921792 192019202048 2048

cpb: 30.39

cpb: 16.81

cpb: 12.88

cpb: 11.25

cpb: 21.93

cpb: 14.92

cpb: 12.83

cpb: 11.32

cpb: 10.74

cpb: 7.13cpb: 7.81cpb: 8.93

cpb: 9.2

cpb: 9.61cpb: 3.53cpb: 4.16cpb: 5.48cpb: 6.79

cpb: 8.01

cpb: 1.97cpb: 2.49cpb: 3.4cpb: 4.58

cpb: 6.1

Figure 62: hs1sivlov1_ref

Page 27: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 27

Message (bytes)Associated Data (bytes)

00

30

128128

40

256 256

50

384 384

Perf

orm

ance (

cycle

s/b

yte

)

512 512

60

640640

70

768 768

80

896896 102410241152 1152

12801280 140814081536 1536

1664 16641792 1792

19201920 20482048

cpb: 82.96

cpb: 41.84

cpb: 30.05

cpb: 25.31

cpb: 79.13

cpb: 44.2

cpb: 32.77

cpb: 27.01

cpb: 24.08

cpb: 38.05

cpb: 30.85

cpb: 27.67

cpb: 25.11

cpb: 23.35

cpb: 26.22cpb: 24.11

cpb: 23.55

cpb: 22.87

cpb: 22.25

cpb: 21.5cpb: 20.74

cpb: 20.8

cpb: 20.87

cpb: 20.94

Figure 63: icepole128av2_ref

Message (bytes)Associated Data (bytes)

110

00

120

128128 256256

130

384384

140

Perf

orm

ance (

cycle

s/b

yte

)

512512

150

640640

160

768 768

170

896896 102410241152 1152

12801280 14081408 15361536 16641664 17921792 19201920 20482048

cpb: 171.8

cpb: 127.4

cpb: 114.7

cpb: 109.5

cpb: 165.3

cpb: 134.3

cpb: 119.7

cpb: 112.5

cpb: 108.8

cpb: 121.3

cpb: 116.8

cpb: 112.9

cpb: 109.8

cpb: 107.7

cpb: 108.8cpb: 108

cpb: 107.4

cpb: 106.9

cpb: 106.2

cpb: 103.7cpb: 103.6

cpb: 103.8cpb: 104

cpb: 104.4

Figure 64: ketjesrv1_reference

Associated Data (bytes) Message (bytes)

8

0 0128128

10

256 256

12

384 384

14

Perf

orm

ance (

cycle

s/b

yte

)

512512

16

640640

18

768768896 896

20

1024 10241152 1152

1280 128014081408 15361536 16641664 17921792

1920 192020482048

cpb: 20.01

cpb: 12.34

cpb: 8.91

cpb: 7.55

cpb: 19.41

cpb: 14.36

cpb: 11.72

cpb: 8.28

cpb: 7.33

cpb: 9.23cpb: 9.26

cpb: 9.2cpb: 8.26

cpb: 7.16

cpb: 7.4cpb: 7.64

cpb: 7.93cpb: 7.07

cpb: 6.96

cpb: 6.16cpb: 6.37

cpb: 6.67cpb: 6.7

cpb: 6.46

Figure 65: lakekeyakv2_generic64lc

Associated Data (bytes) Message (bytes)

1.5

0 0

2

128128

2.5

256256

3

384 384

Perf

orm

ance (

cycle

s/b

yte

)

512512

3.5

640 640

4

768768

4.5

896 896102410241152 1152

1280 12801408 1408

153615361664 1664

179217921920 1920

20482048

cpb: 4.75

cpb: 2.34

cpb: 1.66

cpb: 1.4

cpb: 4.75

cpb: 2.94

cpb: 2.04

cpb: 1.6

cpb: 1.38

cpb: 2.33

cpb: 2.02

cpb: 1.74

cpb: 1.5

cpb: 1.35

cpb: 1.66cpb: 1.58

cpb: 1.49cpb: 1.39

cpb: 1.31

cpb: 1.37

cpb: 1.36cpb: 1.34

cpb: 1.3

cpb: 1.27

Figure 66: morus640128v1_sse2

Message (bytes)Associated Data (bytes)

4

0 0

5

128128

6

256 256

7

384384

Perf

orm

ance (

cycle

s/b

yte

)

8

512 512

9

640640

10

768768

11

896896 102410241152 1152

12801280 14081408 153615361664 1664

1792 17921920 1920

2048 2048

cpb: 11.19

cpb: 5.7

cpb: 3.91

cpb: 3.31

cpb: 11.15

cpb: 7.78

cpb: 5.4

cpb: 3.98

cpb: 3.39

cpb: 5.77

cpb: 5.43

cpb: 4.62

cpb: 3.79

cpb: 3.35

cpb: 3.92cpb: 4.01

cpb: 3.8cpb: 3.45

cpb: 3.21

cpb: 3.35

cpb: 3.41cpb: 3.37

cpb: 3.22

cpb: 3.11

Figure 67: norx6441_xmm

Associated Data (bytes) Message (bytes)

00

20

128128

30

256256 384384

40

Perf

orm

ance (

cycle

s/b

yte

)

512512

50

640 640768768

60

8968961024 1024

1152 11521280 1280

140814081536 1536

166416641792 1792

1920 19202048 2048

cpb: 65.95

cpb: 33.08

cpb: 23.73

cpb: 20.02

cpb: 57.01

cpb: 32.82

cpb: 24.75

cpb: 20.74

cpb: 20.08

cpb: 24.59

cpb: 20.57

cpb: 19.3

cpb: 18.27

cpb: 17.59

cpb: 15.28cpb: 14.43

cpb: 14.9cpb: 15.41

cpb: 15.89cpb: 11.58cpb: 12.17

cpb: 11.97cpb: 12.84

cpb: 13.88

Figure 68: omdsha512k512n256tau256v2_avx1

Page 28: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

28 Software Benchmarking of the 2nd round CAESAR Candidates

Message (bytes)Associated Data (bytes)

4

0 0

6

128 128256 256

8

384 384

Perf

orm

ance (

cycle

s/b

yte

)

10

512 512640640

12

768768896 89610241024

1152 11521280 1280

14081408 153615361664 1664

1792 179219201920

2048 2048

cpb: 13.92

cpb: 8.45

cpb: 7.05

cpb: 6.42

cpb: 8.54

cpb: 10.03

cpb: 7.87

cpb: 6.97

cpb: 6.4

cpb: 5.26

cpb: 6.83

cpb: 6.45

cpb: 6.31

cpb: 6.1 cpb: 4.09

cpb: 5.03

cpb: 5.19cpb: 5.42

cpb: 5.57

cpb: 3.73cpb: 4.23

cpb: 4.39cpb: 4.7

cpb: 5.01

Figure 69: paeq64_aesni

Associated Data (bytes) Message (bytes)

30

00128 128

40

256256384 384

50

Perf

orm

ance (

cycle

s/b

yte

)

512512 640640

60

768 768

70

8968961024 1024

115211521280 1280

140814081536 1536

1664 16641792 1792

19201920 20482048

cpb: 71.13

cpb: 39.16

cpb: 29.96

cpb: 26.27

cpb: 70.16

cpb: 46.72

cpb: 35.02

cpb: 29.02

cpb: 26.03

cpb: 38.18

cpb: 34.46

cpb: 30.75

cpb: 27.68

cpb: 25.61

cpb: 28.95

cpb: 28.24cpb: 27.27

cpb: 26.07

cpb: 24.98

cpb: 25.25cpb: 25.12

cpb: 24.94cpb: 24.6

cpb: 24.19

Figure 70: pi64cipher128v2_ref

Message (bytes)Associated Data (bytes)

15

20

0 0

25

128128

30

256 256

35

384384

40

Perf

orm

ance (

cycle

s/b

yte

)

512512

45

640 640

50

768 768

55

8968961024 1024

1152 115212801280 14081408 15361536

1664 16641792 1792

1920 192020482048

cpb: 57.01

cpb: 27.41

cpb: 18.96

cpb: 15.57

cpb: 59.99

cpb: 43.1

cpb: 27.84

cpb: 20.25

cpb: 16.41

cpb: 28.3

cpb: 27.82

cpb: 22.72

cpb: 18.69

cpb: 15.98

cpb: 19.34cpb: 20.19

cpb: 18.67

cpb: 16.94

cpb: 15.36

cpb: 15.72cpb: 16.36

cpb: 15.94cpb: 15.34

cpb: 14.61

Figure 71: scream10v3_sse

Message (bytes)Associated Data (bytes)

40

00

60

128 128256 256

80

384384

100

Perf

orm

ance (

cycle

s/b

yte

)

512512

120

640640

140

768768

160

8968961024 1024

1152 11521280 1280

1408 14081536 1536

1664 16641792 1792

1920 192020482048

cpb: 172.9

cpb: 108.2

cpb: 89.14

cpb: 81.37

cpb: 109.5

cpb: 91.72

cpb: 83.77

cpb: 79.16

cpb: 76.98 cpb: 45.96cpb: 53.34

cpb: 60.69

cpb: 66.42

cpb: 69.97

cpb: 28.31cpb: 34.02

cpb: 42.28

cpb: 51.6

cpb: 59.96

cpb: 20.96

cpb: 24.33

cpb: 30.05cpb: 38.15

cpb: 47.87

Figure 72: shellaes128v2d4n80_ref

Message (bytes)Associated Data (bytes)

0 0

30

128128 256256

35

384384

Perf

orm

ance (

cycle

s/b

yte

)

512 512640640

40

768 7688968961024 1024

1152 11521280 1280

1408 14081536 1536

16641664 179217921920 1920

20482048

cpb: 44.45

cpb: 31.88

cpb: 28.04

cpb: 26.66

cpb: 43.44

cpb: 31.91

cpb: 28.58

cpb: 27.07

cpb: 26.2

cpb: 30.97

cpb: 28.31

cpb: 27.38

cpb: 26.55

cpb: 26.07

cpb: 27.31

cpb: 26.51

cpb: 26.29cpb: 26.09

cpb: 25.86

cpb: 25.93

cpb: 25.61cpb: 25.58

cpb: 25.55

cpb: 25.51

Figure 73: stribob192r2_ssse3

Associated Data (bytes) Message (bytes)

0.5

0 0

1

128128

1.5

256256

2

384 384

Perf

orm

ance (

cycle

s/b

yte

)

512512

2.5

640 640

3

768768

3.5

896896 10241024 115211521280 1280

1408 140815361536 16641664 17921792 19201920

2048 2048

cpb: 3.67

cpb: 1.47

cpb: 0.84

cpb: 0.59

cpb: 3.45

cpb: 1.98

cpb: 1.21

cpb: 0.79

cpb: 0.58

cpb: 1.34

cpb: 1.12

cpb: 0.87

cpb: 0.67

cpb: 0.54

cpb: 0.74cpb: 0.71

cpb: 0.64cpb: 0.56

cpb: 0.49

cpb: 0.49

cpb: 0.49cpb: 0.48

cpb: 0.46

cpb: 0.44

Figure 74: tiaoxinv2_nim

Page 29: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 29

Associated Data (bytes) Message (bytes)

4500

50

128128

55

256256

60

384 384

Perf

orm

ance (

cycle

s/b

yte

)

512 512

65

640640

70

768768896 896

1024 102411521152

1280 128014081408

1536 15361664 1664

1792 179219201920 20482048

cpb: 74.49

cpb: 54.99

cpb: 49.41

cpb: 47.18

cpb: 72.48

cpb: 58.97

cpb: 52.08

cpb: 48.67

cpb: 46.93

cpb: 52.89

cpb: 51.02

cpb: 49.11

cpb: 47.55

cpb: 46.5

cpb: 47.28cpb: 47.07

cpb: 46.7cpb: 46.26

cpb: 45.87

cpb: 45.07

cpb: 45.07cpb: 45.1

cpb: 45.11

cpb: 45.08

Figure 75: trivia0v2_ref

Message (bytes)Associated Data (bytes)

00

25

128 128256256

30

384 384

Perf

orm

ance (

cycle

s/b

yte

)

512 512

35

640 640768768

40

896 8961024 1024

11521152 128012801408 1408

1536 15361664 1664

1792 17921920 1920

20482048

cpb: 40.2

cpb: 30.19

cpb: 27.24

cpb: 26.12

cpb: 38.35

cpb: 30.43

cpb: 27.78

cpb: 26.37

cpb: 25.74

cpb: 26.12

cpb: 25.04cpb: 25.06

cpb: 25.18

cpb: 25.12cpb: 22.56cpb: 22.6cpb: 22.98

cpb: 23.7

cpb: 24.19cpb: 21.23

cpb: 21.08cpb: 21.54cpb: 22.19

cpb: 22.99

Figure 76: aes128otrsv2_ref

Message (bytes)Associated Data (bytes)

4

0 0

5

128 128256 256

6

384384

7

Perf

orm

ance (

cycle

s/b

yte

)

512512

8

640 640

9

768 768

10

8968961024 1024

1152 11521280 1280

14081408 153615361664 1664

1792 17921920 1920

2048 2048

cpb: 10.2

cpb: 5.19

cpb: 3.92

cpb: 3.21

cpb: 10.6

cpb: 6.58

cpb: 4.63

cpb: 3.8

cpb: 3.19

cpb: 5.86

cpb: 5.06

cpb: 4.26

cpb: 3.71

cpb: 3.23

cpb: 4.18cpb: 3.97

cpb: 3.76

cpb: 3.46

cpb: 3.17

cpb: 3.41cpb: 3.35

cpb: 3.28

cpb: 3.23

cpb: 3.1

Figure 77: aes128poetv2aes4ls0lt0_ni

Page 30: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

30 Software Benchmarking of the 2nd round CAESAR Candidates

D Bar Charts for Real-World Message Sizes for each CAE-SAR Candidate

1 16 557 1500 16000 10000000

50

100

150

200

250

300

350

400

Message Size

Pe

rfo

rma

nce

(cp

b)

391.85

116.13

11.65 9.05 7.67 7.52

Figure 78: acorn128v2_opt

1 16 557 1500 16000 10000000

20

40

60

80

100

120

140

160

180

200

Message Size

Pe

rfo

rma

nce

(cp

b)

191.06

50.89

2.51 1.26 0.6 0.53

Figure 79: aeadaes128ocbtaglen128v1_opt

1 16 557 1500 16000 10000000

5

10

15

20

25

30

35

40

45

Message Size

Pe

rfo

rma

nce

(cp

b)

44.82

12.74

0.7 0.4 0.22 0.21

Figure 80: aegis128l_aesnic

1 16 557 1500 16000 10000000

10

20

30

40

50

60

70

Message Size

Pe

rfo

rma

nce

(cp

b)

67.45

18.72

2.98 2.57 2.34 2.32

Figure 81: aes128n12t8clocv2_aesni

1 16 557 1500 16000 10000000

10

20

30

40

50

60

70

80

90

Message Size

Pe

rfo

rma

nce

(cp

b)

80.5

22.71

3.13 2.62 2.35 2.33

Figure 82: aes128n8t8silcv2_aesni

1 16 557 1500 16000 10000000

10

20

30

40

50

60

Message Size

Pe

rfo

rma

nce

(cp

b)

54.47

13.23

1.21 0.82 0.58 0.57

Figure 83: aes128otrpv3_ref

1 16 557 1500 16000 10000000

20

40

60

80

100

120

140

Message Size

Pe

rfo

rma

nce

(cp

b)

130

34.77

2.48 1.6 1.12 1.07

Figure 84: poetv2aes4_ni

1 16 557 1500 16000 10000000

100

200

300

400

500

600

700

800

900

1000

Message Size

Pe

rfo

rma

nce

(cp

b)

954.52

275.55

110.15 105.68 103.1 102.95

Figure 85: aescopav2_ref

Page 31: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 31

1 16 557 1500 16000 10000000

10

20

30

40

50

60

70

Message Size

Pe

rfo

rma

nce

(cp

b)

61.53

21.52

6.66 6.31 6.13 6.11

Figure 86: aesjambuv2_aesni

1 16 557 1500 16000 10000000

5

10

15

20

25

30

35

40

45

50

Message Size

Pe

rfo

rma

nce

(cp

b)

49.46

9.13

0.84 0.68 0.53 0.52

Figure 87: aezv4_aesni

1 16 557 1500 16000 10000000

10

20

30

40

50

60

70

80

Message Size

Pe

rfo

rma

nce

(cp

b)

71.4

25.03

6.63 6.25 6.06 6.04

Figure 88: ascon128av11_opt64

1 16 557 1500 16000 10000000

20

40

60

80

100

120

Message Size

Pe

rfo

rma

nce

(cp

b)

101.41

29.02

2.05 1.37 0.77 0.73

Figure 89: deoxysneq128128v13_aesni

1 16 557 1500 16000 10000000

100

200

300

400

500

600

700

800

Message Size

Pe

rfo

rma

nce

(cp

b)

743.39

256.27

63.97 59.19 57.05 56.8

Figure 90: elmd600v2_ref

1 16 557 1500 16000 10000000

50

100

150

200

250

300

Message Size

Pe

rfo

rma

nce

(cp

b)

256.33

73.5

8.99 7.11 6.53 6.71

Figure 91: hs1sivlov1_ref

1 16 557 1500 16000 10000000

100

200

300

400

500

600

700

Message Size

Pe

rfo

rma

nce

(cp

b)

646.63

186.55

13.65 9.45 7.39 7.2

Figure 92: icepole128av2_ref

1 16 557 1500 16000 10000000

50

100

150

200

250

300

350

400

450

500

Message Size

Pe

rfo

rma

nce

(cp

b)

475.58

159.02

39.78 36.76 35.25 35.26

Figure 93: ketjesrv1_reference

Page 32: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

32 Software Benchmarking of the 2nd round CAESAR Candidates

1 16 557 1500 16000 10000000

50

100

150

200

250

300

350

400

450

500

Message Size

Pe

rfo

rma

nce

(cp

b)

474.14

135.44

7.08 4.2 2.6 2.55

Figure 94: seakeyakv2_SandyBridge

1 16 557 1500 16000 10000000

20

40

60

80

100

120

Message Size

Pe

rfo

rma

nce

(cp

b)

100.4

28.82

1.73 1.07 0.71 0.68

Figure 95: morus1280256v1_avx2

1 16 557 1500 16000 10000000

20

40

60

80

100

120

140

160

Message Size

Pe

rfo

rma

nce

(cp

b)

158.96

45.56

3.32 2.43 1.92 1.87

Figure 96: norx6441_ymm

1 16 557 1500 16000 10000000

200

400

600

800

1000

1200

Message Size

Pe

rfo

rma

nce

(cp

b)

1121.2

320.74

24.61 18.05 14.21 13.83

Figure 97: omdsha512k512n256tau256v2_avx1

1 16 557 1500 16000 10000000

50

100

150

Message Size

Pe

rfo

rma

nce

(cp

b)

144.74

41.51

5.62 4.56 3.98 3.93

Figure 98: paeq80_aesni

1 16 557 1500 16000 10000000

50

100

150

200

250

300

350

400

450

500

Message Size

Pe

rfo

rma

nce

(cp

b)

489.86

140.14

9.34 7.14 4.75 4.56

Figure 99: pi64cipher128v2_goptv

1 16 557 1500 16000 10000000

100

200

300

400

500

600

700

800

Message Size

Pe

rfo

rma

nce

(cp

b)

712.97

204.21

14.39 9.26 7.41 7.21

Figure 100: scream10v3_sse

1 16 557 1500 16000 10000000

200

400

600

800

1000

1200

1400

Message Size

Pe

rfo

rma

nce

(cp

b)

1267.94

388.48

45.68 37.31 33.02 35.32

Figure 101: shellaes128v2d8n80_ref

Page 33: SoftwareBenchmarkingofthe2nd round CAESARCandidatesccccspeed.win.tue.nl/papers/benchmarking-final.pdf · SoftwareBenchmarkingofthe2nd round CAESARCandidates Ralph Ankele1 and Robin

Ralph Ankele and Robin Ankele 33

1 16 557 1500 16000 10000000

50

100

150

200

250

300

350

400

450

Message Size

Pe

rfo

rma

nce

(cp

b)

440.1

127.18

25.19 22.47 21.31 21.22

Figure 102: stribob192r2_ssse3

1 16 557 1500 16000 10000000

10

20

30

40

50

60

Message Size

Pe

rfo

rma

nce

(cp

b)

50.06

13.93

0.71 0.38 0.2 0.19

Figure 103: tiaoxinv2_nim

1 16 557 1500 16000 10000000

50

100

150

200

250

300

Message Size

Pe

rfo

rma

nce

(cp

b)

257.01

79.65

10.61 9.01 8.06 7.97

Figure 104: trivia0v2_sse4