
Multi-Variant Execution at the Edge

Javier Cabrera Arteaga, KTH

Pierre Laperdrix, University of Lille

Martin Monperrus, KTH

Benoit Baudry, KTH

Abstract: Edge-cloud computing offloads parts of the computations that traditionally occur in the cloud to edge nodes, e.g., CDN servers, in order to get closer to the users and reduce latency. To improve performance even further, WebAssembly is increasingly used in this context. Edge-cloud computing providers, such as Fastly or Cloudflare, let their clients deploy stateless services in the form of WebAssembly binaries, which are then translated to machine code and sandboxed for a safe execution at the edge.

In this context, we propose a technique that (i) automatically diversifies WebAssembly binaries that are deployed to the edge and (ii) randomizes execution paths at runtime, turning the execution of the services into a moving target. Given a service to be deployed at the edge, we automatically synthesize functionally equivalent variants for the functions that implement the service. All the variants are then wrapped into a single multivariant WebAssembly binary. When the service endpoint is executed, every time a function is invoked, one of its variants is randomly selected. We implement this technique in the MEWE tool and we validate it with 7 services for cryptography and QR encoding. MEWE generates multivariant binaries that embed hundreds of function variants. We execute the multivariant binaries on the world-wide edge platform provided by Fastly. We show that, at runtime, the multivariant binaries exhibit a remarkable diversity of execution traces, across the whole edge platform.

I. INTRODUCTION

Edge-cloud computing distributes a part of the data and computation onto edge nodes [1], [2]. Edge nodes are servers located in many countries and regions. Consequently, edge-cloud computing gets Internet resources closer to the end users, which reduces latency and saves bandwidth. Video and music streaming services, mobile games, as well as e-commerce and news sites leverage this new type of cloud architecture to increase the quality of their services. For example, the New York Times website was able to serve more than 2 million concurrent visitors during the 2016 US presidential election with no difficulty thanks to Edge computing [3].

In order to achieve even greater speed, some Edge computing platforms like Cloudflare or Fastly have turned to WebAssembly (Wasm) [4], [5]. WebAssembly is a portable binary-code format designed to be lightweight, fast and safe [6], [7]. After compiling code to a WebAssembly binary, developers can spawn a compute service by deploying their binary on all nodes in an Edge platform. Thanks to its simple memory and computation model, WebAssembly is considered better than virtualization or containerization [8]. However, WebAssembly is not perfect and is not exempt from vulnerabilities. Implementations in both browsers and standalone runtimes [9] have been found to be vulnerable [10], [9], opening the door to different attacks. This means that if one node in an Edge network is vulnerable, all the others are vulnerable in the exact same manner, as the same binary is replicated on each node. This illustrates how fragile Edge computing can be, as one single vulnerability can affect the whole network, as happened on June 8, 2021 [11].

In this work, we introduce MEWE, a framework that generates diversified WebAssembly binaries so that no two executions in the edge network are identical. Our solution is inspired by Moving Target Defense [12], where the attack surface of a program is constantly moving. Here, our goal is not to remove vulnerabilities but to drastically increase the effort for exploitation through large-scale execution path randomization. MEWE operates in two distinct steps. At compile time, MEWE generates variants for different functions in the program. A function variant is semantically identical to the original function but structurally different, i.e., binary instructions are in different orders or have been replaced with equivalent ones. All the function variants for one service are then embedded in a single multivariant WebAssembly binary. MEWE is integrated with LLVM, one of the most popular pipelines for the compilation of WebAssembly. At runtime, every time a function is invoked, one of its variants is randomly selected. This way, the actual execution path taken to provide the service is randomized each time the service is executed, creating a function-level moving target.

We evaluate MEWE on 7 services, composed of hundreds of functions. We successfully synthesize thousands of function variants, which create orders of magnitude more possible execution paths than in the original service. To determine if these new paths embedded in the service binaries are actually randomly triggered at runtime, we deploy and run them on Fastly. Fastly is a leading edge-cloud computing provider. Its platform relies on a world-wide content-delivery network (CDN), as well as on state-of-the-art technology for high-performance web services. As part of their continuous innovations for low latency, Fastly was an early adopter of WebAssembly to deploy and sandbox client applications [5]. We collaborated with Fastly to evaluate MEWE on the actual production edge computing nodes that they provide to their clients. This means that all the services that we executed for our experiments ran concurrently with Fastly's client applications, such as the New York Times services mentioned earlier. For this experiment we execute each original and multivariant binary thousands of times on every edge computing node provided by Fastly. This experiment shows that the multivariant binaries render the same service as the original, yet with highly diverse execution traces. MEWE successfully turns WebAssembly binaries into moving targets.

arXiv:2108.08125v1 [cs.SE] 18 Aug 2021

To sum up, our contributions are:

• MEWE: a framework that builds multivariant WebAssembly binaries for edge computing, combining the automatic synthesis of semantically equivalent function variants with execution path randomization (https://anonymous.4open.science/r/fastly4edge-C532).

• original results on the large-scale diversification of WebAssembly binaries, at the function and execution path levels.

• empirical evidence of a global moving target for services deployed on a real-world edge-computing platform.

This work is structured as follows. First, Section II presents background on WebAssembly and its usage in an edge-cloud computing scenario. Section III introduces the architecture and foundation of MEWE, while Section IV and Section V present the different experiments we conducted to show the feasibility of our approach. Section VI details the related work, Section VII discusses our results, and Section VIII concludes this paper.

II. BACKGROUND

In this section we introduce WebAssembly, as well as the deployment model that edge-cloud platforms such as Fastly provide to their clients. This forms the technical context for our work.

A. WebAssembly

WebAssembly is a bytecode designed to bring safe, fast, portable and compact low-level code to the Web. The language was first publicly announced in 2015 and formalized by Haas et al. [6]. Since then, most major web browsers have implemented support for the standard. Besides the Web, WebAssembly is independent of any specific hardware, which means that it can run in standalone mode. This allows for the adoption of WebAssembly outside web browsers [7], e.g., for edge computing [9].

WebAssembly binaries are usually compiled from source code written in languages such as C/C++ or Rust. Listings 1 and 2 illustrate an example of a C function turned into WebAssembly. Listing 1 presents the C code of one function and Listing 2 shows the result of compiling this function into a WebAssembly module. The WebAssembly code is further interpreted or compiled ahead of time into machine code.
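To make the stack-based evaluation of Listing 2 concrete, the following Python sketch interprets the five instructions of the function body on a value stack. It is an illustration only: the helper `run_wasm_body` is hypothetical, it supports just the four opcodes used in Listing 2, and it represents i32 values as unsigned integers modulo 2^32.

```python
MASK32 = 0xFFFFFFFF  # i32 arithmetic wraps around modulo 2**32


def run_wasm_body(instructions, params):
    """Evaluate a straight-line list of Wasm-like instructions on a value stack.

    Hypothetical helper: only the opcodes appearing in Listing 2 are handled.
    """
    stack = []
    for instr in instructions:
        op, *args = instr.split()
        if op == "local.get":
            stack.append(params[int(args[0])] & MASK32)
        elif op == "i32.const":
            stack.append(int(args[0]) & MASK32)
        elif op == "i32.mul":
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK32)
        elif op == "i32.add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK32)
        else:
            raise ValueError(f"unsupported instruction: {instr}")
    return stack.pop()


# Body of function 0 in Listing 2: pushes x, x, 2; multiplies; adds.
LISTING_2_BODY = [
    "local.get 0",
    "local.get 0",
    "i32.const 2",
    "i32.mul",
    "i32.add",
]

result = run_wasm_body(LISTING_2_BODY, [7])  # x + (x * 2) with x = 7
```

With x = 7, the stack holds 7, 7, 2 before the multiplication, then 7, 14, and finally the single value 21, which matches 2x + x from Listing 1.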

B. WebAssembly and Edge Computing

Using Wasm as an intermediate layer is better, in terms of startup time and memory usage, than containerization or virtualization [8], [13]. This has encouraged edge computing platforms like Cloudflare or Fastly to adopt WebAssembly to deploy client applications in a modular and sandboxed manner [4], [5]. In addition, WebAssembly is a compact representation of code, which saves bandwidth when transporting code over the network. This allows edge-cloud platform providers to deploy the same Wasm binary, for a client application, around the world in a few seconds.

Client applications that are designed to be deployed on edge-cloud computing platforms are usually isolated services,

int f(int x) {
    return 2 * x + x;
}

Listing 1: C function that calculates the quantity 2x + x

(module
  (type (;0;) (func (param i32) (result i32)))
  (func (;0;) (type 0) (param i32) (result i32)
    local.get 0
    local.get 0
    i32.const 2
    i32.mul
    i32.add)
  (export "f" (func 0)))

Listing 2: WebAssembly code for Listing 1.

[Figure 1: diagram with two panels, "Developer computer" (compile source to Wasm, upload) and "Edge-cloud platform" (compile Wasm to sandboxed x86 on edge nodes 1 to 3, which receive HTTP requests).]

Fig. 1: Deployment to the edge. Client application developers build their application and compile it to WebAssembly before submitting it to the edge-cloud computing platform. Then, the Wasm binary is distributed to all edge nodes, where it is compiled to sandboxed machine code (x86 or ARM). This machine code is executed at every service request.

having one single responsibility. This development model is known as serverless computing, or function-as-a-service [14], [9]. Figure 1 summarizes the development and deployment process for a client application to an edge-cloud platform. First, the developers of a client application implement the isolated services in a given programming language. The source code and the HTTP harness are then compiled to WebAssembly using, e.g., LLVM [15] or Binaryen [16].

When client application developers deploy a WebAssembly binary for a function-as-a-service, it is sent to all edge nodes in the platform. Then, the WebAssembly binary is compiled, on each node, to machine code. This way, if the edge nodes have different architectures, the clients can still deploy a single WebAssembly binary, and the compilers take care of the final machine code generation step. Each binary is compiled in a way that ensures that the code runs inside an isolated sandbox.

III. MEWE: MULTIVARIANT EXECUTION FOR WEBASSEMBLY

In this section we present MEWE, a novel technique to synthesize multivariant binaries and deploy them on an edge computing platform.



A. Overview

The goal of MEWE is to provide moving target defense (MTD) for WebAssembly binaries. It performs MTD at the application level without any change in the operating system or Wasm runtime. The core idea of MEWE is to combine the synthesis of diversified function variants with execution-path randomization. Figure 2 provides an overview of how it works.

MEWE takes as input a binary that is to be diversified. The supported binary format is LLVM's intermediate representation (LLVM IR), which can be obtained from any language with an LLVM frontend such as C/C++, Rust or Go. In Step 1, the binary is passed to CROW [17], a superdiversifier for Wasm, which generates a set of variants for the functions in the binary. Step 2 packages all the variants in one single multivariant LLVM binary. In Step 3, we use a special component, called a "mixer", which augments the binary with two different components: an HTTP endpoint harness and a random generator, which are both required for executing Wasm at the edge. The harness is used to connect the program to its execution environment, while the generator provides support for random execution paths at runtime. The final output, in Step 4, is a standalone multivariant WebAssembly binary that can be deployed on an edge-cloud computing platform. In the following sections, we describe in greater detail the different stages of the workflow.

B. Variant generation

MEWE relies on the superdiversifier CROW [17] to automatically diversify each function in the input LLVM binary (Step 1). CROW receives an LLVM module, analyzes the binary at the function block level and generates semantically equivalent variants for each function, if they exist. CROW variants are verified as semantically equivalent with an SMT solver. Here, we define a function variant as:

Definition 1: Function variant: Let F be a function. F′ is a function variant of F for MEWE if it is semantically equivalent (i.e., same input/output behavior), but exhibits a different internal behavior through tracing.

In Listing 3, we illustrate two semantically equivalent Wasm functions according to Definition 1. The leftmost listing corresponds to the Wasm module shown in Listing 2. The rightmost listing is a variant of this function. The multiplication by 2 in the original code (i32.const 2 followed by i32.mul, third and fourth instructions) is replaced by an addition (local.get 0 followed by i32.add), so that the variant has the same semantics but executes different instructions.

...                      ...
(func (;0;)              (func (;0;)
  local.get 0              local.get 0
  local.get 0              local.get 0
  i32.const 2              local.get 0
  i32.mul                  i32.add
  i32.add)                 i32.add)
...                      ...

Listing 3: Example of two semantically equivalent functions. The left listing corresponds to the original code. The right listing shows a semantically equivalent variant.

CROW synthesizes variants by enumerative synthesis based on code transformation. The most relevant transformations are: constant inference to replace control flow statements, replacement of arithmetic expressions by equivalent ones, and loop unrolling. CROW performs stacked transformations, which means that it can synthesize variants of different sizes, i.e., both smaller and larger than the original.

The variants created by CROW are artificially synthesized from the original binary. CROW checks the semantic equivalence of both codes, original and variant, using symbolic execution. If the behavior of the variant is not the reference behavior, the variant is discarded. This means that, after Step 1, every retained variant is necessarily equivalent to the original program.
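The "check, then discard" step can be illustrated with a small Python sketch. Note the substitution: CROW proves equivalence symbolically, whereas the sketch below only compares candidates against the reference on sampled 32-bit inputs, which can reject wrong candidates but cannot prove equivalence. All function names are hypothetical, and the reference function is the 2x + x example of Listing 1.

```python
import random

MASK32 = 0xFFFFFFFF  # emulate wrapping i32 arithmetic


def original(x):
    return (2 * x + x) & MASK32        # i32.const 2; i32.mul; i32.add


def variant_add(x):
    return ((x + x) + x) & MASK32      # multiplication replaced by an addition


def variant_broken(x):
    return ((x + x) + 1) & MASK32      # deliberately wrong candidate


def agrees_on_samples(reference, candidate, samples):
    """True if the candidate matches the reference on every sampled input."""
    return all(reference(x) == candidate(x) for x in samples)


def filter_variants(reference, candidates, n_random=10_000, seed=0):
    """Keep only candidates that agree with the reference on all samples.

    This is differential testing, not the symbolic proof used by CROW.
    """
    rng = random.Random(seed)
    samples = [0, 1, 2, 0x7FFFFFFF, 0x80000000, MASK32]  # boundary values
    samples += [rng.getrandbits(32) for _ in range(n_random)]
    return [c for c in candidates if agrees_on_samples(reference, c, samples)]


kept = filter_variants(original, [variant_add, variant_broken])
```

Here `variant_broken` agrees with the reference for x = 1 but differs for x = 0, so it is discarded, and only `variant_add` is kept.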

C. Multivariant combination

Step 2 of MEWE consists in combining the generated variants into a single binary. The goal is to support execution-path randomization at runtime. The core idea is to replace all diversified functions by a placeholder responsible for dispatching invocations to variants, called in this paper a dispatcher: a dispatcher function is a synthetic function that executes a variant at random. In other terms, MEWE transforms the original call graph into a multivariant call graph, defined as follows.

Definition 2: Multivariant Call Graph (MCG): A multivariant call graph is a call graph 〈N,E〉 where the nodes in N represent all the functions in the binary and an edge (f1, f2) ∈ E represents a possible invocation of f2 by f1 [18], and where the nodes are typed. The nodes in N have three possible types: a function present in the original program, a generated function variant, or a dispatcher function.

In Figure 3, we show the multivariant call graph of a multivariant binary generated with MEWE. The upper graph illustrates the original call graph and the graph below the multivariant one. The grey nodes represent function variants, the green nodes function dispatchers, and the yellow nodes the original functions. The calls are represented by the directed edges. In this case, MEWE generates 43 variants for the first yellow function, none for the second yellow function and three for the last yellow function.

The body of a dispatcher function follows the structure of Listing 4. The example in Listing 4 is the LLVM construction for the function dispatcher corresponding to the rightmost green node of Figure 3. It first calls the random generator, which returns a value that is then used to invoke a specific function variant. It should be noted that the dispatcher function is constructed using the same signature as the original function in order to facilitate the merging.

We did not implement the dispatcher functions with indirect calls, even though the constant time cost of these calls could improve the performance of the dispatchers. Instead, we implement the dispatchers with a switch-case structure to avoid indirect calls, which can be susceptible to speculative-execution-based attacks [9].
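The switch-case dispatcher can be mimicked in a few lines of Python. This is a sketch under stated assumptions: the functions `f_original`, `f_variant_0` and `f_variant_1` are hypothetical stand-ins for a diversified function, and we assume that the argument of `discriminate` is the number of alternatives, so that it returns a uniform integer in [0, n).

```python
import random

MASK32 = 0xFFFFFFFF


def f_original(x):
    return (2 * x + x) & MASK32


def f_variant_0(x):
    return (x + x + x) & MASK32


def f_variant_1(x):
    return ((x << 1) + x) & MASK32


def discriminate(n, rng=random):
    """Stand-in for the injected random generator (assumed: uniform in [0, n))."""
    return rng.randrange(n)


def f_dispatcher(x, rng=random):
    """Same signature as the original function; direct calls only, no call table."""
    choice = discriminate(3, rng)
    if choice == 0:
        return f_variant_0(x)
    if choice == 1:
        return f_variant_1(x)
    return f_original(x)  # default branch of the switch


rng = random.Random(1)
outputs = [f_dispatcher(14, rng) for _ in range(20)]
```

Whatever branch is taken, every call returns 42 for the input 14; only the executed function body differs from one invocation to the next.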

D. MEWE's Mixer

The MEWE mixer has four specific objectives: wrap functions as HTTP endpoints, link the LLVM multivariant binary, inject a random generator and merge all these components into a multivariant WebAssembly binary.

We use the Rustc compiler (footnote 1) to orchestrate the mixing because

Footnote 1: https://doc.rust-lang.org/rustc/what-is-rustc.html

[Figure 2: workflow diagram. LLVM original binary (function 1 ... function n), CROW multivariant generation (Step 1), LLVM multivariant binary (Step 2), MIXER with random generator and HTTP endpoint harness (Step 3), Wasm multivariant binary (Step 4).]

Fig. 2: Overview of MEWE. It takes as input the LLVM binary representation of a service composed of multiple functions. It first generates a set of functionally equivalent variants for each function in the binary and then generates an LLVM multivariant binary composed of all the function variants as well as dispatcher functions in charge of selecting a variant when a function is invoked. The MEWE mixer composes the LLVM multivariant binary with a random number generation library and an edge-specific HTTP harness, in order to produce a WebAssembly multivariant binary accessible through an HTTP endpoint and ready to be deployed to the edge.

[Figure 3: two call graphs. Legend: Original, Dispatcher, Variant.]

Fig. 3: Example of two static call graphs for the bin2base64 endpoint of libsodium. At the top, the original call graph; at the bottom, the multivariant call graph, which includes nodes that represent function variants (in grey), dispatchers (in green), and original functions (in yellow).

Rustc is a compiler able to merge custom Rust source code with arbitrary compatible LLVM binaries. For the generator, we rely on WASI's specification [19] for the random behavior of the dispatchers. Its exact implementation depends on the platform on which the binary is deployed. For the HTTP harnesses, since our edge computing use case is based on the Fastly infrastructure, we rely on the Fastly API (footnote 2) to transform our Wasm binaries into HTTP endpoints. The harness enables a function to be called through an HTTP request and to return an HTTP response. Throughout this paper, we refer to an endpoint as the closure of invoked functions when the entry point of the WebAssembly binary is executed.

E. Multivariant Binary Execution at the Edge

When a WebAssembly binary is deployed on an edge platform, it is translated to machine code on the fly. For our experiment, we deploy on the production edge nodes of Fastly. This edge computing platform uses Lucet, a native WebAssembly compiler and runtime, to compile and run the deployed Wasm binary (footnote 3). Lucet generates x86 machine code and ensures that the generated machine code executes inside a secure sandbox, controlling memory isolation.

Figure 4 illustrates the runtime behavior of the original and the multivariant binary, when deployed on an edge node. The topmost diagram illustrates the execution trace for the original bin2base64 endpoint. When the HTTP request with the input "HelloWorld!" is received, it invokes functions f1, f2 followed by 27 recursive calls of function f3. Then, the endpoint sends the result "0x000xccv0x10x00b3Jsx130x000x000x00xpopAHRvdGE=" of its base64 encoding in an HTTP response.

Footnote 2: https://docs.rs/crate/fastly/0.7.3

Footnote 3: https://github.com/bytecodealliance/lucet

define internal i32 @b64_byte_to_urlsafe_char(i32 %0) {
entry:
  %1 = call i32 @discriminate(i32 3)
  switch i32 %1, label %end [
    i32 0, label %case_b64_byte_to_urlsafe_char_43_
    i32 1, label %case_b64_byte_to_urlsafe_char_44_
  ]

case_b64_byte_to_urlsafe_char_43_:                ; preds = %entry
  %2 = call i32 @b64_byte_to_urlsafe_char_43_(i32 %0)
  ret i32 %2

case_b64_byte_to_urlsafe_char_44_:                ; preds = %entry
  %3 = call i32 @b64_byte_to_urlsafe_char_44_(i32 %0)
  ret i32 %3

end:                                              ; preds = %entry
  %4 = call i32 @b64_byte_to_urlsafe_char_original(i32 %0)
  ret i32 %4
}

Listing 4: Dispatcher function embedded in the multivariant binary of the bin2base64 endpoint of libsodium, which corresponds to the rightmost green node in Figure 3.

[Figure 4: three execution-trace diagrams between clients and the endpoint. Legend: Original, Dispatcher, Variant; HTTP request, call, return, HTTP response.]

Fig. 4: Top: an execution trace for the bin2base64 endpoint. Middle and bottom: two different execution traces for the multivariant bin2base64, exhibited by two different requests with exactly the same input.

The two diagrams at the bottom of Figure 4 illustrate two execution traces observed through two different requests to the endpoint bin2base64. In the first case, the request first triggers the invocation of dispatcher d1, which randomly decides to invoke the variant f1^2; then f2, which has not been diversified by MEWE, is invoked; then the recursive invocations of f3 are replaced by iterations over the execution of dispatcher d2 followed by a random choice of variants of f3. Eventually the result is computed and sent back as an HTTP response. The second execution trace of the multivariant binary shows the same sequence of dispatcher and function calls as the previous trace, and also shows that, for a different request, the selected variants of f1 and f3 are different.

The key insights from these figures are as follows. First, from a client's point of view, the switch from the original to a multivariant endpoint is completely transparent. Clients send the same data and receive the same result, through the same protocol, in both cases. Second, the figure shows that, at runtime, the execution paths for the same endpoint differ from one execution to another, and that this randomization results from multiple random choices among function variants, made throughout the execution of the endpoint.
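Both insights can be reproduced on a toy model. The Python sketch below is not the bin2base64 endpoint: the bodies of f1, f2 and f3 are made-up arithmetic, each diversified function has two hypothetical variants, and the trace is simply the list of names recorded at each function entrance.

```python
import random


def make_dispatcher(name, variants, trace, rng):
    """Build a dispatcher that records itself, then a randomly chosen variant."""
    def dispatcher(x):
        trace.append(name)
        label, body = rng.choice(variants)
        trace.append(label)
        return body(x)
    return dispatcher


def handle_request(x, rng, f3_calls=3):
    """Toy endpoint: d1 -> variant of f1, then f2, then repeated d2 -> variant of f3."""
    trace = []
    d1 = make_dispatcher(
        "d1", [("f1_1", lambda v: v + 1), ("f1_2", lambda v: 1 + v)], trace, rng)
    d2 = make_dispatcher(
        "d2", [("f3_1", lambda v: v * 2), ("f3_2", lambda v: v + v)], trace, rng)

    def f2(v):  # not diversified: no dispatcher in front of it
        trace.append("f2")
        return v - 3

    y = f2(d1(x))
    for _ in range(f3_calls):
        y = d2(y)
    return y, trace


rng = random.Random(42)
results = [handle_request(10, rng) for _ in range(50)]
outputs = {out for out, _ in results}          # what the client observes
traces = {tuple(t) for _, t in results}        # what happens inside the endpoint
```

Every request with input 10 returns 64, so the client cannot tell the executions apart, while the set of recorded traces contains several distinct sequences (at most 2 × 2^3 = 16 in this toy model).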

F. Implementation

The multivariant combination (Step 2) is implemented in 942 lines of C++ code. It uses the LLVM 12.0.0 libraries to extend the capabilities of the standard LLVM linker tool with the multivariant generation. MEWE's Mixer (Step 3) is implemented as an orchestration of rustc and the WebAssembly backend provided by CROW. For the sake of open science and to foster research on this important topic, the code of MEWE is made publicly available on GitHub: https://anonymous.4open.science/r/fastly4edge-C532 (footnote 4).

IV. EXPERIMENTAL METHODOLOGY

In this section we introduce our methodology to evaluate MEWE. First, we present our research questions and the services with which we evaluate the generation and the execution of multivariant binaries. Then, we detail the methodology for each research question.

A. Research questions

To evaluate the capabilities of MEWE, we formulate the following research questions:

RQ1: (Multivariant Generation) How much diversity can MEWE synthesize and embed in a multivariant binary? MEWE packages function variants in multivariant binaries. With this first question, we aim at measuring the amount of diversity that MEWE can synthesize in the call graph of a program.

RQ2: (Multivariant Preservation) To what extent does Fastly's WebAssembly compiler preserve the CROW diversification transformations? The Wasm binaries are translated to machine code to be executed on the edge nodes. We want to measure whether this translation stage can revert some of the transformations that produced function variants with CROW.

Footnote 4: This address will eventually be deanonymized.


RQ3: (Intra MTD) To what extent does MEWE achieve moving-target defense on an edge compute node? With this question we assess the ability of MEWE to produce binaries that actually exhibit random execution paths when executed on one edge node.

RQ4: (Inter MTD) To what extent does MEWE achieve moving-target defense over the worldwide Fastly infrastructure? We check the diversity of execution traces gathered from the execution of a multivariant binary. The traces are collected from all edge nodes in order to demonstrate the MTD at worldwide scale.

RQ5: What is the impact of the multi-version execution on performance? MEWE injects and embeds a multivariant behavior in the created binaries. The injected code might degrade the performance of the service, by increasing the size of the binary and extending the execution time. We measure to what extent MEWE affects the original performance of edge-deployed services.

The core of the validation methodology for our tool MEWE consists in building multivariant binaries for several relevant endpoints and in deploying and executing them on the Fastly edge-cloud platform.

B. Study subjects

We select two mature and typical edge-cloud computing projects to study the feasibility of MEWE. The projects are selected based on: suitability for diversity synthesis with CROW (it should be possible to collect the project modules in LLVM intermediate representation), suitability for deployment on the Fastly infrastructure (the project should be easily portable to Wasm/WASI and compatible with the Rust Fastly API), and the possibility to collect their runtime execution information (the endpoints should execute in a reasonable time of at most 1 second, even with the overhead of instrumentation). The selected projects are: libsodium, an encryption, decryption, signature and password hashing library which can be ported to WebAssembly, and qrcode-rust, a QrCode and MicroQrCode generator written in Rust.

Name         #Endpoints  #Functions  #Instr.  Repository
libsodium    5           62          6187     https://github.com/jedisct1/libsodium
qrcode-rust  2           1840        127700   https://github.com/kennytm/qrcode-rust

TABLE I: Selected projects to evaluate MEWE: project name; the number of endpoints in the project that we consider for our experiments; the total number of functions that implement the endpoints; and the total number of WebAssembly instructions in the original binaries. This metadata is extracted from the Wasm binaries before they are deployed at the edge; thus, the number of functions might differ from the one in the source code.

In Table I, we summarize some key metrics that capture the relevance of the selected projects. The table shows the project name with its repository address, the number of selected endpoints for which we build multivariant binaries, the total number of functions included in the endpoints and the total number of Wasm instructions in the original binary. Notice that the metadata is extracted from the Wasm binaries before they are sent to the edge-cloud computing platform; thus, the number of functions might differ from the one obtained by static analysis of the project source code.

C. Experimentation platform

We run all our experiments on the Fastly edge computing platform. We deploy and execute the original and the multivariant endpoints on 64 edge nodes located around the world (footnote 5). These edge nodes usually have an arbitrary and heterogeneous composition in terms of architecture and CPU model. The deployment procedure is the same as described in Section II-B. The developers implement and compile their services to WebAssembly. In the case of Fastly, the WebAssembly binaries need to be implemented according to the Fastly platform API specification. This specification states how a service needs to be implemented in order to deal with HTTP requests. When the compiled binary is on the Fastly side, it is translated to x86 machine code with Lucet, which ensures the isolation of the service.

D. RQ1 Multivariant diversity

We run MEWE on each function of our 7 endpoints. In this experiment, we bound the search for function variants with a timeout of 5 minutes per function. This produces one multivariant binary for each endpoint. To answer RQ1, we measure the number of function variants embedded in each multivariant binary, as well as the number of execution paths that are added to the multivariant call graphs thanks to the function variants.
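As an illustration of how variants multiply execution paths, the following Python sketch builds a multivariant call graph from an original call graph and counts root-to-leaf paths. It rests on simplifying assumptions that are ours: the call graph is acyclic (recursion would need a bound), a path is a root-to-leaf chain of calls, the dispatcher keeps the name of the function it replaces, and names such as `f1_v0` are hypothetical.

```python
from functools import lru_cache


def build_multivariant_graph(call_graph, variant_counts):
    """Replace each diversified function by a dispatcher calling its bodies."""
    mcg = {}
    for fn, callees in call_graph.items():
        n = variant_counts.get(fn, 0)
        if n == 0:
            mcg[fn] = list(callees)          # original function, untouched
            continue
        bodies = [f"{fn}_original"] + [f"{fn}_v{i}" for i in range(n)]
        mcg[fn] = bodies                     # dispatcher node
        for body in bodies:
            mcg[body] = list(callees)        # each body keeps the same callees
    return mcg


def count_paths(graph, entry):
    """Count root-to-leaf paths in an acyclic call graph."""
    @lru_cache(maxsize=None)
    def paths_from(node):
        callees = graph.get(node, [])
        if not callees:
            return 1
        return sum(paths_from(c) for c in callees)
    return paths_from(entry)


original_graph = {"main": ["f1"], "f1": ["f2"], "f2": ["f3"], "f3": []}
mcg = build_multivariant_graph(original_graph, {"f1": 2, "f3": 3})

paths_before = count_paths(original_graph, "main")
paths_after = count_paths(mcg, "main")
```

In this example, f1 has three possible bodies (the original plus two variants) and f3 has four, so the single path of the original graph becomes 3 × 4 = 12 paths in the multivariant graph.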

E. RQ2 Diversity Preservation After Compilation

After a WebAssembly binary is uploaded to the edge platform, it is translated to machine code. In the case of Fastly, Lucet's code generator processes the WebAssembly binary to generate x86 machine code. This machine code is the one executed every time a user requests the service. Lucet performs some normalization and optimization passes when translating from Wasm to x86. Thus, some variants synthesized by CROW might not be preserved, i.e., Lucet could generate the same machine code for two Wasm variants. In RQ2, we check whether these translation passes preserve the variants synthesized by CROW.

We retrieve the multivariant call graph for both code representations, Wasm and x86. Then, we compute the number of possible execution paths in each version of the graph. When we calculate the possible execution paths in the multivariant call graph, we bound the number of times a loop can be taken in a path.

By comparing the values from both representations, we measure the preservation of the Wasm variants after Lucet translates them to x86 machine code. It should be noted that we use the most aggressive optimizations possible in Lucet. This way, we know that the variants that are preserved in the produced multivariant binaries will always be preserved, even with less aggressive optimizations.
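One simple way to picture a non-preserved variant is two Wasm variants whose compiled bodies are byte-for-byte identical. The Python sketch below groups variants by a hash of their machine code; it is only an illustration of that idea, not the path-counting comparison described above, and the byte strings are made-up placeholders rather than real Lucet output.

```python
import hashlib


def group_by_machine_code(compiled):
    """Group variant names whose compiled bodies are byte-for-byte identical."""
    groups = {}
    for name, code in compiled.items():
        digest = hashlib.sha256(code).hexdigest()
        groups.setdefault(digest, []).append(name)
    return sorted(groups.values())


# Placeholder bytes (assumption): f_v0 collapses onto the original after
# compilation, f_v1 stays distinct.
compiled = {
    "f_original": b"\x01\x02\x03",
    "f_v0": b"\x01\x02\x03",
    "f_v1": b"\x04\x05",
}

groups = group_by_machine_code(compiled)
distinct_bodies = len(groups)
```

In this made-up case, three Wasm bodies yield two distinct machine-code bodies, so one variant would not contribute any diversity at the x86 level.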

F. RQ3 Intra MTD

We deploy the multivariant binaries of each of the 7 endpoints presented in Table II on the 64 edge nodes of Fastly. We execute each endpoint multiple times on each node, to measure the diversity of execution traces that are exhibited

Footnote 5: The number of nodes provided in the whole platform is 72; we decided to keep only the 64 nodes that remained stable during our experimentation.

by the multivariant binaries. We have a time budget of 48 hours for this experiment. Within this timeframe, we can query each endpoint 100 times on each node. Each query on the same endpoint is performed with the same input value. This guarantees that, if we observe different traces for different executions, it is due to the presence of multiple function variants. The input values are available as part of our reproduction package.

For each query, we collect the execution trace, i.e., the sequence of function names that have been executed when triggering the query. To observe these traces, we instrument the multivariant binaries to record each function entrance.

To answer RQ3, we measure the number of unique execution traces exhibited by each multivariant binary, on each separate edge node. To compare the traces, we hash them with the sha256 function. We then calculate the number of unique hashes among the 100 traces collected for an endpoint on one edge node. We formulate the following definitions to construct the metric for RQ3.

Metric 1: Unique traces: R(n, e). Let S(n, e) = {T1, T2, ..., T100} be the collection of 100 traces collected for one endpoint e on an edge node n, H(n, e) the collection of hashes of each trace and U(n, e) the set of unique trace hashes in H(n, e). The uniqueness ratio of traces collected for edge node n and endpoint e is defined as

$$R(n, e) = \frac{|U(n, e)|}{|H(n, e)|}$$
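A minimal Python sketch of Metric 1 follows. The trace serialization (function names joined by newlines before hashing) is an assumption made for the example, and the four short traces are made up; in the experiment, each collection contains 100 traces.

```python
import hashlib


def trace_hash(trace):
    """sha256 of a trace; joining names with newlines is an assumed encoding."""
    return hashlib.sha256("\n".join(trace).encode("utf-8")).hexdigest()


def uniqueness_ratio(traces):
    """R(n, e) = |U(n, e)| / |H(n, e)| for the traces of one node and endpoint."""
    hashes = [trace_hash(t) for t in traces]   # H(n, e): one hash per trace
    unique = set(hashes)                       # U(n, e): distinct hashes
    return len(unique) / len(hashes)


# Made-up traces: the first and third are identical.
traces = [
    ["d1", "f1_2", "f2"],
    ["d1", "f1_7", "f2"],
    ["d1", "f1_2", "f2"],
    ["d1", "f1_3", "f2"],
]

ratio = uniqueness_ratio(traces)
```

Three of the four traces are distinct, so the ratio is 3/4 = 0.75; a ratio of 1 would mean that no trace was observed twice on that node.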

The inputs that we pass to execute the endpoints at the edge and the received output for all executions are available in the reproduction repository at https://anonymous.4open.science/r/fastly4edge-C532.

G. RQ4 Inter MTD

We answer RQ4 by calculating the normalized Shannon entropy for all collected execution traces for each endpoint. We define the following metric.

Metric 2: Normalized Shannon entropy: E(e). Let e be an endpoint, and let $C(e) = \bigcup_{n=0}^{64} H(n, e)$ be the union of all trace hashes for all edge nodes. The normalized Shannon entropy for the endpoint e over the collected traces is defined as:

$$E(e) = \frac{-\sum_{x} p_x \log(p_x)}{\log(|C(e)|)}$$

Where $p_x$ is the discrete probability of the occurrence of the hash x over C(e).

Notice that we normalize the standard definition of the Shannon Entropy by using the perfect case where all trace hashes are different. This normalization allows us to compare the calculated entropy between endpoints. The value of the metric can go from 0 to 1:

• 0 for the worst entropy: the endpoint always follows the same path, independently of the edge node and of the number of times the trace is collected on the same node.

• 1 for the best entropy: each edge node executes a different path every time the endpoint is requested.

The Shannon Entropy gives the uncertainty in the outcome of a sampling process. If a specific trace appears with a high frequency in one part of the sampling, then it is very likely that this trace will also appear in the other part of the sampling.

We calculate the metric for the 7 endpoints, for 100 traces collected from 64 edge nodes, for a total of 6400 collected traces per endpoint. Each trace is collected in a round robin strategy, i.e., the traces are collected from the 64 edge nodes sequentially. For example, we collect the first trace from all nodes before continuing to the collection of the second trace. This process is followed until 100 traces are collected from all edge nodes.
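Metric 2 can be sketched in a few lines of Python. This is an illustrative reading of the definition, not the authors' analysis script: it assumes that C(e) is handled as the full list of collected hashes (6400 per endpoint), so that the normalizer is the entropy of the perfect case in which all hashes differ. The function name is hypothetical.

```python
import math
from collections import Counter

def normalized_entropy(hashes):
    """Normalized Shannon entropy of a list of trace hashes.

    Assumption: the normalizer log(|C(e)|) uses the total number of
    collected hashes, i.e., the entropy when every hash is different.
    """
    total = len(hashes)
    if total < 2:
        return 0.0  # a single observation carries no uncertainty
    counts = Counter(hashes)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(total)

same = normalized_entropy(["h0"] * 8)                         # always the same trace
distinct = normalized_entropy([f"h{i}" for i in range(8)])    # all traces differ
```

The two calls correspond to the two extreme cases listed above: identical traces everywhere, and a different trace for every request.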

H. RQ5 Performance

For each endpoint listed in Table II, we measure the performance impact of the multivariant binaries created by MEWE with the following metric:

Metric 3: Execution time: For a deployed binary, the execution time is the backend-space time between the reception of the user request and the return of the function result as the response. By backend-space, we mean that the time is measured directly in the edge node, i.e., we instrument the code to measure the time for the endpoint execution.

We collect 100k execution times for each binary type, original and multivariant. We perform a Mann-Whitney U test [20] to compare both execution time distributions. If the P-value is lower than 0.05, we consider that the distributions are different.
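To make the statistical comparison concrete, here is a minimal, self-contained sketch of a two-sided Mann-Whitney U test. It is an assumption-laden illustration rather than the procedure used in the study: it relies on the normal approximation, assigns average ranks to ties, and omits the tie correction of the variance. The function name and the sample values are hypothetical.

```python
import math

def mann_whitney_u(a, b):
    """Return (U statistic of sample a, two-sided p-value).

    Normal approximation without tie correction; suitable only as a
    sketch for reasonably large samples.
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value

# Hypothetical execution times (ns): clearly separated distributions.
original_times = list(range(100))
multivariant_times = list(range(1000, 1100))
u_stat, p = mann_whitney_u(original_times, multivariant_times)
```

A P-value below 0.05 for a pair of samples would lead to the conclusion stated above, namely that the two execution time distributions differ.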

V. EXPERIMENTAL RESULTS

A. RQ1 Results. Multivariant generation

We use MEWE to generate a multivariant binary for each of the 7 endpoints included in our 2 study subjects. We then calculate the number of diversified functions in each endpoint, as well as how they combine to increase the number of possible execution paths in the static call graph for the original and the multivariant binaries.

The sections 'Original binary' and 'Multivariant WebAssembly binary' of Table II summarize the key data for RQ1. In the 'Original binary' section, the first column (#Functions) gives the number of functions in the original binary and the second column (#Paths) gives the number of possible execution paths in the original static call graph. The 'Multivariant WebAssembly binary' section first shows the number of each type of node in the multivariant call graph: #Non div. is the number of original functions that could not be diversified by MEWE, #Dispatchers is the number of dispatcher nodes generated by MEWE (one for each function that was successfully diversified), and #Variants is the total number of function variants generated by MEWE. The last column of this section is the number of possible execution paths in the static multivariant call graph.

For all 7 endpoints, MEWE was able to diversify several functions and to combine them in order to increase the number of possible execution paths by one to several orders of magnitude. For example, in the case of the encrypt function of libsodium, the original binary contains 23 functions that can be combined in 4 different paths. MEWE generated a total of 56 variants for 5 of the 23 functions. These variants, combined with the 18 original functions in the multivariant call graph, form 325 execution paths. In other words, the number of possible ways to achieve the same encryption function has increased from 4 to 325, including dispatcher nodes that are in charge of randomizing the choice of variants at 5 different locations of the control flow graph. This increased number of possible paths, combined with random choices made at runtime, paves the way for turning the observable behavior of each endpoint




              Original binary       Multivariant WebAssembly binary                x86 code
Endpoint      #Functions  #Paths    #Non div.  #Dispatchers  #Variants  #Paths     #Variants  #Paths     PP    PV
libsodium
encrypt       23          4         18         5             56         325        46         253        0.83  0.77
decrypt       20          3         16         5             49         84         41         78         0.93  0.83
random        8           2         6          2             238        12864      225        12314      0.96  0.94
invert        8           2         6          2             125        2784       121        2538       0.91  0.96
bin2base64    3           2         2          2             47         172        41         41         0.25  0.87
qrcode-rust
qr_str        982         688*10^6  965        17            2092       97*10^12   2040       20*10^12   0.21  0.97
qr_image      858         1.4*10^6  843        15            2063       3*10^9     2013       617*10^6   0.17  0.97

TABLE II: Static diversity generated by MEWE, measured on the static call graphs of the WebAssembly binaries, and the preservation of this diversity after translation to machine code. The table is structured as follows: Endpoint name; number of functions and number of possible paths in the original WebAssembly binary call graph; number of non diversified functions, number of created dispatchers (one per diversified function), total number of function variants and number of execution paths in the multivariant WebAssembly binary call graph. The x86 section shows the number of preserved variants, the number of preserved execution paths, the ratio of preserved variants and the ratio of preserved paths in the multivariant x86 binary. For the sake of simplicity, we use short names for the endpoints; the real names can be found in the reproduction repository.

into a moving target. This moving target increases the effort a potential attacker needs to guess what variant is executed and hence what vulnerability she can exploit.

We have observed that there is no linear correlation between the number of diversified functions, the number of generated variants and the number of execution paths. We have manually analyzed the endpoint with the largest number of possible execution paths in the multivariant Wasm binary: qr_str of qrcode-rust. MEWE generated a total of 2092 variants for this endpoint. Moreover, calling this endpoint involves the invocation of 17 dispatchers in a complex control flow structure, and for each of them MEWE includes between 3 and 428 variants. If the original execution path contains functions for which MEWE is able to generate variants, then there is a combinatorial explosion in the number of execution paths for the generated Wasm multivariant module. The increase in the number of possible execution paths theoretically augments the uncertainty about which one is performed, in the latter case by a factor of approx. 140 000. As Cabrera and colleagues observed [17] for CROW, a large presence of loops and arithmetic operations in the original function code leads to more diversification.

Looking at the #Dispatchers and #Variants columns of the 'Multivariant WebAssembly binary' section of Table II, we notice that the number of variants generated per function greatly varies. For example, for both the invert and the bin2base64 functions of Libsodium, MEWE manages to diversify 2 functions (reflected by the presence of 2 dispatcher nodes in the multivariant call graph). Yet, MEWE generates a total of 125 variants for the 2 functions in invert, and only 47 variants for the 2 functions in bin2base64. The main reason for this is related to the complexity of the diversified functions: the more complex a function is in terms of control flow, the larger its number of variants.

Column #Non div. of the 'Multivariant WebAssembly binary' section of Table II indicates that, in each endpoint, there exists a number of functions for which MEWE did not manage to generate variants. We identify three reasons for this, related to the diversification procedure of CROW, used by MEWE to diversify individual functions. First, some functions cannot be diversified by CROW, e.g., functions that wrap only memory operations, which are oblivious to CROW's diversification technique. Second, the complexity of the function directly affects the number of variants that CROW can generate. Third, the diversification procedure of CROW is essentially a search procedure, whose results are directly impacted by the time budget for the search. In all experiments we give CROW 5 minutes maximum to synthesize function variants, which is a low budget for many functions. It is important to notice that the successful diversification of some functions in each endpoint, and their combination within the control flow graph of the endpoint, dramatically increases the number of possible paths that can be used for a moving target defense.

Answer to RQ1

MEWE dramatically increases the number of possible execution paths in the multivariant WebAssembly binary of each endpoint. The large number of possible execution paths, combined with multiple points of random choice in the multivariant call graph, paves the way for a moving target defense that prevents an attacker from guessing which path will be taken at runtime.

B. RQ2 Results. Compile-time diversity preservation

We translate each WebAssembly multivariant binary with Lucet, to determine the impact of this translation to machine code on the function variants and the diversity of paths in the multivariant call graph.

The 'x86 code' section of Table II summarizes the key data to answer RQ2. Column #Variants shows the number of preserved variants in the x86 code of each endpoint, and column #Paths shows the number of possible paths in the x86 multivariant binary. The last two columns show the ratio of paths (PP) and the ratio of variants (PV) that are preserved between the Wasm and the x86 call graphs. Notice that the path preservation ratio is a projection of the variant preservation and the control flow in the multivariant binary.

In all cases, more than 77% of the individual function variants present in the multivariant Wasm binary are preserved in the x86 multivariant. This high preservation rate for function variants makes it possible to preserve a large ratio of possible paths in the


multivariant call graph. In 4 out of 7 cases, more than 83% of the possible execution paths in the multivariant Wasm binary are preserved. The translation to machine code preserves 21% and 17% of the possible paths for qr_str and qr_image. Yet, the x86 version of the multivariant call graph for these endpoints still includes millions of possible paths with 17 and 15 randomization points in the control flow graph. The translation to machine code drastically reduces the potential for randomized execution paths only for bin2base64, for which it preserves only 25% of the possible paths, for a total of 41 paths.

We have identified why some variants are not preserved through the translation from Wasm to x86. Lucet performs optimization passes before generating machine code. In some cases, this can annihilate the effect of CROW's diversification transformation. For example, in Listing 5, CROW synthesizes the variant in the right column by splitting the original multiplication into two multiplications that rely on the integer overflow mechanism. A constant merging optimization pass could remove the constant multiplications by performing them at compilation time. The other transformation cases that we have observed have the same property: the transformations are simple enough to be quickly verified at compilation time.

    i32.const -10            i32.const -1931544174
    i32.mul                  i32.mul
                             i32.const 109653155
                             i32.mul

Listing 5: Two examples of block variants that are functionally equivalent and implemented with different WebAssembly instructions. The variant on the right, generated by MEWE, is not preserved through the translation to machine code.
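The equivalence of the two blocks in Listing 5 can be checked with a short Python sketch that emulates 32-bit wrapping multiplication. The helper names are hypothetical; only the three constants come from the listing. Since i32.mul is computed modulo 2^32, the two constants of the variant multiply to -10 once wrapped, which is exactly what a constant merging pass can exploit.

```python
def to_i32(x):
    """Wrap a Python integer to a signed 32-bit value, as i32 arithmetic does."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def i32_mul(a, b):
    """Emulate WebAssembly i32.mul (multiplication modulo 2^32)."""
    return to_i32(a * b)

def original(x):
    # i32.const -10 ; i32.mul
    return i32_mul(x, -10)

def variant(x):
    # i32.const -1931544174 ; i32.mul ; i32.const 109653155 ; i32.mul
    return i32_mul(i32_mul(x, -1931544174), 109653155)

samples = [0, 1, -1, 7, 123456789, 2**31 - 1, -2**31]
all_equal = all(original(x) == variant(x) for x in samples)
merged_constant = to_i32(-1931544174 * 109653155)
```

Here `merged_constant` is the single constant that a compiler obtains when it folds the two multiplications of the variant into one.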

We identified where the optimizations are done in Lucet's compiler (see footnote 6). It performs optimization-like transformations that are simpler than the ones introduced by CROW. Based on this result, we also advise against inserting nop instructions, either in Wasm or in machine code: nop operations could be easily detected and removed by a later optimization stage.

Moreover, the last three endpoints have a path preservation ratio of at most 0.25, even though more than 87% of their individual function variants are preserved. This is explained by the fact that the number of possible paths is related both to the number of variants and to the complexity of the call graph. The example in Figure 5 illustrates this phenomenon. Suppose an original binary composed of three functions with the call graph illustrated at the top of the figure. Here, we count 2 possible paths (one with no iteration, and one with a single iteration). MEWE generates 2 variants for f2 and 4 variants for f3; the multivariant Wasm call graph is illustrated at the center of the figure. The number of possible execution paths increases to 40. In the translation process, Lucet transforms the two WebAssembly function variants for f2 into the same x86 function. In this case, the number of possible execution paths in the x86 multivariant call graph is reduced by a factor of 2,

6 https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/preopt.peepmatic and https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/postopt.rs

[Figure 5: three call graphs. Original Wasm binary: f1, f2, f3 (2 paths). Multivariant Wasm binary: f1; f2 variants f21, f22; f3 variants f31, f32, f33, f34 (40 paths). Multivariant x86 binary: f1; f21; f31, f32, f33, f34 (20 paths).]

Fig. 5: From top to bottom: original binary call graph, multivariant WebAssembly binary call graph generated by MEWE and the multivariant x86 binary call graph resulting from Lucet's compilation to x86 machine code. This example illustrates that if only one variant is not preserved through the translation to x86, the number of paths significantly decreases.

from 40 to 20. However, the number of variants decreases only by 1. The complexity of the call graph has a major impact on the number of possible execution paths.
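The path counts of the Figure 5 example (2, 40 and 20) can be reproduced with a small Python sketch. It assumes that the loop of the example re-invokes f3, so that the two original paths are f1, f2, f3 and f1, f2, f3, f3; this reading is an assumption chosen because it matches the reported counts. The function and variable names are hypothetical.

```python
from math import prod

def count_paths(paths, variants):
    """Count execution paths when each call may pick any variant of its function.

    paths: call sequences of the original binary.
    variants: number of variants per function (1 when not diversified).
    """
    return sum(prod(variants.get(f, 1) for f in path) for path in paths)

# Assumed original paths: no iteration, and a single extra iteration of f3.
original_paths = [("f1", "f2", "f3"), ("f1", "f2", "f3", "f3")]

wasm_original = count_paths(original_paths, {})
wasm_multivariant = count_paths(original_paths, {"f2": 2, "f3": 4})
x86_multivariant = count_paths(original_paths, {"f2": 1, "f3": 4})
```

Losing a single variant of f2 halves every product in the sum, which is why the path preservation ratio can be much lower than the variant preservation ratio.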

Answer to RQ2

The translation from WebAssembly to machine code through Lucet preserves a high ratio of function variants. This leads to the preservation of high numbers of possible execution paths in the multivariant binaries. Our moving target defense architecture is appropriate for the state-of-the-art runtime of edge computing nodes.

C. RQ3 results. Intra MTD

To answer RQ3, we execute the multivariant binaries of each endpoint on the Fastly edge-cloud infrastructure. We execute each endpoint 100 times on each of the 64 Fastly edge nodes. All the executions of a given endpoint are performed with the same input. This allows us to determine if the execution traces are different due to the injected dispatchers and their random behavior. After each execution of an endpoint, we collect the sequence of invoked functions, i.e., the execution trace. Our intuition is that the random dispatchers combined with the function variants embedded in a multivariant binary are very likely to trigger different traces for the same execution, i.e., when an endpoint is executed several times in a row with the same input and on the same edge node. The way both the function variants and the dispatchers contribute to exhibiting different execution traces is illustrated in Figure 4.

Figure 6 shows the ratio of unique traces exhibited by each endpoint, on each of the 64 separate edge nodes. The X axis corresponds to the edge nodes. The Y axis gives the name of the endpoint. In the plot, for a given (x, y) pair, there is a blue point on the Z axis representing Metric 1 over 100 execution traces.


For all edge nodes, the ratio of unique traces is above 0.38. In 6 out of 7 cases, we have observed that the ratio is remarkably high, above 0.9. These results show that MEWE generates multivariant binaries that can randomize execution paths at runtime, in the context of an edge node. The randomization dispatchers, associated with a significant number of function variants, greatly reduce the certainty about which computation is performed when running a specific endpoint with a given input value.

Let us illustrate the phenomenon with the endpoint invert. The endpoint invert receives a vector of integers and returns its inversion. Passing a vector of integers with 100 elements as input, I = [100, ..., 0], results in output O = [0, ..., 100]. When the endpoint executes 100 times with the same input on the original binary, we observe 100 times the same execution trace. When the endpoint is executed 100 times with the same input I on the multivariant binary, we observe between 95 and 100 unique execution traces, depending on the edge node. Analyzing the traces, we observe that they include only two invocations to a dispatcher, one at the start of the trace and one at the end. The remaining events in the trace are fixed each time the endpoint is executed with the same input I. Thus, the maximum number of possible unique traces is the product of the number of variants for each dispatcher, in this case 27 × 94 = 2538. The probability of observing the same trace is 1/2538.

For multivariant binaries that embed only a few variants, like in the case of the bin2base64 endpoint, the ratio of unique traces per node is lower than for the other endpoints. With the input we pass to bin2base64, the execution trace includes 57 function calls. Only one of these calls is an invocation to a dispatcher, which can select among 41 variants. Thus, the probability of having the same execution trace twice is 1/41. Meanwhile, qr_str embeds thousands of variants, and the input we pass triggers the invocation of 3M functions, for which 210666 random choices are taken relying on 17 dispatchers. Consequently, the probability of observing the same trace twice is infinitesimal. Indeed, all the executions of qr_str are unique, on each separate edge node. This is shown in Figure 6, where the ratio of unique traces is 1 on all edge nodes.
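The link between the number of possible traces and the observed uniqueness ratio can be approximated with a short Python sketch. It assumes, as a simplification that the study does not state, that each of the k possible traces is equally likely and that executions are independent; under this model the expected number of distinct traces among m executions is k(1 - (1 - 1/k)^m). The function name is hypothetical.

```python
def expected_unique_ratio(k, m):
    """Expected ratio of distinct traces among m executions,
    assuming k equally likely traces and independent executions."""
    expected_distinct = k * (1.0 - (1.0 - 1.0 / k) ** m)
    return expected_distinct / m

# 41 possible traces (bin2base64) and 2538 possible traces (invert),
# each sampled 100 times on one edge node.
ratio_bin2base64 = expected_unique_ratio(41, 100)
ratio_invert = expected_unique_ratio(2538, 100)
```

Under these assumptions the model gives a ratio of about 0.38 for 41 possible traces and about 0.98 for 2538 possible traces, which is in line with the orders of magnitude reported for bin2base64 and invert.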

Answer to RQ3

Repeated executions of a multivariant binary with the same input on an individual edge node exhibit diverse execution traces. MEWE successfully embeds execution path randomization within the multivariant binaries, turning them into moving targets on individual edge nodes.

D. RQ4 results. Internet MTD

To answer RQ4, we build the union of all the execution traces collected on all edge nodes for a given endpoint. Then, we compute the normalized Shannon Entropy over this set for each endpoint (Metric 2). Our goal is to determine whether the diversity of execution traces we observed on individual nodes in RQ3 actually generalizes to the whole edge-cloud infrastructure. Depending on many factors, such as the random number generator or a bug in the dispatcher, it could happen that we observe different traces on individual nodes, but that the set of traces is the same on all nodes. With RQ4 we assess

[Figure 6: 3D plot. X axis: edge nodes n0 to n63. Y axis: endpoints (bin2base64, decrypt, encrypt, invert, random, qr_str, qr_image). Z axis: unique traces ratio, from 0.00 to 1.00.]

Fig. 6: Ratio of unique execution traces for each endpoint on each edge node. The X axis illustrates the edge nodes. The Y axis annotates the name of the endpoint. In the plot, for a given (x, y) pair, there is a blue point representing the Metric 1 value in a set of 100 collected execution traces.

the ability of MEWE to provide a moving target at a global scale.

Table III provides the data to answer RQ4. The second column gives the normalized Shannon Entropy value (Metric 2). Columns 3 and 4 give the median and the standard deviation for the length of the execution traces. Columns 5 and 6 give the number of dispatchers that are invoked during the execution of the endpoint (#ED) and the total number of invocations of these dispatchers (#RCh). These last two columns indicate to what extent the execution paths are actually randomized at runtime. The endpoints invert and random take the same number of random choices. However, the number of variants to choose from is larger in random; thus, its entropy is larger than that of invert.

Overall, the normalized Shannon Entropy is at least 0.42 for all endpoints. This is evidence that the multivariant binaries generated by MEWE can indeed exhibit a high degree of execution trace diversity, while keeping the same functionality. The number of random choices taken along the execution paths (#RCh) is at the core of these high entropy values. For example, the 6400 executions of the encrypt endpoint trigger a total of 4M random choices among the different function variants embedded in the multivariant binaries. Such a high degree of randomization is essential to generate very diverse execution traces.

The bin2base64 endpoint has the lowest level of diversity. As discussed in RQ3, this endpoint is the one that has the fewest variants and its execution path can be randomized only at one point. The low level of unique traces observed on individual nodes is reflected at the system-wide scale with a globally low entropy.

For both qr_str and qr_image the entropy value is 1.0. This means that all the traces that we observe for all the executions of these endpoints are different from each other.


Endpoint      Entropy  MTL       σ   #ED   #RCh
libsodium
encrypt       0.87     816       0   5     4M
decrypt       0.96     440       0   5     2M
random        0.98     15        5   2     12800
invert        0.87     7343      0   2     12800
bin2base64    0.42     57        0   1     6400
qrcode-rust
qr_str        1.00     3045193   0   17    1348M
qr_image      1.00     3015450   0   15    1345M

TABLE III: Execution trace diversity over the Fastly edge-cloud computing platform. The table is formed of 6 columns: the name of the endpoint, the normalized Shannon Entropy value (Metric 2), the median size of the execution traces (MTL), the standard deviation for the trace lengths (σ), the number of executed dispatchers (#ED) and the number of total random choices taken during all the 6400 executions (#RCh).

In other words, someone who runs these services over and over with the same input cannot know exactly what code will be executed in the next execution. These very high entropy values are made possible by the millions of random choices that are made along the execution paths of these endpoints.

While there is a high degree of diversity among the traces exhibited by each endpoint, they all have the same length, except in the case of random. This means that the entropy is a direct consequence of the invocations of the dispatchers. The random endpoint naturally has a non-deterministic behavior. Meanwhile, we observe several calls to dispatchers during the execution of its multivariant binary, which indicates that MEWE can amplify the natural diversity of traces exhibited by random. For each endpoint, we managed to trigger all dispatchers during its execution. There is a correlation between the entropy and the number of random choices (column #RCh) taken during the execution of the endpoints. For a high number of dispatchers, and therefore random choices, the entropy is large, as the cases of qr_str and qr_image show. The contrary happens with bin2base64, whose multivariant binary executes only one dispatcher.

Answer to RQ4

At the internet scale of the Edge platform, the multivariant binaries synthesized by MEWE exhibit a massive diversity of execution traces, while still providing the original service. It is virtually impossible for an attacker to predict which execution path is taken for a given query.

E. RQ5 results. Performance

For each endpoint listed in Table II, we compare the execution time distributions for the original binary and the multivariant binary. In Table IV we show the execution time for the original endpoints and their corresponding multivariant binaries. The table is structured in two sections. The first section shows the endpoint name, and the median and standard deviation for the execution time of the original endpoint. The second section shows the median and the standard deviation for the execution time of the multivariant binary.

              Original bin.             Multivariant Wasm bin.
Endpoint      Median (ns)   σ           Median (ns)   σ
libsodium
encrypt       7976          5263        217341        43905
decrypt       13175         6508        225529        47786
random        16511         7857        232509        53713
invert        119917        34721       341420        65830
bin2base64    10751         5373        215253        35315
qrcode-rust
qr_str        3117575       418539      492606388     36864122
qr_image      3091728       412912      512669965     41718087

TABLE IV: Execution time for the original endpoints and their multivariant binaries. The table is structured in two sections. The first section shows the endpoint name, the median execution time and its standard deviation for the original endpoint. The second section shows the median execution time and its standard deviation for the multivariant WebAssembly binary.

In all cases, the multivariant binary executes slower than the original. The highest increase is observed for the endpoint qr_image, where the median value for the execution time goes from 3091728 ns to 512669965 ns. This result is in line with the large number of random choices taken during the execution. While the relative increase is important, the absolute values (given in nanoseconds in Table IV) are acceptable for the user experience. Let us consider the performance evaluation performed by Fastly as a baseline [21], [22]. They show that a Markdown to HTML conversion service running on their edge platform returns a response in less than 100 ms, allowing one request for every single keystroke. In this context, all the multivariant binaries for Libsodium match the baseline and still support requests at the speed of keystrokes. The multivariant binaries for QR encoding respond in a reasonable time for end users, i.e., in about half a second, but do not match the baseline. Finding a good trade-off between randomization and response time is part of our future work.

The standard deviations indicate that the execution time distributions considerably change from the original binary to the multivariant one (P-value < 0.05 with a Mann-Whitney U test). In particular, in all cases, the distributions for multivariant execution times are spread over a larger range of values than those of the original binary. This phenomenon is a direct consequence of the execution path randomization. The choice of function variant is randomized at each function invocation, and the variants have different execution times as a consequence of the code transformations, i.e., some variants execute more instructions than others. Consequently, the time for one endpoint execution is randomized and the distribution of a series of endpoint executions is more spread with the multivariant binary than with the original.


Answer to RQ5

The execution times of the multivariant binaries are realistic and practical from a user experience perspective, but slower than the original. The execution time distributions are different between the original binary and the multivariant binary, which contributes to the moving target, with less predictable execution times than the original binary.

VI. RELATED WORK

Our work is in the area of software diversification for security [23], [24] and is related to the recent work on execution path randomization [25]. Davi et al. proposed Isomeron [25], an approach for execution-path randomization. Isomeron creates code variants through transformations on the original program with ASLR techniques. The main idea of their randomization stage is to relocate and change ROP gadgets that could be used in potential attacks. Isomeron was successfully evaluated using the insertion of nop instructions at the basic block level for randomization. Isomeron simultaneously loads two copies of a program code, the original and a variant. While the program is running, Isomeron continuously flips a coin to decide which copy of the program should be executed next at the level of function calls. With this strategy a potential attacker cannot predict whether the original or the variant of a program will execute. The creation of artificial software diversity, through transformations on the original code, and the randomization at the level of function calls are similar to our approach. Meanwhile, MEWE proposes two key novel contributions. First, using CROW, we can also diversify complex control flow structures by inferring constants or unrolling loops. Second, MEWE interconnects hundreds of variants and several randomization dispatchers in a single binary, increasing by orders of magnitude the runtime uncertainty about what code will actually run.

Executing two or more functionally equivalent variants in parallel makes it possible to detect inconsistencies at runtime. If one of the variants is compromised due to a potential attack, the system can detect it and halt or report the error [26], [27]. Voulimeneas et al. recently proposed a multi-variant execution system that parallelizes the execution of the variants on different machines [28]; their usage of heterogeneous ISAs also increases the diversity between variants. MEWE randomly selects the variants at runtime, increasing the unpredictability of execution behavior. The detection of anomalies by executing variants in parallel is done after the potential attack is performed, while MEWE is a preemptive technique.

Narayan et al. [9] remarked that the security model of WebAssembly for edge-cloud computing platforms is still vulnerable to Spectre attacks. This means that WebAssembly endpoint sandboxes can be hijacked to leak memory. They proposed to modify the Lucet compiler used by Fastly to incorporate fence instructions in the machine code generation, trying to avoid speculative mistraining. Johnson et al. [29], on the other hand, proposed a static SFI checking of the WebAssembly binaries before they could be submitted to the edge-cloud platforms. MEWE is a preemptive technique; it is not meant to tackle known vulnerabilities. Besides, MEWE is agnostic to the last-step compiler that translates Wasm to machine code, which means that the multivariant binaries can be deployed on any edge-cloud platform that can receive WebAssembly endpoints.

There is significant previous work in the space of moving target defenses. Taguinod et al. [30] and Christodorescu et al. [31] proposed to diversify different layers of web applications by using preexisting diversity. On the same machine, they constantly switch the application's logic between SQL database engines and web servers implemented in different languages such as PHP, Python, Java and Ruby. Similarly, Roy et al. [32] harnessed diverse machine learning algorithms for the same classification task at the Edge, creating a moving target defense against adversarial learning attacks. These works try to break the reconnaissance stages of potential attackers. The usage of preexisting variants limits the diversification space. MEWE creates artificial diversity instead of using preexisting variants. MEWE is agnostic to the platform where the service is deployed and no manpower is needed to implement a rotational stack system. Besides, MEWE operates at the scale of a worldwide edge-cloud computing platform.

Different layers of software stacks can be diversified [12]. In particular for dynamic platform techniques, Holland et al. [33], Okhravi et al. [34] and Wang et al. [35] proposed to use preexisting diversity to achieve implicit protection for MTD. They proposed to use different virtual machine configurations to create a diverse set of ISAs. Besides, Caballero et al. [36] also proposed to use preexisting diversity over TIER-1 ISP layers to prevent monoculture over network router implementations. Edge-cloud computing platforms are heterogeneous architectures by definition; therefore, they have an implicit protection. However, as previous works remarked, this does not yet protect against application-level attacks. MEWE is the counterpart: it provides runtime diversification and a practical MTD for edge-cloud platforms, at the software level.

VII. DISCUSSION

Specialising, optimizing, improving performance: In section IV we validated the key features of MEWE: automatically generating multivariant binaries, which exhibit random execution paths at runtime. Several aspects of these procedures can be optimized. For example, the generated code can be optimized by inlining function variants in the dispatchers. This minimal change will decrease the number of function calls. In addition, the number of times a dispatcher is called can be bounded. As discussed in RQ4 and RQ5, the dispatchers are massively invoked at runtime, which is great for randomization, but also a challenge with respect to the execution time of the services.

Fuzzing and security: The diversification created by MEWE can unleash hidden behaviors in compilers like Lucet. One "holy grail" in the fuzzing of compilers is the ability to reach later stages in the machine code generation pipeline. By generating several functionally equivalent, and yet different variants, deeper bugs can be discovered. During the writing of this work, an error was discovered during the execution of one of the variants provided by MEWE. Fastly acknowledged our work as part of a technical blog post that describes the bug and the patch. The link to the post will be provided with the deanonymized version of this paper.

MEWE is platform agnostic: Several components of MEWE are implemented to operate at the level of the LLVM intermediate language. These components are compatible with other LLVM frontends and backends. For example, Cloudflare supports the deployment of workers that are already written in

12

WebAssembly [4]. With little engineering effort, MEWE canbe extended to it, by adding support to the Haskell LLVMfrontend. Firefox delegates the implementation of some in-ternal libraries to WebAssembly. Consequently, MEWE couldrandomize execution paths in millions of browser clients.WebAssembly to WebAssembly solution: E. Wen et al. [37]and Kishore et al. [38] created a LLVM frontend for Wasm.This means that WebAssembly binaries can be passed as inputsto LLVM. Leveraging this technique, we can extend MEWEto create multivariants from preexisting Wasm binaries.
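To make the performance discussion of this section concrete, the following minimal sketch illustrates a dispatcher whose randomization is bounded. It is an illustration only, not MEWE’s implementation: MEWE generates its dispatchers at the LLVM level, whereas this sketch uses Python, and the names make_dispatcher, budget and the three double_* variants are hypothetical. The policy of reusing the last selected variant once the budget is exhausted is one possible way to bound dispatcher work, assumed here for the sake of the example.

```python
import random
from typing import Callable, List, Sequence

Variant = Callable[[int], int]


def make_dispatcher(variants: Sequence[Variant], budget: int,
                    rng: random.Random, trace: List[str]) -> Variant:
    """Build a dispatcher over functionally equivalent variants.

    The first `budget` calls each pick a variant at random; later calls
    reuse the last pick, which bounds the randomization overhead.
    """
    if not variants:
        raise ValueError("at least one variant is required")
    state = {"remaining": budget, "current": variants[0]}

    def dispatch(x: int) -> int:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            state["current"] = rng.choice(variants)
        trace.append(state["current"].__name__)  # record the executed variant
        return state["current"](x)

    return dispatch


# Three hypothetical, functionally equivalent variants of "multiply by two".
def double_mul(x: int) -> int:
    return x * 2


def double_add(x: int) -> int:
    return x + x


def double_shift(x: int) -> int:
    return x << 1


trace: List[str] = []
dispatch = make_dispatcher([double_mul, double_add, double_shift],
                           budget=4, rng=random.Random(0), trace=trace)
results = [dispatch(i) for i in range(10)]
```

Every call returns the same result regardless of the selected variant, while the recorded trace differs from one run to the next when the seed changes. A larger budget widens the space of execution paths at the price of more random selections, which is the trade-off discussed above.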

VIII. CONCLUSION

In this work we propose a novel technique to automatically synthesize multivariant binaries to be deployed on edge computing platforms. Our tool, MEWE, operates on a single service implemented as a WebAssembly binary. It automatically generates functionally equivalent variants for each function that implements the service, and combines all the variants in a single WebAssembly binary, whose exact execution path is randomized at runtime, drastically increasing the number of possible execution paths. Our evaluation with 7 real-world cryptography and QR encoding services shows that MEWE can generate hundreds of function variants and combine them into binaries that include from thousands to millions of possible execution paths. The deployment and execution of the multivariant binaries on the Fastly cloud platform showed that they actually exhibit a very high diversity of execution at runtime.

Future work with MEWE will focus on performance. A key challenge here is to establish a trade-off between a large space for execution path randomization and the computation cost of large-scale runtime randomization. In addition, the synthesis of a large pool of variants supports the exploration of the concurrent execution of multiple variants to detect misbehaviors in services deployed at the edge.

REFERENCES

[1] S. Choy, B. Wong, G. Simon, and C. Rosenberg, “A hybrid edge-cloud architecture for reducing on-demand gaming latency,” Multimedia systems, vol. 20, no. 5, pp. 503–519, 2014.

[2] T. Taleb, K. Samdanis, B. Mada, H. Flinck, S. Dutta, and D. Sabella, “On multi-access edge computing: A survey of the emerging 5g network edge cloud architecture and orchestration,” IEEE Comm. Surveys & Tutorials, vol. 19, no. 3, 2017.

[3] “The New York Times on failure, risk, and prepping for the 2016 US presidential election – Fastly.” [Online]. Available: https://www.fastly.com/blog/new-york-times-on-failure-risk-and-prepping-2016-us-presidential-election

[4] K. Varda, “Webassembly on cloudflare workers,” Tech. Rep., Jan. 2018. [Online]. Available: https://blog.cloudflare.com/webassembly-on-cloudflare-workers/

[5] P. Hickey, “Announcing lucet: Fastly’s native webassembly compiler and runtime,” Tech. Rep., Mar. 2018. [Online]. Available: https://www.fastly.com/blog/announcing-lucet-fastly-native-webassembly-compiler-runtime

[6] A. Haas, A. Rossberg, D. L. Schuff, B. L. Titzer, M. Holman, D. Gohman, L. Wagner, A. Zakai, and J. Bastien, “Bringing the web up to speed with WebAssembly,” in Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2017, pp. 185–200.

[7] D. Bryant, “Webassembly outside the browser: A new foundation for pervasive computing,” in Proc. of ICWE 2020, 2020, pp. 9–12.

[8] P. Mendki, “Evaluating webassembly enabled serverless approach for edge computing,” in 2020 IEEE Cloud Summit, 2020, pp. 161–166.

[9] S. Narayan, C. Disselkoen, D. Moghimi, S. Cauligi, E. Johnson, Z. Gang, A. Vahldiek-Oberwagner, R. Sahita, H. Shacham, D. Tullsen et al., “Swivel: Hardening webassembly against spectre,” in USENIX Security Symposium, 2021.

[10] D. Lehmann, J. Kinder, and M. Pradel, “Everything old is new again: Binary security of WebAssembly,” in 29th USENIX Security Symposium (USENIX Security 20). USENIX Association, Aug. 2020.

[11] “Global CDN Disruption.” [Online]. Available: https://status.fastly.com/incidents/vpk0ssybt3bj

[12] H. Okhravi, T. Hobson, D. Bigelow, and W. Streilein, “Finding focus in the blur of moving-target techniques,” Security & Privacy, IEEE, vol. 12, pp. 16–26, 03 2014.

[13] M. Jacobsson and J. Wahslen, “Virtual machine execution for wearables based on webassembly,” in EAI International Conference on Body Area Networks. Springer, Cham, 2018, pp. 381–389.

[14] S. Shillaker and P. Pietzuch, “Faasm: Lightweight isolation for efficient stateful serverless computing,” in USENIX Annual Technical Conference, 2020, pp. 419–433.

[15] “The LLVM Compiler Infrastructure.” [Online]. Available: https://llvm.org/

[16] “Binaryen.” [Online]. Available: https://github.com/WebAssembly/binaryen

[17] J. Cabrera-Arteaga, O. F. Malivitsis, O. Vera-Perez, B. Baudry, and M. Monperrus, “Crow: Code diversification for webassembly,” in MADWeb, NDSS 2021, 2021.

[18] B. G. Ryder, “Constructing the call graph of a program,” IEEE Transactions on Software Engineering, no. 3, pp. 216–226, 1979.

[19] “Webassembly system interface.” [Online]. Available: https://github.com/WebAssembly/WASI

[20] H. B. Mann and D. R. Whitney, “On a test of whether one of two random variables is stochastically larger than the other,” Ann. Math. Statist., vol. 18, no. 1, pp. 50–60, 03 1947.

[21] “The power of serverless, 72 times over.” [Online]. Available: https://www.fastly.com/blog/the-power-of-serverless-at-the-edge

[22] “Markdown to HTML.” [Online]. Available: https://markdown-converter.edgecompute.app/

[23] S. Forrest, A. Somayaji, and D. H. Ackley, “Building diverse computer systems,” in Proceedings. The Sixth Workshop on Hot Topics in Operating Systems. IEEE, 1997, pp. 67–72.

[24] F. B. Cohen, “Operating system protection through program evolution.” Computers & Security, vol. 12, no. 6, pp. 565–584, 1993.

[25] L. Davi, C. Liebchen, A.-R. Sadeghi, K. Z. Snow, and F. Monrose, “Isomeron: Code randomization resilient to (just-in-time) return-oriented programming.” in NDSS, 2015.

[26] T. Jackson, C. Wimmer, and M. Franz, “Multi-variant program execution for vulnerability detection and analysis,” in Proceedings of the Sixth Annual Workshop on Cyber Security and Information Intelligence Research, 2010, pp. 1–4.

[27] P. Hosek and C. Cadar, “Varan the unbelievable: An efficient n-version execution framework,” ACM SIGARCH Computer Architecture News, vol. 43, no. 1, pp. 339–353, 2015.

[28] A. Voulimeneas, D. Song, P. Larsen, M. Franz, and S. Volckaert, “dmvx: Secure and efficient multi-variant execution in a distributed setting,” in Proceedings of the 14th European Workshop on Systems Security, 2021, pp. 41–47.

[29] E. Johnson, D. Thien, Y. Alhessi, S. Narayan, F. Brown, S. Lerner, T. McMullen, S. Savage, and D. Stefan, “Sfi safety for native-compiled wasm,” in NDSS. Internet Society, 2021.

[30] M. Taguinod, A. Doupe, Z. Zhao, and G.-J. Ahn, “Toward a moving target defense for web applications,” in 2015 IEEE International Conference on Information Reuse and Integration, 2015, pp. 510–517.

[31] M. Christodorescu, M. Fredrikson, S. Jha, and J. Giffin, “End-to-end software diversification of internet services,” in Moving Target Defense. Springer, 2011, pp. 117–130.

[32] A. Roy, A. Chhabra, C. A. Kamhoua, and P. Mohapatra, “A moving target defense against adversarial machine learning,” in Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, 2019, pp. 383–388.

[33] D. A. Holland, A. T. Lim, and M. I. Seltzer, “An architecture a day keeps the hacker away,” SIGARCH Comput. Archit. News, vol. 33, no. 1, 2005.

[34] H. Okhravi, A. Comella, E. Robinson, and J. Haines, “Creating a cyber moving target for critical infrastructure applications using platform diversity,” International Journal of Critical Infrastructure Protection, vol. 5, no. 1, pp. 30–39, 2012.

[35] X. Wang, S. Yeoh, R. Lyerly, P. Olivier, S.-H. Kim, and B. Ravindran, “A framework for software diversification with ISA heterogeneity,” in 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020), 2020, pp. 427–442.

[36] J. Caballero, T. Kampouris, D. Song, and J. Wang, “Would diversity really increase the robustness of the routing infrastructure against software defects?” in Proceedings of the Network and Distributed System Security Symposium, NDSS 2008, San Diego, California, USA, 10th February - 13th February 2008. The Internet Society, 2008.

[37] E. Wen and G. Weber, “Wasmachine: Bring the edge up to speed with a webassembly os,” in 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 2020, pp. 353–360.

[38] P. K. Gadepalli, S. McBride, G. Peach, L. Cherkasova, and G. Parmer, “Sledge: A serverless-first, light-weight wasm runtime for the edge,” in Proceedings of the 21st International Middleware Conference, 2020, pp. 265–279.
