
Virtual Machine Monitor-Based Lightweight Intrusion Detection

Fatemeh Azmandian, Northeastern University

[email protected]

Micha Moffie, Northeastern University

[email protected]

Malak Alshawabkeh, Northeastern University

[email protected]

Jennifer Dy, Northeastern University, [email protected]

Javed Aslam, Northeastern University, [email protected]

David Kaeli, Northeastern University, [email protected]

ABSTRACT

As virtualization technology gains in popularity, so do attempts to compromise the security and integrity of virtualized computing resources. Anti-virus software and firewall programs are typically deployed in the guest virtual machine to detect malicious software. These security measures are effective in detecting known malware, but do little to protect against new variants of intrusions. Intrusion detection systems (IDSs) can be used to detect malicious behavior. Most intrusion detection systems for virtual execution environments track behavior at the application or operating system level, using virtualization as a means to isolate themselves from a compromised virtual machine.

In this paper, we present a novel approach to intrusion detection of virtual server environments which utilizes only information available from the perspective of the virtual machine monitor (VMM). Such an IDS can harness the ability of the VMM to isolate and manage several virtual machines (VMs), making it possible to provide monitoring of intrusions at a common level across VMs. It also offers unique advantages over recent advances in intrusion detection for virtual machine environments. By working purely at the VMM-level, the IDS does not depend on structures or abstractions visible to the OS (e.g., file systems), which are susceptible to attacks and can be modified by malware to contain corrupted information (e.g., the Windows registry). In addition, being situated within the VMM provides ease of deployment, as the IDS is not tied to a specific OS and can be deployed transparently below different operating systems.

Due to the semantic gap between the information available to the VMM and the actual application behavior, we employ the power of data mining techniques to extract useful nuggets of knowledge from the raw, low-level architectural data. We show in this paper that by working entirely at the VMM-level, we are able to capture enough information to characterize normal executions and identify the presence of abnormal malicious behavior. Our experiments on over 300 real-world malware and exploits illustrate that there is sufficient information embedded within the VMM-level data to allow accurate detection of malicious attacks, with an acceptable false alarm rate.

Categories and Subject Descriptors

D.4.6 [Security and Protection]

General Terms

Virtualization, Security, Data Mining, Intrusion Detection

Keywords

Virtual Machine, Virtual Machine Monitor, Intrusion Detection System, Data Mining

1. INTRODUCTION

Virtual execution environments provide many advantages over traditional computing environments, such as server consolidation, increased reliability and availability, and enhanced security through isolation of virtual machines (VMs) [29]. Anti-virus programs and firewalls can guard a system against known exploits, but these mechanisms provide little protection against new classes of attacks and insider threats. Virtualization provides us with the ability to isolate and inspect VM-based execution. Virtual machines themselves, however, are not completely immune to viruses and malicious attacks. To protect the guest OS running inside a virtual machine and guard against the presence of malicious software, or malware, there needs to be an intrusion detection system (IDS) in place.

Traditionally, an IDS can be categorized as one of two types: a host-based intrusion detection system (HIDS) or a network-based intrusion detection system (NIDS). An HIDS resides on the system that is being monitored and thus has the advantage of a rich view of the internal workings of the system. The disadvantage of this approach is that malware can determine the existence of the HIDS and subsequently compromise it or attempt to evade detection. An NIDS, on the other hand, performs intrusion detection from outside the target system, using information from the network flow. This makes it more resistant to attacks and evasion, but at the cost of poor visibility into the system.


In a virtualized execution environment, the virtual machine monitor (VMM) is a software layer that allows the multiplexing of the underlying physical machine between different virtual machines, each running its own operating system. In this paper we propose a VMM-based IDS, a variant of host-based intrusion detection systems wherein the IDS resides on the physical host machine, yet remains outside of the virtual machine being monitored. As such, a VMM IDS is able to enjoy the advantages offered by both HIDSs and NIDSs: a rich view of the target system (the VM) combined with a greater resistance to attacks and evasion by the malware. The latter is one of the benefits of the isolation provided by the VMM.

The VMM IDS only uses information available at the VMM-level to detect intrusions. There exists a large semantic gap between this low-level architectural data and the actual program behavior. Consequently, we utilize sophisticated data mining algorithms to extract meaningful and useful information to distinguish normal (non-malicious) from abnormal (malicious) behavior.

There are two main approaches to intrusion detection: misuse detection and anomaly detection. In misuse detection, the behavior of the system is compared to patterns of known malicious behavior, or attack signatures. A weakness of this approach is its inability to detect new and previously unseen attacks, known as zero-day attacks. In anomaly detection, a profile of normal behavior is built and any deviation from this normal profile is flagged as a potential attack. While anomaly detection has the ability to detect zero-day attacks, it is also prone to false alarms, i.e., previously unseen normal behavior may incorrectly be identified as an attack. As virtualization and the information available to the VMM facilitate the profiling of normal behavior, our VMM IDS takes the second approach to intrusion detection. We use system events visible to the VMM and incorporate data mining algorithms to help characterize normal execution patterns and distinguish deviating anomalous behavior, while trying to balance the trade-off between true detections and false alarms.

A key advantage that a pure VMM-level IDS provides is ease of deployment. Only the VMM needs to be modified to extract low-level architectural events during runtime; this ties the IDS to a particular VMM and instruction set architecture (ISA), but no modification to the operating system is required. Hence, it can be deployed in any virtualized computing environment with minimal effort. In our work, we focus on virtualized server applications [38]. These applications are combined with a customized commodity operating system to run optimally in a virtual environment. As there are no login operations and typical execution consists of one main process running alongside background processes, we expect the normal behavior of these workloads to be fairly stable in time and space. Our IDS uses data mining algorithms to characterize the normal behavior of the workload. A malicious attack would introduce deviations from the normal behavior, which should be identified by the data mining algorithms and flagged by the IDS. Along these lines, a VMM IDS has the advantage of being able to detect zero-day attacks, in addition to previously known malware.

As part of our contributions, we have implemented a prototype of a pure VMM-level intrusion detection system using VirtualBox [15], an open-source full-virtualization VMM. To the best of our knowledge, this is the first work to utilize only the low-level architectural information visible to the VMM for detecting the existence of malware. Our IDS consists of two key components:

• A front-end, whose duties include:

– Event Extraction - Capturing the low-level architectural data available to the VMM, such as disk and network IO accesses, page faults, translation look-aside buffer (TLB) flushes, and control register updates.

– Feature Construction - Using statistical techniques to transform the raw data into features, which are used by the data mining algorithms.

• A back-end, whose duties include:

– Feature Reduction - Reducing the large space of possible features, which improves both the time complexity and the accuracy of the data mining algorithms.

– Normal Model Creation - Profiling the normal execution of the workloads and building a model of normal behavior.

– Anomaly Detection - Identifying anomalous behavior as deviations from the model of normal behavior.

– Raising an Alarm - Flagging behavior that deviates from the norm as a possible threat.

[Figure omitted: block diagram of the IDS pipeline, showing the front-end and back-end components in the calibration and testing phases]

Figure 1: High-level design of our VMM IDS

A high-level overview of our VMM IDS design is presented in Figure 1. There are two main phases of VMM IDS operation: a calibration phase and a testing phase. In the calibration phase, the front-end extracts the VMM-level events and constructs all the possible features (using methods described in section 3.2). These features are passed on to the back-end where feature reduction takes place. The reduced set of features is provided to the data mining algorithms to build a model of normal execution behavior. Next, anomaly detection is performed on a set of both normal and abnormal data points, assigning a score to each based on how much it deviates from the normal model. The scores are then passed through a filter to remove noise and determine when to raise an alarm.[1] During the calibration phase, we evaluate the true detection and false alarm accuracy of our IDS to select an optimal set of features and filter configuration.

In the testing phase of the IDS, once the VMM-level events are extracted, only the reduced set of features is constructed. Using the model and IDS configuration from the previous phase, anomaly detection is performed on a previously unseen set of normal and abnormal data points. The scores assigned to them are passed through the filter to determine an appropriate time to raise an alarm.

To examine the effectiveness of our VMM IDS in detecting real-world attacks, we evaluated the IDS on several different server workloads, injecting more than 300 malware samples obtained from a repository of real attacks. It is important that the IDS not only detect the malware, but do so within a reasonable amount of time. To this end, we present both the accuracy of the IDS (in terms of true detections and false alarms) and the time-to-detection results. We show that on average, we are able to correctly detect about 93% of the malicious attacks within about 20 seconds of the start of the attack, at a cost of only 3% false alarms.

The remainder of the paper is organized as follows. In section 2, we present a revised IDS taxonomy and use it to classify the current state of the art in IDSs. In section 3, we describe the front-end of our VMM IDS, including the information we are able to extract from the VMM and how it is used to build features. In section 4, we review the approach taken by our back-end to best learn the normal behavior and identify malware. In section 5, we evaluate our VMM-based IDS in terms of its detection and false alarm rate, as well as its ability to detect intrusions in a timely manner. We discuss several aspects of our work in section 6. Finally, we conclude the paper and present directions for future work in section 7.

2. RELATED WORK

Much work has been done in the area of host-based IDSs. We organize our discussion here according to the information, or semantics, utilized by the IDS:

1. Program-level IDS – An IDS that uses information available at the program/application abstraction level. This includes source code, static or dynamic information flow, and application execution state.

2. OS-level IDS – An IDS that utilizes information available at the OS level, such as system calls and system state.

3. VMM-level IDS – An IDS that uses semantics and information available at the VMM-level. This includes architectural information.

[1] When the alarm is raised, we assert that malware has been found.

A related characterization of IDSs can be found in the work done by Gao et al. [10]. They use the terms white box, gray box, and black box to refer to the class of information available to the IDS. Black box systems use only system call information, white box systems include all available information, including high-level program source or binary analysis, and gray box systems lie in between.

In our classification criteria, we consider a broader range of semantics available to the IDS. Program-level IDSs use information similar to that available in white or gray box systems. OS-level IDSs can use all system-level information available, including (but not limited to) system calls (i.e., a black box system). VMM IDSs extend the characterization even further to include VMM-level information. We use this classification to contrast and compare current IDSs in the next sections, and to highlight the novelty of our own work.

2.1 Program-Level IDS

Wagner et al. [47] show how static analysis can be used to thwart attacks that change the run-time behavior of a program. They build a static model of the expected behavior (using system calls, the call graph, etc.) and compare it to the runtime program behavior. In the work done by Kirda et al. [22], both static and dynamic analysis (including information leakage) are used to determine if behavior is malicious.

There have been a number of information flow tracking systems that fall into this category. These systems include static [7, 31] and dynamic [45, 33, 41] data flow analysis to extract program-level information available to the application only.

2.2 Operating System-Level IDS

System calls have been used extensively to distinguish normal from abnormal behavior. One example is the work done by Kosoresow et al. [23]. In this work, they use system call traces to find repeated system calls and common patterns, and store them in an efficient deterministic finite automaton. Then, during execution, they compare and verify that all system call traces have been seen before.

Many other intrusion detection systems have used system call profiles to successfully detect malicious code [14, 40, 49]. System call tracing can be done very efficiently and can provide much insight into program activities.

Stolfo et al. [42] use Windows registry accesses to detect anomalous behavior. The underlying idea is that while registry activity is regular in time and space, attacks tend to launch programs never launched before and change keys not modified since OS installation.

A disk-based IDS is presented in [12]. This IDS monitors data accesses, metadata accesses, and access patterns when looking for suspicious behavior. It uses semantics available at the OS level – it is able to read and interpret on-disk structures used by the file system.

In the work by Oliveira et al. [6], a virtual machine is used to provide recovery from zero-day control-flow hijacking attacks. The attack detection mechanism involves augmenting every 32-bit word of memory and every general-purpose register with an integrity bit, used to determine when a vulnerability is being exploited.

2.3 Virtual Machine Monitor-Level IDS

To clarify the novelty of our work and distinguish it from previous approaches that incorporate the VMM for intrusion detection, we define two classes of VMM-level intrusion detection systems:

• Hybrid VMM/OS IDS

• Pure VMM IDS

Hybrid VMM/OS intrusion detection systems utilize the VMM as a means to isolate and secure the IDS. However, they rely on OS-level information and therefore are not pure VMM IDSs. Pure VMM intrusion detection systems, on the other hand, only use semantics visible to the VMM to perform intrusion detection. This limits the amount of information available to the IDS and poses a greater challenge. Chen et al. [5] allude to this difficult task as they acknowledge the importance of bridging the semantic gap between virtual machine events and operating system events. The work presented in this paper is an example of a pure VMM-level IDS that uses data mining techniques as a powerful tool to bridge this semantic gap and make the most of the limited information available to the VMM.

Work on hybrid VMM/OS intrusion detection systems includes the efforts of Laureano et al. [24]. In their work, a VM is used to isolate and secure the IDS outside the guest OS. The guest OS is a User Mode Linux [44] that is modified to extract system calls. Then, system call sequence analysis is used to perform anomaly detection. Zhang et al. [50] use a Xen [3] VMM to intercept sequences of system calls that are analyzed to detect intrusions.

In the work of Jin et al. [19], a privileged VM is set up to perform intrusion detection in a centralized manner for a distributed virtual computing environment. They use iptables [37] for the firewall and SNORT [1], a network-based IDS, for the intrusion detection.

Garfinkel et al. present a Virtual Machine Introspection architecture [11] which is used to create a set of policies for intrusion detection. They use a special OS interface library (built on top of a Linux crash dump examination tool) to access information available at the OS level. Similarly, the VMwatcher system uses a VMM to export the guest OS raw disk and memory to a separate VM. A driver and symbols are used to compare memory views to detect rootkits and run anti-virus software on the (exported) disk [18].

The work done by Jones et al. [20, 21] takes an approach similar to ours. They use only VMM-level semantics and information to develop a service specifically aimed at identifying and detecting hidden processes in virtual machines. So while they utilize information available in the VMM, the scope of their IDS is focused on hidden processes. Malware may create new processes (not hidden) or attach itself to existing processes, thus eluding detection. In our work, we develop a generic IDS able to detect a broader class of intrusions.

Ether, a project developed by Dinaburg et al., is a transparent malware analyzer [8]. Ether uses hardware virtualization extensions to extract information from the guest to analyze malware behavior. In our work, we are able to extract similar information and use it to identify malicious activity.

2.4 IDS Comparison

In Table 1, we present our view of the trade-offs associated with the different IDS types according to the semantics available to them.

                       Program IDS   OS and Hybrid IDS   Pure VMM IDS
Semantics              High          Medium              Low
Applicability          High          High                Medium
Performance            Medium        High                High
Ease of deployment     Low           Medium              High
Attack Resistance      Low           Medium              High

Table 1: IDS level comparison

We identify the following trends in IDS design. First, the more program-level semantics there are available to the IDS, the more accurately it is able to identify and classify the malware. Using fewer semantics (as in the case of a VMM-level IDS) can limit the effectiveness of an IDS and may restrict the guest environments that can be effectively secured. In this sense, the applicability of a VMM IDS to more general computing environments is lower than that of IDSs which have more information available to them.

Second, more semantics may also introduce more overhead (i.e., impact performance), as extracting this information (if available) can be costly (e.g., in information flow tracking systems). A VMM IDS only requires that information be extracted from the VMM layer and therefore results in significantly less overhead. Since the VMM is the only element that needs to be modified to extract this information, it also has the advantage of greater ease of deployment. In most VMMs, a majority of the information is already available through standard profiling interfaces (e.g., vCenter Server in VMware's ESX).

Third, extracting program-level semantics may be done by modifying or monitoring the execution of the application. Frequently, this is done at runtime in the application address space (e.g., using binary instrumentation). While these techniques are able to protect the IDS from the application they monitor, they are still vulnerable to malware running at a higher privilege level. For example, a rootkit may thwart both application-level and OS-level IDSs. In contrast, malware would first have to detect that an application or operating system is virtualized before a successful corruption could be launched against a VMM-level IDS. Hence, a VMM-level IDS has a higher resistance to a malicious attack.

2.5 Data Mining for Security

Related work has also been done in the area of applying data mining and machine learning techniques to ensure the security of computing resources. In [39], Schultz et al. use data mining techniques to detect the existence of new malicious executables. They use the static properties of an executable to generate features and apply an inductive rule learner, a Naive Bayes classifier, and an ensemble classifier (constructed from several Naive Bayes classifiers) to distinguish malicious executables. Their results show that these techniques outperform a signature-based scanner. In order to discriminate between benign executables and viruses, Wang et al. [48] statically extract dynamically linked libraries and application programming interfaces, utilizing Support Vector Machines for feature extraction, training, and classification.

In the work of Lee et al. [26, 27], data mining techniques have been applied to system calls and network data to develop intrusion detection models. Their methods include classification, meta-learning, association rules, and frequent episodes. In [13], clustering is applied to network traffic with the goal of detecting botnets. Their detection framework clusters together similar communication traffic and similar malicious traffic, and performs cross-cluster correlation to identify hosts that share both similar communication patterns and similar malicious activity patterns.

The Local Outlier Factor (LOF) and the K-Nearest Neighbor (KNN) algorithms have also been applied in host-based and network intrusion detection systems, as we shall describe in section 4.2. In our work, we apply these data mining techniques to low-level architectural data extracted from the VMM layer in order to protect software appliances and servers in a virtualization setting.

Next, we describe the framework for our VMM IDS in more detail.

3. VMM IDS FRONT-END

The front-end of the VMM IDS has the responsibility of extracting low-level architectural events and subsequently transforming them into features used by the data mining algorithms. Next, we describe the VMM-level events that are captured.

3.1 VMM-Level Event Extraction

We use the term events to describe the raw data and information extracted from the VMM during execution. The information we can extract differs depending on the specific VMM implementation; this can affect the effectiveness and accuracy of the IDS. The success of a VMM-based IDS hinges on its ability to take the extracted events and piece together an accurate picture of the system behavior at the application level. In particular, it must be able to detect the change in behavior caused by the execution of the malware.

In our work, we target similar VMMs (in terms of performance, target architecture, etc.) such as VMware Workstation [46], VirtualBox [15], ESX Server [46], and Xen [3]. A subset of events can be found in all of them. These are architectural events that the VMM must intercept to guarantee correct execution. They include the execution of privileged instructions, access to shared resources (memory), and IO (disk, network, devices). We rely on this common set of events to create a robust VMM-based IDS.

The VMM lies below the guest OS layer and provides the illusion that the OS is running on a real machine. Hence, every time the OS needs to interact with the hardware, the VMM must intervene. It is through this intervention that we are able to collect events. We instrument the VMM to log the occurrence of an event, as well as any available information relevant to the event. For example, when a disk IO event occurs, the VMM intercepts the request and we record the disk sector accessed, the number of bytes accessed, and the read/write status.
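To make the logging concrete, the following is a minimal sketch of the kind of record such instrumentation could emit for a disk IO event. The field names and the `on_disk_io` hook are illustrative assumptions for this paper's description, not VirtualBox's actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class DiskIOEvent:
    """Illustrative record for one disk IO request intercepted by the VMM."""
    virtual_time: float   # guest virtual time at which the event occurred
    sector: int           # disk sector accessed
    num_bytes: int        # number of bytes transferred
    is_write: bool        # True for a write, False for a read

def on_disk_io(event_log, virtual_time, sector, num_bytes, is_write):
    # Hypothetical hook invoked from the VMM's disk-interception path.
    event_log.append(DiskIOEvent(virtual_time, sector, num_bytes, is_write))
```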

The data available to the VMM dictate the types of events that we are able to monitor. The richness of the events, in terms of the wealth of information they provide to the VMM, is critical to the success of a VMM-based IDS. Consequently, a vast array of events must be collected in order to reconstruct a more accurate picture of what occurs at the application level. Events can be categorized into two main types:

1. Virtual or VM events - architectural-level and system events related to the virtualized guest OS executing inside the VM. For example, a guest modifying control registers, flushing the TLB, or writing to the disk.

2. VMM events - these events are extracted from the VMM and relate to the state of the VMM itself.[2]

For example, VirtualBox has two internal modes: one to execute the guest OS directly on the CPU without intervention (user mode instructions) and another to intercept, instrument, or emulate supervisor mode instructions.[3]

For some events, in addition to determining when they occur, we can also extract useful information about the events. For example, during a disk IO event, we can determine information such as the disk sector, the number of bytes accessed, and whether the disk access was a read or write event.[4] A summary of the Virtual and VMM events that we extract is provided in Table 2 and Table 3, respectively. In the tables, we also list additional useful information that can be extracted for the events. The events we are interested in gathering should be able to hint at the underlying behavior of the system, to allow us to distinguish changes in the system behavior. These high-level semantics of the events are provided in the last column of the tables.

3.2 Feature Construction

We use the term features to describe the information and input format provided to the back-end. The back-end is sent processed information that can be the result of filtering, aggregating, or correlating events.

[2] Naturally, the VMM state is influenced by the state of the guest OS and the interaction between the guest OS and the VMM.
[3] While we do not directly profile the code cache, due to VMM events that relate to recompilation (REM), we know when code is possibly inserted into the code cache.
[4] It is possible to extract even more information, such as the disk ID (pdisk) and whether the access was synchronous or asynchronous. We have not found these to be particularly useful, but they can be utilized in future work.


Virtual Event | Additional Information | High-Level Semantics
Disk IO | Disk sector, Number of bytes, Whether data was read or written | Disk read/write event
Network IO | Number of bytes, Whether data was sent or received | Network read/write event
Programmable Interrupt Timer (PIT) or Real Time Clock (RTC) | – | Means by which the OS can track time
Page mode change | – | Hints[5] at the OS state[6]
Read/write events to control registers | Register number, Value, Whether the value was read from or written into the control register | Hints at the OS state[7]
Page faults | Error code, Virtual EIP register[8] | Hints at memory usage and application behavior[9]
TLB flush | Global flag, New CR3 register value | Occurs during a context switch
Invalidate page | Page to invalidate (linear address) | Hints at memory and application behavior
Local descriptor table (LDT), global descriptor table (GDT), and task state segment (TSS) access | Fault address (CR2 register) | Hints at application startup and context switches
CPUID | Leaf (from ring 0) | Request for the ID by the application
Current Privilege Level (CPL) | Privilege level, Whether in user mode or supervisor mode | Hints at the type of code running[10]
Load segment (descriptor) | Segment register, base, limit, selector, flags | Hints at application startup

Table 2: Summary of the Virtual events, the additional information extracted for the events, and what they mean for the OS

VMM Event | Additional Information | High-Level Semantics
Set guest trap handler | Trap number | Useful to identify the type of guest OS running, if needed
Reschedule execution mode (RAW, HWACC or REM) | Values of registers EIP, ESP, CS, SS, EFLAGS, CR0, and CR4 | Hints at the type[11] of code running inside the OS
Enter execution mode (RAW, HWACC or REM) | Depending on the execution mode, the information collected includes CS, EIP, ESP, V86/CPL, EFLAGS.IF, CR0, vmFlags, and the new mode (RAW, HWACC or REM) | Similar to above, hints at the guest OS behavior

Table 3: Summary of the VMM events, the additional information extracted for the events, and what they mean for the OS

Features do not contain all the information extracted from the events, but incorporate patterns that are effective in identifying normal and abnormal execution. Using features rather than raw events can help identify relevant execution patterns and behaviors.

[5] The term "hint" is used to imply the possible occurrence of a specific type of event.
[6] For example, when in real mode, there is no paging and the system is still booting up.
[7] For example, a write event to control register 3 (CR3) denotes a context switch.
[8] The Extended Instruction Pointer (EIP) register specifies the address that caused the page fault.
[9] A page fault can hint at the start of a new application.
[10] The type of code may be privileged OS code or the application code.
[11] The type of code could be application code, previously unseen OS code, or previously seen OS code.

The space of possible features is very large; there are many ways to process raw events to create features. All of the features we use in this work are constructed directly from the event stream in the following way: first, we divide the event stream into consecutive segments of equal (virtual) time. Next, we apply statistical methods (described next) on the events in the segment to produce feature-values for each feature. Finally, we send a window – containing all the feature-values in a segment – to our back-end.

The advantages of using time-based windows are two-fold. First, all windows represent approximately the same amount of execution time and are therefore comparable. Second, splitting the event stream into windows provides us with the ability to classify each window on its own, enabling on-line classification. The length of the window (the virtual time captured in the window) introduces a trade-off: longer windows capture more behavior, while shorter ones reduce the time-to-detection. We have found that windows of approximately two seconds provide a good balance between these trade-offs. This time quantum provides the back-end with sufficient information to make accurate classifications on a per-window basis, while also allowing it to identify any malicious activity within seconds of its execution.

In the next sections, we describe how we process events to create the features used in this work.

3.2.1 Rate Features

The first category of features we generate from events is created by keeping a running sum of an event. These rate features are constructed by simply counting the occurrences of events in each segment. In Figure 2, we show how a stream of events is processed to construct windows containing feature-values for rate features. First, on the left we show a list of events as they occur and how they are divided into segments (in this example, a segment is constructed on each timer event). Next, on the right we show (for each segment) rate feature-values for network IO and disk IO, contained in a window.

[Figure omitted: an example event stream (Seg Load, Disk IO, Net IO, PF, Timer, CR3 Write, InvPage, CR4 Read) divided into segments at each timer event, with the resulting per-window Disk IO and Network IO rate feature-values]

Figure 2: Constructing rate features

We build multiple rate features, such as page-fault count, control register modifications, and disk and network IO accesses. These features can be constructed efficiently and provide rich information about the execution behavior of the system.
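As an illustration only, the following is a minimal sketch of rate-feature construction: the event stream is split into fixed-length (virtual-time) segments and the occurrences of each event type are counted per segment. The `(timestamp, event_type)` representation of an event is an assumption made for this sketch, not the paper's actual data format.

```python
from collections import Counter

def rate_features(events, window_len=2.0):
    """events: (virtual_time, event_type) pairs sorted by time.
    Returns one Counter of per-event-type counts for each window of length window_len."""
    windows, current, boundary = [], Counter(), None
    for ts, etype in events:
        if boundary is None:
            boundary = ts + window_len          # first window starts at the first event
        while ts >= boundary:                   # close any windows that have ended
            windows.append(current)
            current, boundary = Counter(), boundary + window_len
        current[etype] += 1                     # running count = the rate feature value
    windows.append(current)
    return windows
```

For example, `rate_features([(0.1, "disk_io"), (0.5, "net_io"), (2.3, "disk_io")])` would yield one window with a disk IO and a network IO count, and a second window with a single disk IO count.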

3.2.2 Relationship Features

The next category of features is what we refer to as relationship features, as they capture the relationship between pairs of events. Relationship features are built with the intention of capturing information not covered by rate features. This may happen, for example, when different events have the same rate across vastly different windows. We are able to differentiate between these windows by accounting for the order in which particular events took place. For example, we are able to detect that a sequence of disk IO read events followed by a sequence of disk IO write events is different from a sequence of interleaved read and write disk IO events.

[Figure omitted: two pairs of example windows, each pair containing the same counts of events A and B but in different orders, e.g., a window of all A's followed by all B's versus a window of strictly alternating A's and B's]

Figure 3: Information captured by relationship features (event order)

In Figure 3, we present an example. We show two pairs of windows, each pair containing an equal number of events A and B, but in different orders. We generate features that are able to detect the difference between the top and bottom window in each pair, using statistical methods such as (1) maximum discrepancy, the largest gap between successive occurrences of events A and B within any sub-window, and (2) the Mann-Whitney test, a measure of how randomly interleaved events A and B are within the window. Another relationship feature counts the number of runs, i.e., the number of sub-windows containing a sequence of the same event. For the first pair of windows in Figure 3, this relationship feature produces a feature-value of 2 for Window 1.a and 16 for Window 1.b. For the second pair, the value of this relationship feature is 4 for Window 2.a and 7 for Window 2.b. Using relationship features, we can distinguish between the top and bottom windows in each pair. The significance of relationship features is reflected in the fact that they account for more than half of the features chosen by the feature selection algorithm. As we shall show in section 5.3, the information conveyed by relationship features allows the data mining algorithms to distinguish normal activity from malicious activity. An example relationship feature chosen for one of our workloads (Exchange.Light) measures the random interleaving between writes to the task state segment (TSS) and TLB flushes, both of which can be triggered by a task switch.
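A minimal sketch of one such relationship feature, the runs count described above: it counts the number of maximal sub-sequences of identical events within a window, so a window of eight A's followed by eight B's yields 2 while a strictly alternating window of the same events yields 16.

```python
def runs_count(events):
    """Number of maximal runs of identical events in a window.
    events: sequence of event labels, e.g. ['A', 'A', 'B', 'B'] -> 2 runs."""
    runs = 0
    previous = object()          # sentinel that matches no real label
    for e in events:
        if e != previous:        # a new run starts whenever the label changes
            runs += 1
            previous = e
    return runs

assert runs_count('AAAAAAAABBBBBBBB') == 2    # Window 1.a in Figure 3
assert runs_count('ABABABABABABABAB') == 16   # Window 1.b in Figure 3
```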

In order to fairly compare the feature values in the data mining context, we normalize them to have a mean of zero and a standard deviation of one. This standardization brings the features onto the same scale and is accomplished by subtracting the mean from each feature and dividing the result by the standard deviation. Next, we describe the steps performed in the back-end.

4. VMM IDS BACK-END

The back-end is responsible for reducing the large space of possible features by selecting a subset of them. These features are then passed on to the data mining algorithms to build a model of the normal behavior of the workloads. Once this model is established, anomaly detection is performed on new windows to determine whether they deviate enough from the normal model to warrant raising an alarm. In the following sections, we delve into the details of these steps.

4.1 Feature Reduction

As previously mentioned, the space of possible features is very large. In order to perform accurate anomaly detection, we must select the most useful features to help differentiate normal from abnormal behavior. Along these lines, we reduce the feature space by performing feature subset selection, in which a subset of the original set of features is chosen. The search for the best features is guided by how well the features do when provided to the data mining algorithm. This gives us the ability to understand which features, and consequently which VMM-level events, are significant in modeling the behavior of normal workload execution and distinguishing irregular, malicious activity. There are also benefits to having a small number of features, which include reducing the computational complexity of the data mining algorithm, as well as improving its accuracy by removing redundant and irrelevant features.

The feature subset selection algorithm we apply is the Sequential Floating Forward Selection (SFFS) algorithm [36], also known as Floating Forward Search. This is a simple yet effective algorithm that has been shown to yield results that are very close to those obtained through an exhaustive search of all possible subsets of the feature space. At each step, the SFFS algorithm keeps a list of the currently selected features, along with a measure of how well those features do when evaluated on the data mining algorithm. In section 5, we describe the criterion we use in this work.

Initially, the algorithm begins with the empty set of features. Each feature is then evaluated by itself, and the one with the highest value of the criterion function is added to the list of features. This forward step is repeated to find the next best feature which, when added, provides the highest value of the criterion function and an improvement over the value at the previous step. Next, another forward step is taken so that the current list contains three features. The algorithm then tries to remove each of the features individually to see if doing so results in an improvement in the criterion value for two features. If so, the backward step is taken and the feature is removed. Otherwise, another forward step is attempted. This process is iterated until neither a backward nor a forward step results in an improvement in the evaluation criterion.
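A compact, simplified sketch of the floating search loop follows. It assumes a `criterion(feature_subset)` callback that evaluates a candidate subset on the data mining algorithm (in our setting, the ROC-based score described in section 5) and returns a value to maximize; a full SFFS implementation would track the best score per subset size, which this sketch glosses over.

```python
def sffs(all_features, criterion, max_features=None):
    """Simplified Sequential Floating Forward Selection.
    all_features: list of candidate feature indices.
    criterion: function mapping a feature subset to a score (higher is better)."""
    selected, best_score = [], float('-inf')
    while True:
        # Forward step: add the single feature that improves the criterion the most.
        remaining = [f for f in all_features if f not in selected]
        if not remaining or (max_features and len(selected) >= max_features):
            break
        scores = {f: criterion(selected + [f]) for f in remaining}
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best_score:
            break                                   # no forward improvement: stop
        selected.append(f_best)
        best_score = scores[f_best]
        # Backward (floating) steps: drop features while doing so improves the score.
        improved = True
        while improved and len(selected) > 2:
            improved = False
            for f in list(selected):
                trial = [g for g in selected if g != f]
                score = criterion(trial)
                if score > best_score:
                    selected, best_score, improved = trial, score, True
                    break
    return selected, best_score
```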

4.2 Model Creation and Anomaly Detection

Once a subset of the features is selected, the window data points only need to contain the feature-values that correspond to the chosen features. The windows can now be provided as input to the data mining algorithms. Using only normal windows, we train the classifier to produce a model of normal activity. For this we employ a profiling approach that does not require or make assumptions about malicious behavior, other than that it is "different" from normal behavior. Such an approach fits well with deployment in real production environments. For example, new servers are often "stress tested" for many days or weeks with actual or realistic workloads before they are deployed. During this period, the behavior of the (virtualized) server can be profiled, and substantially different behavior encountered post-deployment can be flagged as potentially malicious.

For any data mining task, the dataset can be divided into three sets: a training set, a validation set, and a test set. The training set is used to build the data mining model. Since we build a model of normal behavior, our training set is comprised of only normal windows, i.e., those that represent time periods in which there is no malicious activity. Once the model is created, its performance is evaluated on a validation set. Our validation set contains both normal and abnormal windows. We evaluate how well the model of normal behavior is able to distinguish abnormal windows (true detections), while making sure that normal windows from the validation set are not incorrectly identified as malicious (false alarms). By tuning the parameters of the intrusion detection process, it is possible to achieve a reasonable trade-off between true detections and false alarms. Thus, the validation set can be used to find good values for the unknown parameters (in our case, an optimal subset of features and the EWMA filter configuration). Once these parameters are chosen, they are evaluated on the test set to see how well the IDS (using those parameters) performs on previously unseen normal and abnormal windows.

Various data mining and machine learning techniques have been applied to the design of IDSs [34]. In this study, we employ two well-known techniques to build our VMM-based IDS, each taking a different approach to creating a model of normal behavior and performing anomaly detection: the distance-based K-Nearest Neighbor (KNN) algorithm and the density-based Local Outlier Factor (LOF) algorithm.

Several studies have applied KNN and LOF classifiers to the area of intrusion detection. Liao and Vemuri [28] employed KNN to categorize program behavior as either normal or intrusive, using the frequency of system calls. Lazarevic et al. [25] performed a study of several anomaly detection schemes, including KNN and LOF, in network intrusion detection using the DARPA 1998 dataset and real network data. They found that the most promising technique for detecting intrusions in the DARPA dataset was the LOF algorithm. In addition, when performing experiments on real network data, the LOF approach was very successful in picking out several interesting novel attacks that could not be detected using other state-of-the-art intrusion detection systems such as SNORT [1]. For our VMM IDS, we also find that LOF can be valuable when combined with KNN to detect novel attacks. Next, we describe these algorithms in more detail.

4.2.1 K-Nearest Neighbor Algorithm

The K-Nearest Neighbor (KNN) algorithm treats data points as vectors in a feature space. It consists of two main phases: (1) a model-creation or profiling phase, and (2) an anomaly detection phase. The profiling phase of the algorithm consists of simply storing the vectors of the training data points (in our case, windows of normal activity). In the anomaly detection phase, each (validation or test) data point is assigned a score, or decision value, indicating how "abnormal" it is, by calculating the sum of the distances to its k nearest neighbors. The farther a data point is from its k nearest neighbors, the more abnormal it is and the larger the decision value assigned to it. The distance between pairs of data points can be measured using different metrics, such as the Euclidean distance.
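A minimal sketch of this scoring, assuming scikit-learn's nearest-neighbor search (the paper does not name an implementation, so the library choice here is ours):

```python
from sklearn.neighbors import NearestNeighbors

def knn_anomaly_scores(train_normal, test, k=5):
    """Score each test window by the sum of Euclidean distances
    to its k nearest neighbors among the normal training windows."""
    nn = NearestNeighbors(n_neighbors=k).fit(train_normal)
    distances, _ = nn.kneighbors(test)      # shape: (num_test_windows, k)
    return distances.sum(axis=1)            # larger score = more abnormal
```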

4.2.2 Local Outlier Factor Algorithm

The Local Outlier Factor (LOF) algorithm [4] takes a density-based approach to anomaly detection. Anomalous data points are also referred to as outliers. The strength of this algorithm lies in its ability to identify local, as well as global, outliers. A local outlier is a data point that is outlying when compared to its surrounding local neighborhood. In the LOF algorithm, the density of a data point is compared to that of its neighborhood and, based on this, the point is assigned a degree of being an outlier, known as its local outlier factor. The size of the neighborhood under consideration is determined by the parameter k.

The profiling phase of the LOF algorithm consists of calculating the density of all the training data points. In the anomaly detection phase, the LOF score of each (validation or test) data point is found by dividing the average density of its k nearest neighbors by its own density. Intuitively, the greater the density of a data point's neighbors relative to its own density, the more outlying the data point is and hence, the higher the LOF score (decision value) assigned to it. In the case of intrusion detection, if a malware sample tries to imitate normal behavior (for the most part) and only differs from it slightly, due to its malicious nature, the LOF algorithm should be able to detect this relative deviation.
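A corresponding sketch for LOF, again assuming scikit-learn: with `novelty=True` the model is fit on normal windows only and then scores previously unseen windows, and we negate `score_samples` so that larger values mean more anomalous, matching the decision-value convention above.

```python
from sklearn.neighbors import LocalOutlierFactor

def lof_anomaly_scores(train_normal, test, k=20):
    """Fit LOF on normal windows only, then score unseen windows.
    Returns values where larger means a stronger (local) outlier."""
    lof = LocalOutlierFactor(n_neighbors=k, novelty=True).fit(train_normal)
    return -lof.score_samples(test)   # score_samples: higher = more normal
```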

As we mentioned, an advantage of using anomaly detection is its ability to identify novel attacks, which stems from the method by which the classifier is trained. Rather than profiling using a set of normal and abnormal data points, the algorithm performs profiling using a set of only normal data points. In this way, it is used to build a model of typical, normal behavior. Subsequently, any deviations from the normal model are flagged as anomalous. In an environment where normal execution tends to change and evolve, it is important to use an algorithm that can dynamically update the model of normal behavior so as to reduce the false alarm rate. A nice feature of the LOF algorithm is that it lends itself to an incremental version [35] that does just that.[12]

[12] In this work, we used the original LOF algorithm.

Once we train the model on a dataset consisting of only normal data, we assign decision values to data points from a test set consisting of both normal and abnormal data. The decision values provide a measure of how abnormal the test data points are with respect to the model of normal execution. Based on these decision values, we can now choose whether or not to raise an alarm. We describe the method by which we do so in the next section.

4.3 Raising an Alarm

When we suspect that an attack has occurred, we raise an alarm. The decision values assigned to the time-based windows are what we use to decide whether to raise the alarm. As we described in section 3.2, these windows are found by sampling the event stream. Since it is not clear where one phase of behavior (either normal or abnormal) ends and the next one begins, we must deal with the issue of over-sampling or under-sampling. This introduces noise into the time-based windows. For the case of normal windows, this noise can cause one or two contiguous windows to appear to be outliers, increasing the probability that a high decision value is assigned to them and resulting in a false alarm. To address this issue, we apply a filter to the sequence of decision values to help smooth out the values and lower the false alarm rate. This allows us to filter out the noise introduced by sampling the time-based windows, while remaining sensitive to the intrusion.

4.3.1 Exponentially Weighted Moving Average Filter

The exponentially weighted moving average (EWMA) model is often used to smooth out fluctuations in time series data. Given a sequence of decision values, the EWMA model applies a set of weights to them that decrease exponentially the further back they are in time, giving more importance to recent values while not entirely discarding previous values. The sequence of EWMA values is found using the following formula:

EWMA_i = α · d_value_i + (1 − α) · EWMA_{i−1}    (1)

where EWMA_i denotes the EWMA value at time i, d_value_i is the decision value at time i, and α determines the weight associated with more recent decision values. An α close to 1 gives more importance to recent decision values, whereas an α close to 0 better reflects values seen in the past. α can be expressed as a function of the effective filter width, N, as α = 2 / (N + 1).

If the EWMA value is higher than a threshold, we suspect an attack has occurred and raise the alarm. To illustrate the method by which we raise an alarm, we provide a time series plot of the decision values versus the window number in Figure 4 for the Exchange.Heavy workload. Normal windows are shown with the ◦ symbol and abnormal windows are represented with the + symbol. The solid, horizontal line corresponds to the threshold, set at 1.81. The dashed line represents the EWMA values, calculated using an effective filter width (N) of 40. Note that window 112 is assigned a high decision value (9.92), which causes the EWMA value to rise to 1.50. Since this does not exceed the threshold, we refrain from raising an alarm at that point and thereby avoid a false alarm. Instead, we raise the alarm at window 304, where the EWMA reaches a value of 1.95.
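A minimal sketch of the smoothing and thresholding in equation (1), with α = 2/(N + 1); initializing the EWMA with the first decision value is our assumption, and the default N and threshold simply mirror the Figure 4 example.

```python
def ewma_alarms(decision_values, N=40, threshold=1.81):
    """Smooth per-window decision values with an EWMA (alpha = 2 / (N + 1))
    and return the indices of the windows at which an alarm would be raised."""
    alpha = 2.0 / (N + 1)
    ewma, alarms = None, []
    for i, d in enumerate(decision_values):
        ewma = d if ewma is None else alpha * d + (1 - alpha) * ewma
        if ewma > threshold:
            alarms.append(i)        # window i pushes the smoothed value over the threshold
    return alarms
```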

[Figure omitted: decision value versus window number for the Exchange.Heavy workload, showing normal (◦) and abnormal (+) windows, the threshold at 1.81, and the EWMA curve computed with N = 40]

Figure 4: Time series plot illustrating the alarm-raising mechanism

To summarize, the approach of the back-end for detecting malware using information extracted at the VMM-level consists of the following main steps:

1. Use Sequential Floating Forward Selection to select features that can best characterize normal behavior and distinguish normal from abnormal behavior.

2. Apply data mining algorithms to build a model using normal data and assign a decision value to each new data point, quantifying how abnormal it is relative to normal data points.

3. Use an exponentially weighted moving average (EWMA) model to smooth out fluctuations in the decision values and determine when to raise an alarm.

The steps described above fit nicely into our vision in which our VMM IDS is automatically calibrated and customized to the behavior of the virtual appliance. In our current work, we relax the constraint of utilizing only normal data in two of the steps: feature reduction and choosing an optimal EWMA filter configuration. In other words, we use both the normal and abnormal data to select the features and choose an EWMA filter, which we explain in section 5. Conversely, training of the data mining algorithms is done with normal data only. Once we have used the normal data to generate a model of normal behavior, without any prior knowledge of malicious behavior, we evaluate the model using abnormal data to determine an optimal set of features and the filter configuration. This is similar to a real system where we look for sensitivity to different parameters and adjust the system accordingly. Therefore, our approach is not strictly unsupervised; training of the classifier is done using one-class learning, while the IDS calibration uses supervised learning. Our future work will include extending the one-class learning approach to feature reduction and filter selection.

5. EVALUATION

In this section, we provide evidence that an effective VMM IDS can be constructed. The section is organized as follows. First, we review our experimental setup. Next, we discuss the features selected and the information extracted. Finally, we present the accuracy results we obtain using our two data mining techniques and discuss the advantages of using these techniques together.

5.1 Experimental Setup

As our VMM, we use VirtualBox [15], a hosted virtualization environment (i.e., the VMM runs on a commodity operating system). We use the open-source edition 2.2, currently developed by Sun Microsystems as part of its Sun xVM virtualization platform. All the experiments are executed on a Dell XPS 710 equipped with an Intel Core 2 processor (2 cores) running at 1.86 GHz with 4 GB of RAM. The host operating system is Windows XP (SP2).

The target deployment for our VMM-based IDS is to secure virtual machine appliances. Each appliance is usually prepackaged with a commodity OS (Windows or Linux) and a software stack that is configured to perform one or more specific tasks [32]. To this end, we set up different classes of servers, virtual appliances, and workloads, as shown in Table 4. These systems are chosen to reflect a broad range of behaviors (CPU-intensive workloads, disk accesses, network IO, etc.).

Server | Virtual Appliance | Workload(s)
Database management system | MySQL Server + Windows XP (SP3) | On-line transaction processing benchmark (TPC-C like)
Web Server | Apache HTTP Server + Windows XP (SP3) | Apache bandwidth benchmark (ab)
EMail Server | Exchange Server + Windows Server 2003 (SP2) | Microsoft Exchange Load Simulator (LoadSim)

Table 4: Normal workload (appliances)

The TPC-C like workload is an on-line transaction processing benchmark [43]. It generates a pseudo-random sequence of client accesses that create a stream of random reads and writes. The ab workload is the Apache HTTP server benchmarking tool [2]. It is designed to measure performance in HTTP requests per second. We create a mail server using Microsoft Exchange. LoadSim [30] is a benchmarking tool that simulates clients of an Exchange server. We configure LoadSim to generate two distinct workloads. In the first workload, Exchange.Light, we configure LoadSim to simulate 32 clients with a medium load. In the second workload, Exchange.Heavy, we configure LoadSim to simulate 64 clients with a heavy load.

We use these virtual appliances and more than 300 real-world malicious executables to generate our normal and abnormal workloads. All malware binaries are taken from Malfease, an online repository [16], and are unique by their MD5 checksum. They include very recent (2008 & 2009), real, and unknown (zero-day) malware. Anti-virus software equipped with a recently updated signature database was able to identify about 70% of the malware. In a recent study, we characterized the malware into four main classes of attacks, based on their general behavior. These classes are: (1) Trojan Horses, (2) Downloaders, (3) Backdoors, and (4) Infostealers.

The normal workload is generated by simply executing our server (and its workload generator) for a period of 1 hour. The abnormal workload is generated by executing our server and, at a predefined point in time, injecting and executing the binary of a real-world malware sample. The abnormal executions are each 15 minutes long and contain both normal and abnormal windows. They consist of a 10-minute period of normal activity, after which a script initiates the malware's executable. This is followed by an additional 5 minutes of execution time.

For each execution, the stream of events extracted at the VMM-level can be used to construct features and discarded thereafter. This enables the IDS to be deployed in an efficient, online manner. Furthermore, in a real production environment, the feature reduction and normal model creation tasks need only be performed periodically, allowing the corresponding features and model to be used until such time as it is deemed necessary to update them. Hence, once the VMM-level events are extracted and used to construct the (selected) features, anomaly detection can take place to determine whether the resulting window of feature-values contains malicious activity.

5.2 Feature Analysis

As described previously, we use Sequential Floating Forward Selection (SFFS) to select features that best characterize the normal behavior and which are able to distinguish normal from abnormal activity. In this study, we use LOF as the data mining algorithm incorporated in the feature reduction task. Once the features are selected, we build a model of normal execution and perform anomaly detection using KNN, in addition to LOF. In this way, we can see how well the features chosen to optimize the LOF results generalize to another data mining technique (KNN).

The criterion we chose to optimize during feature reductionis the minimum (Euclidean) distance to the ideal operat-ing point on the Receiver Operating Characteristic (ROC)curve. An ROC curve is a plot of the true detection rateversus the false alarm rate. It is a technique for visualiz-ing, organizing and selecting classifiers based on their per-formance [9]. The ideal operating point on an ROC curveis the point (0,1), where the false alarm rate is 0% and thedetection rate is 100%. Each of our abnormal traces can bedivided into two parts: (1) all the windows from the begin-ning until the point where the malware was injected, and(2) the windows from the malware injection point until theend. Since there is no malicious activity in the left part ofthe trace, we can treat it as a “normal” data point. The IDSmust not raise an alarm during this part of the trace, oth-erwise it will have produced a false alarm.13 The malwareis injected during the right portion of the trace and hence itcan be treated as an “abnormal” data point. The IDS mustraise at least one alarm during this portion of the trace, oth-erwise it will have produced a false negative (the inability tocorrectly raise an alarm when it should have been raised).With this viewpoint of an abnormal trace, we can produceproper ROC curves for the collection of abnormal traces.

Each point on the ROC curve is found by varying the threshold value on the EWMA filter output. Since the effective filter width (N) also affects the true detection and false alarm rate, we vary N from 1 to 50 and produce the corresponding ROC plots. Then, we find the distance from all the points on the ROC curves to the point (0,1). The smallest distance provides an indication of how well the set of features performed. Figure 5 shows a generic example of an ROC curve. The line y = x represents the true detection rate versus the false alarm rate of a classifier that randomly guesses the class of each data point. A viable classifier is one whose corresponding ROC curve is above this line and, preferably, comes close to the ideal operating point P. Point A is an operating point on the ROC curve. The distance of point A from point P (the length of line segment AP) provides an indicator of the optimality of operating point A. The smaller the distance, the better the operating point A on the ROC curve. We select the optimal operating point on the curve as the point with the smallest distance to the ideal operating point P.14 If two points on the ROC curve have the same distance to point P, we choose the operating point with the lower false alarm rate.

13 Since the left portion of the trace is equivalent to a single normal data point, multiple alarms raised during this portion of the trace are counted as a single false alarm.

14 For symmetric or near-symmetric ROC curves, such a point will roughly strike a balance between the false positive rate (i.e., false alarm rate) and the false negative rate (i.e., 1 − true detection rate) [17].

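A minimal sketch of this operating-point search follows, assuming the per-window anomaly scores and the malware-injection index are available for each abnormal trace; the EWMA formulation (alpha = 2/(N+1)) and the threshold grid are assumptions made for illustration.

import numpy as np

def ewma(scores, n):
    # Exponentially weighted moving average with effective width n
    # (alpha = 2 / (n + 1) is a common convention; assumed here).
    alpha, out, acc = 2.0 / (n + 1), [], 0.0
    for s in scores:
        acc = alpha * s + (1 - alpha) * acc
        out.append(acc)
    return np.array(out)

def best_operating_point(traces, thresholds, widths=range(1, 51)):
    # traces: list of (per-window scores, injection index) for the abnormal traces.
    # Returns (distance to (0,1), false alarm rate, detection rate, threshold, N).
    best = None
    for n in widths:
        filtered = [(ewma(scores, n), inj) for scores, inj in traces]
        for t in thresholds:
            fa = np.mean([np.any(f[:inj] > t) for f, inj in filtered])   # alarms before injection
            td = np.mean([np.any(f[inj:] > t) for f, inj in filtered])   # alarms after injection
            dist = np.hypot(fa - 0.0, td - 1.0)    # Euclidean distance to the ideal point (0, 1)
            candidate = (dist, fa, td, t, n)
            if best is None or candidate < best:   # ties resolved toward the lower false alarm rate
                best = candidate
    return best

Note that the lexicographic tuple comparison implements the tie-breaking rule above: two points at the same distance from P are resolved in favor of the one with the lower false alarm rate.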

Figure 5: Choosing an optimal operating point on the ROC curve (false alarm rate on the x-axis, true detection rate on the y-axis; the diagonal y = x corresponds to a random classifier, P is the ideal point (0, 1), and A is an operating point on the curve)

Feature reduction was performed for each of our workloads. The features themselves possess low-level architectural information which, for the most part, can be difficult to correlate to high-level program behavior. That is where feature reduction can be a valuable tool to reduce the space of possible features, removing irrelevant and redundant features while maintaining sufficient information to help distinguish normal from malicious behavior. We analyzed the features chosen and made several observations. Most of the selected features correspond to events related to page faults, task switches, privilege level changes, disk IO, and network IO. Both rate and relationship features were chosen as they each provide different and complementary information. A list of the features selected for each workload, along with a brief description of each, is provided in Table 5. We also specify whether a feature is based on VM events or VMM events and whether it is a rate or relationship feature.

Among the features for the Apache workload is a rate feature that counts the number of invalidate page events and a feature that shows the relationship between setting the current privilege level to 0 and network IO events. As Apache is a network-intensive workload, we expect to see features based on network IO events playing a role in characterizing its normal execution. A relationship feature chosen for the MySQL workload counts the number of runs for read events that caused a page fault versus write events that caused a page fault. Page faults can provide information about the memory access patterns of the workload. For MySQL, which is disk-intensive, it is intuitive that accesses to random locations on the disk would entail page faults due to the frequent swapping between memory and disk. A rate feature found in common among three of the workloads (MySQL, Exchange.Light, and Exchange.Heavy) counts the number of page fault events that occurred due to a page-level protection violation.



Feature | Feature Type | Short Description

Apache
CR3 WRITE | VM/Rate | Number of writes to control register 3 (CR3)
INVALIDATE PAGE | VM/Rate | Number of page invalidation events
CPL SET 0 VS NETWORKIO[numRuns] | VM/Relationship | Number of runs for setting the current privilege level to 0 events versus network IO events

MySQL
CR3 WRITE | VM/Rate | Number of writes to CR3
PAGE FAULT P 1 | VM/Rate | Number of page faults caused by a page-level protection violation
PAGE FAULT WR 0 VS PAGE FAULT WR 1[numRuns] | VM/Relationship | Number of runs for page faults due to reads vs. page faults due to writes
CR3 WRITE VS PAGE FAULT[MannWhitney] | VM/Relationship | Measure of the randomness between writes to CR3 and page fault events
CR3 WRITE VS TRAP[avgRunValue] | VM/Relationship | Average run value for writes to CR3 vs. trap events

Exchange.Light
PAGE FAULT P 1 | VM/Rate | Number of page faults caused by a page-level protection violation
PAGE FAULT US 0 | VM/Rate | Number of page faults that occur while in system mode
CR3 WRITE VS CR4 WRITE[avgRunValue] | VM/Relationship | Average run value for writes to CR3 vs. writes to CR4
TSS WRITE VS TLB FLUSH[MannWhitney] | VM/Relationship | Measure of the randomness between writes to the TSS and TLB flushes

Exchange.Heavy
PAGE FAULT P 1 | VM/Rate | Number of page faults caused by a page-level protection violation
PAGE FAULT US 0 | VM/Rate | Number of page faults that occur while in system mode
CR3 WRITE VS CR0 WRITE[maxRunValue] | VM/Relationship | Maximum run value for writes to CR3 vs. writes to CR0
CR3 WRITE VS CR4 WRITE[minRunValue] | VM/Relationship | Minimum run value for writes to CR3 vs. writes to CR4
CPL SET 0 VS DISKIO[MannWhitney] | VM/Relationship | Measure of the randomness between setting the current privilege level to 0 events and disk IO events
EXECUTION MODE ENTER REM VS PAGE FAULT US 0[MannWhitney] | VMM/Relationship | Measure of the randomness between events related to entering recompiled execution mode and page fault events that occur in system mode

Table 5: Features selected for each workload

The MySQL and Apache workloads chose a rate feature that counts the number of writes to control register 3 (CR3), indicating that the rate of context switches can be useful in distinguishing malicious from normal activity. The Exchange workloads both selected relationship features that measure the length of consecutive writes to one control register when interleaved with writes to another control register (e.g., writes to CR3 vs. CR4). They also selected a rate feature that counts the number of page faults that occurred while in supervisor mode (i.e., ring 0).
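The exact definitions of the rate and relationship features are not reproduced here, so the following sketch is only one plausible reading of Table 5: a rate feature counts occurrences of an event type in a window, a numRuns feature counts maximal runs when only two event types are considered, avgRunValue averages the run lengths, and the MannWhitney variants apply the Mann-Whitney U test to the positions at which the two event types occur.

from itertools import groupby
from scipy.stats import mannwhitneyu

def rate_feature(window, name):
    # Rate feature: number of occurrences of one event type (e.g., CR3 WRITE).
    return sum(1 for e in window if e == name)

def num_runs(window, a, b):
    # numRuns relationship feature: number of maximal runs after keeping only
    # events of types a and b and grouping consecutive repetitions.
    seq = [e for e in window if e in (a, b)]
    return sum(1 for _ in groupby(seq))

def avg_run_value(window, a, b):
    # avgRunValue relationship feature: average length of those maximal runs.
    seq = [e for e in window if e in (a, b)]
    runs = [len(list(g)) for _, g in groupby(seq)]
    return sum(runs) / len(runs) if runs else 0.0

def mann_whitney_randomness(window, a, b):
    # MannWhitney relationship feature: how randomly a and b are interleaved,
    # measured by comparing the positions (ranks) at which each type occurs.
    pos_a = [i for i, e in enumerate(window) if e == a]
    pos_b = [i for i, e in enumerate(window) if e == b]
    if not pos_a or not pos_b:
        return 0.0
    u, _ = mannwhitneyu(pos_a, pos_b, alternative="two-sided")
    return u / (len(pos_a) * len(pos_b))   # normalize U to [0, 1]

# Example on a toy window of event names:
window = ["CR3_WRITE", "PAGE_FAULT", "PAGE_FAULT", "CR3_WRITE", "TRAP"]
rate_feature(window, "CR3_WRITE")               # -> 2
num_runs(window, "CR3_WRITE", "PAGE_FAULT")     # runs: CR3 | PF PF | CR3 -> 3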

We also observed that VM events are predominantly used to construct the selected features. For the set of workloads that we studied, only one feature based on a VMM event was selected, and this occurred for only one workload (Exchange.Heavy). It is a relationship feature that applies the Mann-Whitney test to evaluate the random interleaving of the following two events: (1) an event indicating that the VMM has entered a different state of execution wherein it intercepts system-level requests and recompiles code, and (2) page faults that occurred while executing in supervisor mode.

In our analysis, we have seen correlation between VM and VMM events, which may explain why the features selected are mostly those based on VM events. In other words, features based on VMM events do not seem to provide much additional information. Since most of the VMM events are specific to the VMM in our study (VirtualBox), by removing these events and their corresponding features, we can still achieve a good detection rate. This shows the robustness of our IDS with respect to the underlying VMM. As it does not need to be tied to a specific VMM and the majority of the collected events can be found in VMMs similar to Xen, it should produce comparable results across various VMMs.

Based on the features selected, we determined that not all events are equally important. Filtering out unnecessary or infrequently used events (with respect to the selected features) can optimize the execution performance of the VMM IDS front-end.

5.3 Experimental Results
Once the features are selected, we run experiments using the KNN and LOF algorithms to evaluate their ability to differentiate abnormal from normal execution. In order to find the best value for the parameter k (size of the neighborhood), an appropriate distance metric, and an accurate filter configuration (threshold and effective filter width) to use for each technique, we run the data mining algorithms on the validation set. The validation set consists of a randomly chosen 90% of the approximately 300 abnormal traces (each containing a malicious attack). Using the validation set, we determine appropriate values to which we should set the parameters of the IDS. The remaining 10% of the abnormal traces comprise our test set. We use this data to evaluate the ability of the IDS to accurately identify unseen malware. The training set consists of five normal traces in which no malware has been inserted, for the purpose of building a model of normal activity.

Once the parameters of the IDS have been set using the validation data, we are ready to test our IDS against unseen malicious attacks. We use the filter configuration found from the validation set; we apply the IDS using this filter configuration to our test set of feature traces. As discussed in section 5.2, we measure performance in terms of the true detection and false alarm trade-off achieved. In addition, we are concerned with how quickly the IDS is able to flag a malicious attack. To this end, we produce a plot of the true detection rate versus the time-to-detection (in units of windows). In an abnormal trace, an alarm raised before the injection of the malware will result in a false alarm. The time-to-detection for such a trace would effectively be a negative value. Even if an alarm is also raised at some point after the injection of the malware, this still does not justify counting it as a successful malware detection. Hence, when producing the time-to-detection results, we restrict ourselves to showing the results for only those traces in which the IDS was able to do everything right: it raised an alarm after the injection of the malware and not before it. The x-axis represents the delay from the point where the malware is injected until the alarm is raised. The y-axis provides the fraction of traces for which an alarm is correctly raised during that time. Figure 6 presents this timeliness plot for the MySQL workload. For MySQL, the IDS using the LOF algorithm is able to detect 100% of the malware instances with a false alarm rate of 6%. Thus, in the timeliness plot, we show the time-to-detection for the remaining 94% of the traces wherein no false alarm was raised.

Figure 6: Timeliness plot for the MySQL test traces (detection rate vs. time-to-detection in windows; validation configuration: threshold = 2.08, N = 49)

Note that although the malware is injected at a specific point in time during a normal workload execution, it may be the case that the malware does not begin showing abnormal behavior until a later time, delaying the time-to-detection by the data mining algorithms. Thus, to describe what happens in the vast majority (95%) of the cases, we calculate the time-to-detection at the 95% level. This gives us an indication of the delay (from the injection point of the malware) to detect a malicious attack in 95% of the tests where we correctly raise a true alarm, without raising a false alarm. The detection and false alarm results on the test set, along with the time-to-detection at the 95% level, are presented in Table 6.
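As a hedged illustration, assuming we have recorded, for every test trace that raised a correct alarm and no false alarm, the delay (in windows) between the malware injection and the first alarm, the timeliness curve and the 95%-level time-to-detection could be computed as follows.

import numpy as np

def timeliness_curve(delays, max_delay):
    # delays: first-alarm delay in windows for each correctly handled trace.
    # Returns (delay, fraction of those traces detected within that delay).
    delays = np.asarray(delays)
    return [(d, float(np.mean(delays <= d))) for d in range(max_delay + 1)]

def time_to_detection_95(delays):
    # Smallest delay d such that at least 95% of the delays are <= d.
    ordered = sorted(delays)
    idx = int(np.ceil(0.95 * len(ordered))) - 1
    return ordered[max(idx, 0)]

Under this reading, the 95%-level figures in Table 6 (e.g., 11 windows for MySQL) correspond to the output of time_to_detection_95 on the per-trace delays.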

The overall LOF results are very promising. On the Apache workload, which exhibits a more stable normal execution behavior, LOF achieves optimal results; 100% of the malware are detected with no false alarms. In addition, for at least 95% of the traces, the malware is detected within the first window of its execution. LOF is also able to detect all of the malware executed on the MySQL workload, at a cost of 6% false alarms and (at the 95% level) within 11 windows (about 22 seconds) from the introduction of the malware. For the Exchange workloads, over 80% of the malware are detected with 3% false alarms and (at the 95% level) at most 32 windows (64 seconds) from where the malware was injected.

Server | Detections | False Alarms | Time-To-Detection (95% Level, in windows)

Local Outlier Factor
Apache | 100% | 0% | 1
MySQL | 100% | 6% | 11
Exchange.Light | 84% | 3% | 14
Exchange.Heavy | 87% | 3% | 32

K-Nearest Neighbor
Apache | 100% | 13% | 1
MySQL | 100% | 6% | 9
Exchange.Light | 94% | 3% | 14
Exchange.Heavy | 73% | 17% | 79

Table 6: Accuracy and time-to-detection results on the test dataset


Although the features chosen during the feature reduction task are those selected to optimize LOF performance, we see that they are still able to provide KNN with the ability to detect a large percentage of the malware, at times with a higher false alarm rate. In the case of the Exchange.Light workload, KNN is able to identify a greater percentage of the malware with the same false alarm rate as LOF. The reason for this could be that the closest normal windows to the malicious windows are in low-density areas (due to the lighter load of Exchange.Light). This would cause them to be assigned a low LOF decision value, making them indistinguishable from normal behavior. KNN, on the other hand, takes a distance-based approach to identifying anomalous windows and hence determines that, according to the distances to its k-nearest neighbors, the malicious windows warrant a high KNN decision value. This results in a higher value of the EWMA filter, increasing the probability of correctly raising an alarm. By combining the output of KNN and LOF, it is possible to achieve a higher detection rate, which we pursue in future work.

For the Exchange.Light workload, the IDS using the LOF algorithm achieves a true positive rate of 84% with a false alarm rate of 3%. This means that in the worst case, out of the 84% of the traces in which an alarm was correctly raised after the injection of the malware, 3% of them may have also raised a false alarm.15 In this case, at least 81% of the true positives were instances where the IDS was able to correctly raise an alarm without raising a false alarm. Considering the difficulty of the problem and the extremely limited information available at the VMM level, these results are remarkably encouraging.

Given the operating point16 at which we present our results, we achieve a high detection rate with a moderate false alarm rate. We can also back off on the detection rate in favor of improving the false alarm rate. In addition, we can apply more aggressive filtering and thresholding to further reduce the false alarm rate. The ultimate choice of operating point and detection/false alarm trade-off is determined by the application and can be selected accordingly.

15 Upon further examination of the results, there was only one trace which raised a false alarm, and it happened to have raised a true alarm as well.

16 The operating point determines the threshold and filter configuration of the system.



6. DISCUSSION
In this section, we discuss several issues that arose during our work.

Generalization of Results and IDS Robustness - In our work, we trained the classifier on 90% of the data and tested on the remaining 10% (over 30 malware samples). In future work, we will perform additional testing to determine if the results are consistent and evaluate how well they generalize. To study the robustness of the VMM IDS, we performed cross-validation experiments wherein we calibrated the IDS on three of the four classes of malware (described in section 5.1) and tested on the fourth class. The IDS had an average of 94% true detections at a cost of about 9% false alarms. This shows the robustness of our approach in terms of identifying unseen classes of malware.
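A simple sketch of this leave-one-class-out protocol is shown below; the class names follow section 5.1, while calibrate and evaluate are placeholders for the validation and testing procedures of section 5.3, not functions defined in this work.

def leave_one_class_out(traces_by_class, calibrate, evaluate):
    # traces_by_class maps a malware class (e.g., "Trojan Horse", "Downloader",
    # "Backdoor", "Infostealer") to its abnormal traces.
    results = {}
    for held_out in traces_by_class:
        train = [t for cls, ts in traces_by_class.items()
                 if cls != held_out for t in ts]
        params = calibrate(train)                       # tune the IDS on three classes
        results[held_out] = evaluate(params, traces_by_class[held_out])  # test on the fourth
    return results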

VMM IDS Applicability - As described above, our intuition led us to start evaluating our IDS on controlled environments such as software appliances. These environments, while limited, are common in data centers and cloud computing. In future work, we will evaluate our IDS on more complex, heterogeneous, and versatile environments. For example, in a data center, the workload (and consequently the behavior of the system) may change during the day. Also, even a simple appliance will require software updates every so often. This introduces new, yet benign, behavior. We believe that by applying incremental variants of the machine learning algorithms and by performing online training17, we will be able to achieve a good detection rate in these more complex environments, at the cost of only a slight increase in the false alarm rate. This will be addressed in future work wherein we will test our IDS on variable workloads and the introduction of software updates. Current and future work will also aim at reducing the false positive rate. This can be done by improving the predictor, the quality of the features, and the number of events extracted. In addition, the IDS can be deployed as one component of a system that includes multiple detectors (OS IDSs as well as application IDSs). In such a case, multiple sources will need to agree before raising an alarm.

Evading the VMM IDS - A weakness shared by most types of IDSs is that an attacker can study the defense methods and create new attacks that evade the detectors. Although a VMM IDS is not immune to this kind of attack, we believe it would be much harder to accomplish. An attacker would need to generate a low-level footprint that is either identical to that of the executing normal workload or is very light. This task is both difficult to accomplish and highly dependent on the target machine's normal workload. To successfully create a mimicry attack, the attacker would need to not only know the normal workload, but also the types of events that we monitor and the distribution of those events, which are reflected in the rate features. For the relationship features, he would need to mimic the order of the events, as well as the randomness with which the different events are interleaved. Many of the events that we track are generated by the OS, so not only should the attack be similar to the workload, but so should the interaction between the OS and the attack. It is not enough for the attack to generate a workload that is similar to the normal workload. The footprint of the whole system with the attack should stay the same. For example, in the case of a MySQL server, the attack must maintain the same rate of reads and writes to the disk. With an attack, there is likely to be a larger number of page faults, context switches (due to the inclusion of more processes), and other system-related events. The system as a whole will have a different footprint. If an attack is lightweight, it would be more difficult for the VMM IDS to identify. Nonetheless, based on the high detection rate of our IDS on the diverse set of approximately 300 malware samples, which range in their intensity of attack, we see that the VMM is still able to identify some lightweight attacks. Future research will need to specifically address lightweight attacks.

17 To limit the performance penalty of online training, which requires constant updating of the model, online training can be performed during non-peak hours of the day.


Timeliness - Timely detection is one of the main goals of this IDS. It is clear that a timely and early detection is preferable. Our IDS is able to detect most malware within a minute of its introduction. Although the detection is not always immediate, it is better to detect an attack after a few minutes than not at all. And while some damage can be done in the meantime, it is confined to one VM.

Response - Generating a response to an attack is an issue left for future work. This issue is not at all trivial, as little semantic information is available. Initial discussions led us to believe that OS support can be used to interpret the low-level data, identify the malware, and generate a report useful for system administrators.

Additionally, several actions can be taken to resolve the attack. For example, a breached guest VM can be put offline while the attack is analyzed, or it can be discarded and destroyed. Moreover, in many cases, appliances can roll back to a last known good configuration (a checkpoint). This action is especially easy to accomplish in a VM environment.

Execution Overhead - An analysis of the current execution overhead of the VMM IDS shows that, in terms of wall clock time, the event extraction currently results in about a 10% performance degradation. In other words, adding event extraction to the VMM results in approximately 10% longer execution time. In future work, we can further reduce the execution overhead. This can be accomplished by filtering out unimportant events, thereby removing the need to extract them. Such a process can be guided by the feature reduction task, by means of identifying which events were not used to construct any of the selected features.
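A small sketch of that idea follows, under the assumption that each selected feature can report the set of event types it depends on (an attribute we call f.events here for illustration).

def events_to_extract(selected_features):
    # Only event types referenced by at least one selected feature need to be
    # extracted by the VMM front-end; everything else can be dropped at the source.
    return set().union(*(f.events for f in selected_features))

def filter_events(event_stream, keep):
    # Lazily pass through only the events worth extracting.
    return (e for e in event_stream if e.name in keep)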

Back-end Performance - Both KNN and LOF are compute-intensive algorithms. While training can take minutes to hours to complete, much of this processing can be done offline. When this system is deployed on a live system, the back-end classification should not be the gating performance factor. A single classification run on an x86 dual core can take on the order of minutes. We have moved this classification to a Graphics Processing Unit, and classification is now completed in under a second.
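For reference, one way to obtain distance-based (KNN) and density-based (LOF) anomaly scores for feature windows with scikit-learn is sketched below; the neighborhood size and the random placeholder data are assumptions, and this is not the authors' GPU implementation.

import numpy as np
from sklearn.neighbors import NearestNeighbors, LocalOutlierFactor

rng = np.random.default_rng(0)
X_train = rng.random((5000, 5))   # placeholder: normal feature windows (rows) by selected features
X_test = rng.random((100, 5))     # placeholder: windows to be scored

k = 10  # placeholder neighborhood size; in the paper, k is chosen on the validation set

# KNN anomaly score: distance to the k-th nearest normal window (larger = more anomalous).
knn = NearestNeighbors(n_neighbors=k).fit(X_train)
knn_scores = knn.kneighbors(X_test)[0][:, -1]

# LOF anomaly score: local density relative to the neighbors, negated so that
# larger values indicate more anomalous windows.
lof = LocalOutlierFactor(n_neighbors=k, novelty=True).fit(X_train)
lof_scores = -lof.score_samples(X_test)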

7. CONCLUSIONS AND FUTURE WORK
A VMM-based IDS increases the ease of deployment across different operating systems and versions and, as part of a VMM, offers high manageability for server appliances. VMM-based IDSs break the boundaries of current state-of-the-art IDSs. They represent a new point in the IDS design space that trades a lack of program semantics for greater malware resistance and ease of deployment.

In this work, we implemented and evaluated a VMM-based IDS. The open source edition of VirtualBox was used to construct the front-end. We presented the types of information we were able to extract from the VMM and described the procedure used to build features. We also provided an analysis of the important features selected by our feature reduction technique, and discussed the corresponding events that were integral for distinguishing normal from abnormal behavior. Data mining algorithms were utilized as powerful techniques to bridge the semantic gap between the low-level architectural data and actual program behavior. Our results showed that there is enough information embedded within the VMM-level data to be processed and mined to accurately detect an average of 93% of real-world malicious attacks on server appliances, and to do so in a timely fashion, at a cost of only 3% false alarms.

Our VMM IDS offers a wealth of future research opportunities. Our front-end can be extended to other VMMs such as Xen or ESX Server, and its performance can be significantly improved. Additional events can be extracted and used to build new features. Our back-end provides a whole new research dimension. Many data mining algorithms can be evaluated for effectiveness, as well as performance. In particular, we intend to study unsupervised data mining algorithms and an incremental version of the LOF algorithm. This will allow us to dynamically update our model of normal behavior and build a more robust classifier with a potentially lower false alarm rate.

8. REFERENCES
[1] The SNORT Network IDS. www.snort.org.
[2] The Apache Software Foundation. ab - Apache HTTP server benchmarking tool. http://www.apache.org/.
[3] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the art of virtualization. In SOSP '03: Proceedings of the nineteenth ACM symposium on Operating systems principles, pages 164-177, New York, NY, USA, 2003. ACM Press.
[4] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander. LOF: Identifying density-based local outliers. SIGMOD Rec., 29(2):93-104, 2000.
[5] P. M. Chen and B. D. Noble. When virtual is better than real. In HOTOS '01: Proceedings of the Eighth Workshop on Hot Topics in Operating Systems, page 133, Washington, DC, USA, 2001. IEEE Computer Society.
[6] D. A. S. de Oliveira, J. R. Crandall, G. Wassermann, S. Ye, S. F. Wu, Z. Su, and F. T. Chong. Bezoar: Automated Virtual Machine-based Full-System Recovery from Control-Flow Hijacking Attacks. In IEEE/IFIP Network Operations and Management Symposium, Salvador-Bahia, Brazil, April 2008.
[7] D. E. Denning. A lattice model of secure information flow. Commun. ACM, 19(5):236-243, 1976.
[8] A. Dinaburg, P. Royal, M. Sharif, and W. Lee. Ether: malware analysis via hardware virtualization extensions. In CCS '08: Proceedings of the 15th ACM conference on Computer and communications security, pages 51-62, New York, NY, USA, 2008. ACM.
[9] T. Fawcett. An introduction to ROC analysis. Pattern Recogn. Lett., 27(8):861-874, June 2006.
[10] D. Gao, M. K. Reiter, and D. Song. On gray-box program tracking for anomaly detection. In Proceedings of the 13th USENIX Security Symposium, pages 103-118, San Diego, CA, USA, Aug. 9-13 2004.
[11] T. Garfinkel and M. Rosenblum. A virtual machine introspection based architecture for intrusion detection. In Proc. Network and Distributed Systems Security Symposium, February 2003.
[12] J. L. Griffin, A. G. Pennington, J. S. Bucy, D. Choundappan, N. Muralidharan, and G. R. Ganger. On the feasibility of intrusion detection inside workstation disks. Technical Report CMU-PDL-03-106, Carnegie Mellon University, 2003.
[13] G. Gu, R. Perdisci, J. Zhang, and W. Lee. BotMiner: Clustering analysis of network traffic for protocol- and structure-independent botnet detection. In Proceedings of the 17th USENIX Security Symposium (Security'08), 2008.
[14] S. A. Hofmeyr, S. Forrest, and A. Somayaji. Intrusion detection using sequences of system calls. Journal of Computer Security, 6(3):151-180, 1998.
[15] Innotek. Innotek VirtualBox. http://www.virtualbox.org/.
[16] ISC OARC. Project Malfease. https://malfease.oarci.net/.
[17] T. Ishioka. Evaluation of criteria on information retrieval. Syst. Comput. Japan, 35(6):42-49, 2004.
[18] X. Jiang, X. Wang, and D. Xu. Stealthy malware detection through VMM-based "out-of-the-box" semantic view reconstruction. In CCS '07: Proceedings of the 14th ACM conference on Computer and communications security, pages 128-138, New York, NY, USA, 2007.
[19] H. Jin, G. Xiang, F. Zhao, D. Zou, M. Li, and L. Shi. VMFence: a customized intrusion prevention system in distributed virtual computing environment. In ICUIMC '09: Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication, pages 391-399, New York, NY, USA, 2009. ACM.
[20] S. T. Jones, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. Antfarm: tracking processes in a virtual machine environment. In ATEC '06: Proceedings of the annual conference on USENIX '06 Annual Technical Conference, Berkeley, CA, USA, 2006. USENIX Association.
[21] S. T. Jones, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. VMM-based hidden process detection and identification using Lycosid. In VEE '08: Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, pages 91-100, New York, NY, USA, 2008. ACM.
[22] E. Kirda, C. Kruegel, G. Banks, G. Vigna, and R. Kemmerer. Behavior-based Spyware Detection. In Proceedings of the 15th USENIX Security Symposium, Vancouver, BC, Canada, Aug. 2006.
[23] A. P. Kosoresow and S. A. Hofmeyr. Intrusion detection via system call traces. IEEE Softw., 14(5):35-42, 1997.
[24] M. Laureano, C. Maziero, and E. Jamhour. Protecting host-based intrusion detectors through virtual machines. Comput. Netw., 51(5):1275-1283, 2007.
[25] A. Lazarevic, L. Ertoz, V. Kumar, A. Ozgur, and J. Srivastava. A comparative study of anomaly detection schemes in network intrusion detection. In Proceedings of the Third SIAM International Conference on Data Mining, 2003.
[26] W. Lee, S. J. Stolfo, and P. K. Chan. Learning patterns from Unix process execution traces for intrusion detection. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management, pages 50-56. AAAI Press, 1997.
[27] W. Lee, S. J. Stolfo, and K. W. Mok. A data mining framework for building intrusion detection models. In IEEE Symposium on Security and Privacy, pages 120-132, 1999.
[28] Y. Liao and V. R. Vemuri. Use of k-nearest neighbor classifier for intrusion detection, 2002.
[29] D. A. Menasce. Virtualization: Concepts, applications, and performance modeling, 2005.
[30] Microsoft. Microsoft Exchange Server 2003 Load Simulator (LoadSim). http://www.microsoft.com/.
[31] A. C. Myers. JFlow: practical mostly-static information flow control. In POPL '99: Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 228-241, New York, NY, USA, 1999. ACM Press.
[32] NEI. Network Engines. http://www.nei.com/.
[33] J. Newsome and D. Song. Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In The 12th Annual Network and Distributed System Security Symposium, Feb. 3-4, San Diego, CA, USA, 2005.
[34] S. Peddabachigari, A. Abraham, C. Grosan, and J. Thomas. Modeling intrusion detection system using hybrid intelligent systems. Journal of Network and Computer Applications, 30(1):114-132, 2007.
[35] D. Pokrajac, A. Lazarevic, and L. J. Latecki. Incremental local outlier detection for data streams. In CIDM, pages 504-515, 2007.
[36] P. Pudil, F. J. Ferri, J. Novovicova, and J. Kittler. Floating search methods for feature selection with nonmonotonic criterion functions. In Proceedings of the Twelfth International Conference on Pattern Recognition, IAPR, pages 279-283, 1994.
[37] G. N. Purdy. Linux iptables - kurz & gut. O'Reilly, 08 2004.
[38] C. Sapuntzakis and M. S. Lam. Virtual appliances in the collective: A road to hassle-free computing. In HOTOS '03: Proceedings of the 9th conference on Hot Topics in Operating Systems, pages 55-60, Berkeley, CA, USA, May 2003. USENIX Association.
[39] M. G. Schultz, E. Eskin, E. Zadok, and S. J. Stolfo. Data mining methods for detection of new malicious executables. In Proceedings of the IEEE Symposium on Security and Privacy, pages 38-49, 2001.
[40] K. Scott and J. Davidson. Safe virtual execution using software dynamic translation. In ACSAC '02: Proceedings of the 18th Annual Computer Security Applications Conference, page 209, Washington, DC, USA, 2002. IEEE Computer Society.
[41] J. Seward and N. Nethercote. Using Valgrind to detect undefined value errors with bit-precision. In USENIX 2005 Annual Technical Conference, pages 17-30, Apr. 10-15, Anaheim, CA, USA, 2005.
[42] S. J. Stolfo, F. Apap, E. Eskin, K. Heller, S. Hershkop, A. Honig, and K. Svore. A comparative evaluation of two algorithms for Windows registry anomaly detection. J. Comput. Secur., 13(4):659-693, 2005.
[43] Transaction Processing Performance Council (TPC). TPC-C, an on-line transaction processing benchmark. http://www.tpc.org/.
[44] UML. User-mode Linux kernel. http://user-mode-linux.sourceforge.net/.
[45] N. Vachharajani, M. J. Bridges, J. Chang, R. Rangan, G. Ottoni, J. A. Blome, G. A. Reis, M. Vachharajani, and D. I. August. RIFLE: An architectural framework for user-centric information-flow security. In MICRO 37: Proceedings of the 37th annual International Symposium on Microarchitecture, pages 243-254, Washington, DC, USA, 2004. IEEE Computer Society.
[46] VMware. VMware. http://www.vmware.com/.
[47] D. Wagner and D. Dean. Intrusion detection via static analysis. In SP '01: Proceedings of the 2001 IEEE Symposium on Security and Privacy, page 156, Washington, DC, USA, 2001. IEEE Computer Society.
[48] T. Y. Wang, C. H. Wu, and C. C. Hsieh. A virus prevention model based on static analysis and data mining methods. In CIT Workshops '08: Proceedings of the 2008 IEEE 8th International Conference on Computer and Information Technology Workshops, pages 288-293, Washington, DC, USA, 2008. IEEE Computer Society.
[49] C. Warrender, S. Forrest, and B. A. Pearlmutter. Detecting intrusions using system calls: Alternative data models. In IEEE Symposium on Security and Privacy, pages 133-145, 1999.
[50] X. Zhang, Q. Li, S. Qing, and H. Zhang. VNIDA: Building an IDS architecture using VMM-based non-intrusive approach. In WKDD '08: Proceedings of the First International Workshop on Knowledge Discovery and Data Mining, pages 594-600, Washington, DC, USA, 2008. IEEE Computer Society.
