
GEORG-AUGUST-UNIVERSITÄT GÖTTINGEN
INSTITUTE OF COMPUTER SCIENCE

Technical Report IFI-TB-2013-02

DREBIN: Efficient and Explainable Detection of Android Malware in Your Pocket

Daniel Arp, Michael Spreitzenbarth, Malte Hübner, Hugo Gascon, and Konrad Rieck

TECHNICAL REPORTS OF THE INSTITUTE OF COMPUTER SCIENCE

ISSN 1611-1044

AUGUST 2013

Georg-August-Universität Göttingen
Institute of Computer Science

Goldschmidtstr. 7
37077 Göttingen
Germany

Phone: +49 (551) 39 - 172000
Fax: +49 (551) 39 - 14403
E-Mail: [email protected]
Web: www.ifi.informatik.uni-goettingen.de

DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket

Daniel Arp
University of Göttingen
Göttingen, Germany

Michael Spreitzenbarth
Siemens AG
München, Germany

Malte Hübner
University of Göttingen
Göttingen, Germany

Hugo Gascon
University of Göttingen
Göttingen, Germany

Konrad Rieck
University of Göttingen
Göttingen, Germany

Abstract

Malicious applications pose a threat to the security of the Android platform. The growing amount and diversity of these applications render conventional defenses largely ineffective, and thus Android smartphones often remain unprotected from novel malware. In this paper, we propose DREBIN, a lightweight method for detection of Android malware that enables identifying malicious applications directly on the smartphone. As the limited resources impede monitoring applications at run-time, DREBIN performs a broad static analysis, gathering as many features of an application as possible. These features are embedded in a joint vector space, such that typical patterns indicative of malware can be automatically identified and used for explaining the decisions of our method. In an evaluation with 123,453 applications and 5,560 malware samples, DREBIN outperforms several related approaches and detects 94% of the malware with few false alarms, where the explanations provided for each detection reveal relevant properties of the detected malware. On five popular smartphones, the method requires 10 seconds for an analysis on average, rendering it suitable for checking downloaded applications directly on the device.

1 Introduction

Android is one of the most popular platforms for smartphones today. With several hundred thousand applications in different markets, it provides a wealth of functionality to its users. Unfortunately, smartphones running Android are increasingly targeted by attackers and infected with malicious software. In contrast to other platforms, Android allows for installing applications from unverified sources, such as third-party markets, which makes bundling and distributing applications with malware easy for attackers. According to a recent study [16], over 55,000 malicious applications and 119 new malware families have been discovered in 2012 alone. It is evident that there is a need for stopping the proliferation of malware on Android markets and smartphones.

The Android platform provides several security measures that harden the installation of malware, most notably the Android permission system. To perform certain tasks on the device, such as sending an SMS message, each application has to explicitly request permission from the user during the installation. However, many users tend to blindly grant permissions to unknown applications and thereby undermine the purpose of the permission system. As a consequence, malicious applications are hardly constrained by the Android permission system in practice.

A large body of research has thus studied methods for analyzing and detecting Android malware prior to its installation. These methods can be roughly categorized into approaches using static and dynamic analysis. For example, TaintDroid [9], DroidRanger [31] and DroidScope [29] are methods that can monitor the behavior of applications at run-time. Although very effective in identifying malicious activity, run-time monitoring suffers from a significant overhead and cannot be directly applied on mobile devices. By contrast, static analysis methods, such as Kirin [11], Stowaway [13], RiskRanker [18] or regular anti-virus scanners, usually induce only a short run-time burst. While these approaches are efficient and scalable, they mainly build on manually crafted detection patterns which are often not available for new malware instances. Moreover, most of these methods do not provide explanations for their decisions and are thus opaque to the practitioner.

In this paper, we propose DREBIN, a lightweight method for detection of Android malware that infers detection patterns automatically and enables identifying malware directly on the smartphone. DREBIN performs a broad static analysis, gathering as many features from an application's code and manifest as possible. These features are organized in sets of strings (such as permissions, intents, API calls and network addresses) and embedded in a joint vector space. As an example, an application sending premium SMS messages is cast to a specific region in the vector space associated with the corresponding permissions, intents and API calls. This geometric representation enables DREBIN to identify combinations and patterns of features indicative of malware automatically using machine learning techniques. For each detected application the respective patterns can be extracted, mapped to meaningful descriptions and then provided to the user as an explanation for the detection. Aside from detection, DREBIN can thus also provide insights into identified malware samples.

[Figure 1. Schematic depiction of the analysis steps performed by DREBIN: an Android app (apk) undergoes (a) a broad static analysis extracting feature sets (used permissions, suspicious API calls, network addresses, ...), (b) embedding in a vector space, (c) learning-based detection with a linear model separating malicious (+) from benign (-) applications, and (d) explanation.]

Experiments with 123,453 applications from different markets and 5,560 recent malware samples demonstrate the efficacy of our method: DREBIN outperforms related approaches [11, 22, 26] as well as 9 out of 10 popular anti-virus scanners. The method detects 94% of the malware samples with a false-positive rate of 1%, corresponding to one false alarm in 100 installed applications. On average, the analysis of an application requires less than a second on a regular computer and 10 seconds on popular smartphone models. To the best of our knowledge, DREBIN is the first method which provides effective and explainable detection of Android malware directly on smartphone devices.

In summary, we make the following contributions to the detection of Android malware in this paper:

• Effective detection. We introduce a method combining static analysis and machine learning that is capable of identifying Android malware with high accuracy and few false alarms, independent of manually crafted detection patterns.

• Explainable results. The proposed method provides an explainable detection. Patterns of features indicative of a detected malware instance can be traced back from the vector space and provide insights into the detection process.

• Lightweight analysis. For efficiency we apply linear-time analysis and learning techniques that enable detecting malware directly on the smartphone as well as analyzing large sets of applications in reasonable time.

We need to note here that DREBIN builds on concepts of static analysis and thus cannot rule out the presence of obfuscated or dynamically loaded malware on mobile devices. We specifically discuss this limitation of our approach in Section 4. Due to the broad analysis of features, however, our method raises the bar for attackers to infect smartphones with malicious applications and strengthens the security of the Android platform, as demonstrated in our evaluation.

The rest of this paper is organized as follows: DREBIN and its detection methodology are introduced in Section 2. Experiments and a comparison with related approaches are presented in Section 3. Limitations and related work are discussed in Section 4 and Section 5, respectively. Section 6 concludes.

2 Methodology

To detect malicious software on a smartphone, DREBIN requires a comprehensive yet lightweight representation of applications that enables determining typical indications of malicious activity. To this end, the method employs a broad static analysis that extracts feature sets from different sources and analyzes these in an expressive vector space. This process is illustrated in Figure 1 and outlined in the following:

1. Static analysis. In the first step, DREBIN statically inspects a given Android application and extracts different feature sets from the application's manifest and dex code (Section 2.1).

2. Embedding in vector space. The extracted feature sets are then mapped to a joint vector space, where patterns and combinations of the features can be analyzed geometrically (Section 2.2).

3. Learning-based detection. The embedding of applications enables us to identify malware using efficient learning techniques, such as linear Support Vector Machines (Section 2.3).

4. Explanation. In the last step, features contributing to the detection of a malicious application are extracted and presented to the user for explaining the detection process (Section 2.4).

In the following sections, we discuss these four steps in more detail and provide necessary technical background of the analysis.

2.1 Static Analysis of Applications

As the first step, DREBIN performs a lightweight static analysis of a given Android application. Although apparently straightforward, the static extraction of features needs to run in a constrained environment and complete in a timely manner. If the analysis takes too long, the user might abort the ongoing process and reject the method altogether. Accordingly, it becomes essential to select features which can be extracted efficiently.

We thus focus on the manifest and the disassembled dex code of the application, both of which can be obtained by a linear sweep over the application's content. To allow for a generic and extensible analysis, we represent all extracted features as sets of strings, such as permissions, API calls and network addresses. In particular, we extract the following 8 sets of strings.

Feature sets from the manifest. Every application developed for Android must include a manifest file called AndroidManifest.xml which provides data supporting the installation and later execution of the application. The information stored in this file can be efficiently retrieved on the device using the Android Asset Packaging Tool, which enables us to extract the following sets:

S1 Hardware components: This first feature set contains requested hardware components. If an application requests access to the camera, touchscreen or the GPS module of the smartphone, these features need to be declared in the manifest file. Requesting access to specific hardware clearly has security implications, as the use of certain combinations of hardware often reflects harmful behavior. An application which has access to GPS and network modules is, for instance, able to collect location data and exfiltrate it to an attacker.

S2 Requested permissions: One of the most important security mechanisms introduced in Android is the permission system. Permissions are actively granted by the user at installation time and allow an application to access security-relevant resources. As shown by previous work, malicious software tends to request certain permissions more often than innocuous applications. For example, a great percentage of current malware sends premium SMS messages and thus requests the SEND_SMS permission. We thus gather all permissions listed in the manifest in a feature set.

S3 App components: There exist four different types of components in an application, each defining different interfaces to the system: activities, services, content providers and broadcast receivers. Every application can declare several components of each type in the manifest. The names of these components are also collected in a feature set, as the names may help to identify well-known components of malware. For example, several variants of the malware family DroidKungFu share the name of particular services [see 21].

S4 Filtered intents: Inter-process and intra-process communication on Android is mainly performed through intents: passive data structures exchanged as asynchronous messages that allow information about events to be shared between different components and applications. We collect all intents listed in the manifest as another feature set, as malware often listens for specific intents. A typical example of an intent message involved in malware is BOOT_COMPLETED, which is used to trigger malicious activity directly after rebooting the smartphone.
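As a rough illustration of this extraction step, the following Python sketch parses the textual output of the Android Asset Packaging Tool for requested permissions (set S2). The exact aapt output format varies across versions, so the line pattern assumed here is illustrative rather than authoritative:

```python
import re

# Hypothetical helper: parse textual aapt output (e.g. from
# `aapt dump permissions app.apk`), assuming lines of the form
# "uses-permission: android.permission.SEND_SMS" or the newer
# "uses-permission: name='android.permission.SEND_SMS'".
def parse_requested_permissions(aapt_output: str) -> set:
    permissions = set()
    for line in aapt_output.splitlines():
        match = re.match(r"uses-permission(?:-sdk-23)?:\s*(?:name=')?([\w.]+)'?",
                         line.strip())
        if match:
            permissions.add(match.group(1))
    return permissions

sample = """\
package: com.example.app
uses-permission: android.permission.SEND_SMS
uses-permission: android.permission.INTERNET
"""
print(parse_requested_permissions(sample))
```

Each extracted string then becomes one element of the feature set S2 for the analyzed application.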

Feature sets from disassembled code. Android applications are developed in Java and compiled into optimized bytecode for the Dalvik virtual machine. This bytecode can be efficiently disassembled and provides DREBIN with information about API calls and data used in an application. To achieve a low run-time, we implement a lightweight disassembler based on the dex libraries of the Android platform that can output all API calls and strings contained in an application. We use this information to construct the following feature sets.

S5 Restricted API calls: The Android permission system restricts access to a series of critical API calls. Our method searches for the occurrence of these calls in the disassembled code in order to gain a deeper understanding of the functionality of an application. A particular case, revealing malicious behavior, is the use of restricted API calls for which the required permissions have not been requested. This may indicate that the malware is using root exploits in order to bypass the limitations imposed by the Android platform.

S6 Used permissions: The complete set of calls extracted in S5 is used as the basis for determining the subset of permissions that are both requested and actually used. For this purpose, we implement the method introduced by Felt et al. [13] to match API calls and permissions. In contrast to S5, this feature set provides a more general view of the behavior of an application, as multiple API calls can be protected by a single permission (e.g., sendMultipartTextMessage() and sendTextMessage() both require that the SEND_SMS permission is granted to an application).

S7 Suspicious API calls: Certain API calls allow access to sensitive data or resources of the smartphone and are frequently found in malware samples. As these calls can especially lead to malicious behavior, they are extracted and gathered in a separate feature set. In particular, we collect the following types of API calls:

(a) API calls for accessing sensitive data, such as getDeviceId() and getSubscriberId();

(b) API calls for communication over the network, for example execHttpRequest() and setWifiEnabled();

(c) API calls for sending and receiving SMS messages, such as sendTextMessage();

(d) API calls for execution of external commands, like Runtime.exec();

(e) API calls frequently used for obfuscation, such as Cipher.getInstance().

S8 Network addresses: Malware regularly establishes network connections to retrieve commands or exfiltrate data collected from the device. Therefore, all IP addresses, hostnames and URLs found in the disassembled code are included in the last set of features. Some of these addresses might be involved in botnets and thus present in several malware samples, which can help to improve the learning of detection patterns.
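A minimal sketch of how network addresses (S8) could be pulled from the string constants recovered from disassembled code; the regular expressions below are simplified for illustration and are not the ones used by DREBIN:

```python
import re

# Illustrative patterns for URLs and IPv4 addresses found in string
# constants of the disassembled dex code (feature set S8).
URL_RE = re.compile(r"https?://[^\s\"']+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_network_features(strings):
    features = set()
    for s in strings:
        features.update(URL_RE.findall(s))   # URLs and hostnames in URLs
        features.update(IPV4_RE.findall(s))  # bare IPv4 addresses
    return features

strings = ["http://evil.example.com/cmd", "user agent", "connect to 10.0.0.1:8080"]
print(extract_network_features(strings))
```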

2.2 Embedding in Vector Space

Malicious activity is usually reflected in specific patterns and combinations of the extracted features. For example, malware sending premium SMS messages might contain the permission SEND_SMS in set S2 and the hardware component android.hardware.telephony in set S1. Ideally, we would like to formulate Boolean expressions that capture these dependencies between features and return true if malware is detected. However, inferring Boolean expressions from real-world data is a hard problem and difficult to solve efficiently.

As a remedy, we aim at capturing the dependencies between features using concepts from machine learning. As most learning methods operate on numerical vectors, we first need to map the extracted feature sets to a vector space. To this end, we define a joint set S that comprises all observable strings contained in the 8 feature sets:

S := S1 ∪ S2 ∪ · · · ∪ S8.

We ensure that elements of different sets do not collide by adding a unique prefix to all strings in each feature set. In our evaluation, the set S contains roughly 545,000 different features (see Section 3).
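This prefixing scheme can be sketched as follows; the `SetName::` prefix format is made up for illustration:

```python
# Tag every string with the name of the feature set it came from, so
# identical strings in different sets map to different dimensions of S.
def build_joint_feature_set(feature_sets):
    joint = set()
    for set_name, strings in feature_sets.items():
        joint.update(f"{set_name}::{s}" for s in strings)
    return joint

feature_sets = {
    "S2_requested_permissions": {"SEND_SMS"},
    "S6_used_permissions": {"SEND_SMS"},  # same string, different set
}
S = build_joint_feature_set(feature_sets)
print(len(S))  # 2, not 1: the prefix keeps the two occurrences apart
```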

Using the set S, we define an |S|-dimensional vector space, where each dimension is either 0 or 1. An application x is mapped to this space by constructing a vector ϕ(x), such that for each feature s extracted from x the respective dimension is set to 1 and all other dimensions are 0. Formally, this map ϕ can be defined for a set of applications X as follows:

    ϕ : X → {0, 1}^|S|,   ϕ(x) ↦ ( I(x, s) )_{s ∈ S}

where the indicator function I(x, s) is simply defined as

    I(x, s) = 1  if the application x contains feature s,
              0  otherwise.

Applications sharing similar features lie close to each other in this representation, whereas applications with mainly different features are separated by large distances. Moreover, directions in this space can be used to describe combinations of features and ultimately enable us to learn explainable detection models.

Let us, as an example, consider a malicious application that sends premium SMS messages and thus needs to request certain permissions and hardware components. A corresponding vector ϕ(x) for this application looks like this:

    ϕ(x) ↦ ( ···
             0   android.hardware.wifi        }
             1   android.hardware.telephony   } S1
             ···
             1   SEND_SMS                     }
             0   DELETE_PACKAGES              } S2
             ··· )

At first glance, the map ϕ seems inappropriate for the lightweight analysis of applications, as it embeds data into a high-dimensional vector space. Fortunately, the number of features extracted from an application is linear in its size. That is, an application x containing m bytes of code and data contains at most m feature strings. As a consequence, only m dimensions are non-zero in the vector ϕ(x), irrespective of the dimension of the vector space. It thus suffices to only store the features extracted from an application for sparsely representing the vector ϕ(x), for example, using hash tables [5] or Bloom filters [3].
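A sparse representation of ϕ(x) can be as simple as storing the set of extracted feature strings, with the indicator I(x, s) reduced to a membership test. This sketch illustrates the idea and is not DREBIN's actual implementation:

```python
# Sparse sketch of the embedding: instead of materializing the
# |S|-dimensional binary vector, an application is represented only by
# the set of feature strings extracted from it. The indicator I(x, s)
# is then a membership test, and memory usage is independent of |S|.
def indicator(app_features, s):
    return 1 if s in app_features else 0

app_features = {"S1::android.hardware.telephony", "S2::SEND_SMS"}
print(indicator(app_features, "S2::SEND_SMS"))         # 1
print(indicator(app_features, "S2::DELETE_PACKAGES"))  # 0
```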

2.3 Learning-based Detection

In the third step, we apply machine learning techniques for automatically learning a separation between malicious and benign applications. The application of machine learning spares us from manually constructing detection rules for the extracted features.

While several learning methods can be applied to learn a separation between two classes, only a few methods are capable of producing an efficient and explainable detection model. We consider linear Support Vector Machines [SVM; 6, 12] for this task. Given vectors of two classes as training data, a linear SVM determines a hyperplane that separates both classes with maximal margin. In our setting, one of these classes is associated with malware, whereas the other class corresponds to benign applications. An unknown application is classified by mapping it to the vector space and checking whether it falls on the malicious (+) or benign (-) side of the hyperplane.

[Figure 2. A separating hyperplane with maximum margin (SVM): the vector w defines the hyperplane separating malicious applications from benign applications; an application x is classified via the projection ⟨ϕ(x), w⟩.]

Formally, the detection model of a linear SVM simply corresponds to a vector w specifying the direction of the hyperplane, where the corresponding detection function f is given by

    f(x) = ⟨ϕ(x), w⟩ = Σ_{s ∈ S} I(x, s) · w_s

and returns the orientation of ϕ(x) with respect to the hyperplane. That is, f(x) > t indicates malicious activity, while f(x) ≤ t corresponds to benign applications for a given threshold t.

To compute the function f efficiently, we again exploit the sparse representation of the map ϕ. Given an application x, we know that only features extracted from x have non-zero entries in ϕ(x). All other dimensions are zero and do not contribute to the computation of f(x). Hence, we can simplify the detection function f as follows:

    f(x) = Σ_{s ∈ S} I(x, s) · w_s = Σ_{s ∈ x} w_s.

Instead of an involved learning model, we finally arrive at a simple sum that can be efficiently computed by just adding the weight w_s for each feature s in an application x. This formulation enables us to apply a learned detection model on a smartphone and also allows us to explain results obtained by the SVM.
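The simplified sum can be sketched in a few lines of Python; the feature weights and the threshold below are invented for illustration, not learned values:

```python
# Sketch of the simplified detection function f(x) = sum of w_s over the
# features s present in x: the model is just a feature-to-weight mapping,
# and scoring only touches the features the application contains.
def score(app_features, weights):
    return sum(weights.get(s, 0.0) for s in app_features)

# Hypothetical weights and threshold, for illustration only.
weights = {"S2::SEND_SMS": 1.5, "S7::sendTextMessage()": 1.2, "S2::INTERNET": 0.1}
t = 1.0

app = {"S2::SEND_SMS", "S2::INTERNET"}
f_x = score(app, weights)
print(f_x, f_x > t)  # 1.6 True -> flagged as malicious
```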

Offline learning. In our implementation, we do not learn a detection model on the smartphone. Instead, we train the Support Vector Machine offline on a dedicated system and only transfer the learned model w to the smartphone for detecting malicious applications.

2.4 Explanation

In practice, a detection system must not only indicate malicious activity, but also provide explanations for its detection results. It is a common shortcoming of learning-based approaches that they are black-box methods [27]. In the case of DREBIN, we address this problem and extend our learning-based detection, such that it can identify features of an application that contribute to a detection. Moreover, an explainable detection may also help researchers to inspect patterns in malware and gain a deeper understanding of its functionality.

By virtue of the simple detection function of the linear SVM, we are able to determine the contribution of each single feature s to the function f(x). During the computation of f(x), we just need to store the k largest weights w_s shifting the application to the malicious side of the hyperplane. Since each weight w_s is assigned to a certain feature s, it is then possible to explain why an application has been classified as malicious or not. This approach can be efficiently realized by maintaining the k largest weights w_s in a heap during the computation of the function f(x) [5].
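Extracting the k most incriminating features can be sketched with Python's heapq module; the weights are again invented for illustration:

```python
import heapq

# Sketch of the explanation step: among the features present in an
# application, keep the k with the largest weights w_s.
# heapq.nlargest maintains a bounded heap internally, matching the
# heap-based realization described above.
def top_k_features(app_features, weights, k=3):
    present = [(weights.get(s, 0.0), s) for s in app_features]
    return heapq.nlargest(k, present)

# Hypothetical weights, for illustration only.
weights = {"S2::SEND_SMS": 1.5, "S8::evil.example.com": 1.1, "S2::INTERNET": 0.1}
app = {"S2::SEND_SMS", "S8::evil.example.com", "S2::INTERNET"}
print(top_k_features(app, weights, k=2))
```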

After extracting the top k features by their weights, DREBIN automatically constructs sentences which describe the functionality underlying these features. To achieve this goal, we design sentence templates for each feature set which can be completed using the respective feature. Table 1 lists these templates. For features frequently observed in malware, such as the permission SEND_SMS, we provide individual descriptions.

    Feature set                Explanation
    S1  Hardware features      App uses %s feature %s.
    S2  Requested permissions  App requests permission to access %s.
    S3  App components         App contains suspicious component %s.
    S4  Filtered intents       Action is triggered by %s.
    S5  Restricted API calls   App calls function %s to access %s.
    S6  Used permissions       App uses permissions %s to access %s.
    S7  Suspicious API calls   App uses suspicious API call %s.
    S8  Network addresses      Communication with host %s.

Table 1. Templates for explanation.

For most of the feature sets, the construction of sentences from the templates in Table 1 is straightforward. For example, for the hardware features we make use of their naming scheme to construct meaningful sentences. If an application, for instance, uses the android.hardware.camera feature, DREBIN presents the sentence "App uses hardware feature camera." to the user.

Similarly, we provide explanations for requested and used permissions. The explanation for a permission can be derived from the Android documentation, which provides proper descriptions, at least for all system permissions. We slightly modify these descriptions in order to present meaningful explanations to the user. However, since application developers are able to define custom permissions, we also provide a generic sentence which is presented to the user if no proper description exists. We follow the same approach for the restricted API calls that build on the use of certain permissions. For all other feature sets, the templates are directly filled with either the feature's name or a corresponding placeholder.
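The template mechanism can be sketched as follows; the templates are taken from Table 1, while the individual description for SEND_SMS is paraphrased for illustration and is not DREBIN's actual wording:

```python
# A few of the sentence templates from Table 1, keyed by feature set.
TEMPLATES = {
    "S2": "App requests permission to access %s.",
    "S7": "App uses suspicious API call %s.",
    "S8": "Communication with host %s.",
}

# Hand-written descriptions for frequently observed features
# (wording invented for illustration).
SPECIAL_CASES = {
    ("S2", "SEND_SMS"): "App is able to send SMS messages, potentially at premium rates.",
}

def explain(feature_set, feature):
    # Prefer an individual description; otherwise fill the generic template.
    if (feature_set, feature) in SPECIAL_CASES:
        return SPECIAL_CASES[(feature_set, feature)]
    return TEMPLATES[feature_set] % feature

print(explain("S8", "evil.example.com"))
print(explain("S2", "SEND_SMS"))
```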

An example of an explanation generated by DREBIN is shown in Figure 3. The presented sample belongs to the GoldDream family. DREBIN correctly identifies that the malware communicates with an external server and sends SMS messages. The application requests 16 permissions during the installation process. Many users ignore such long lists of permissions and thereby fall victim to this type of malware. In contrast to the conventional permission-based approach, DREBIN draws the user's attention directly to relevant aspects indicating malicious activity. Furthermore, DREBIN presents a score to the user which tells him how confident the decision is. As a result, the user is able to decide whether the presented functionality matches his expectation or not.

In addition to the benefit for the user, the generated explanations can also help researchers to discover relevant patterns in common malware families. We discuss this aspect in more detail in the following section.

3 Evaluation

After presenting DREBIN in detail, we now proceed to an empirical evaluation of its efficacy. In particular, we conduct the following three experiments:

1. Detection performance. First, we evaluate the detection performance of DREBIN on a dataset of 5,560 malware samples and 123,453 benign applications. We compare its performance against related approaches and anti-virus scanners (Section 3.2).

2. Explainability. In the second experiment, we analyze the explanations provided by DREBIN in detail for different malware families and verify whether they relate to actual characteristics of the malware (Section 3.3).

3. Run-time performance. Finally, we evaluate the run-time performance of DREBIN. For this experiment, we conduct different measurements using five common smartphone models as well as a regular desktop computer (Section 3.4).

[Figure 3. Example of an explanation.]

3.1 Data sets

For all experiments, we consider a dataset of real Android applications and real malware. In particular, we have acquired an initial dataset of 131,611 applications comprising benign as well as malicious software. The samples have been collected in the period from August 2010 to October 2012. In detail, the dataset contains 96,150 applications from the Google Play Store, 19,545 applications from different alternative Chinese markets, 2,810 applications from alternative Russian markets and 13,106 samples from other sources, such as Android websites, malware forums and security blogs. Additionally, the dataset includes all samples from the Malgenome project [30].

To determine malicious and benign applications, we send each sample to the VirusTotal service and inspect the output of ten common anti-virus scanners (AntiVir, AVG, BitDefender, ClamAV, ESET, F-Secure, Kaspersky, McAfee, Panda, Sophos). We flag all applications as malicious that are detected by at least two of the scanners. This procedure ensures that our data is correctly split into benign and malicious samples even if one of the ten scanners falsely labels a benign application as malicious.
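The labeling rule can be stated in a few lines; the verdict lists below are invented examples:

```python
# Labeling rule: a sample is flagged as malicious if at least two of the
# ten anti-virus scanners queried via VirusTotal detect it.
def label(scanner_verdicts, min_detections=2):
    return "malicious" if sum(scanner_verdicts) >= min_detections else "benign"

# Invented verdicts of the ten scanners (True = detected).
print(label([True, True] + [False] * 8))  # malicious: two scanners agree
print(label([True] + [False] * 9))        # benign: a single detection is discarded
```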

Finally, we remove samples labeled as adware from ourdataset, as this type of software is in a twilight zone between

6

0.00 0.02 0.04 0.06 0.08 0.10False-positive rate

0.0

0.2

0.4

0.6

0.8

1.0

Tru

e-p

osi

tive r

ate

DREBINPeng et al.RCPKIRIN

(a) Detection performance as ROC curve.

A B C D E F G H I J K L M N O P Q R S TMalware families

0

20

40

60

80

100

Dete

ctio

n r

ate

Average detection rate (FP=0.01)Detection rate (FP=0.01)

(b) Detection per malware family.

Id Family # Id Family #

A FakeInstaller 925 K Adrd 91B DroidKungFu 667 L DroidDream 81C Plankton 625 M LinuxLotoor 70D Opfake 613 N GoldDream 69E GingerMaster 339 O MobileTx 69F BaseBridge 330 P FakeRun 61G Iconosys 152 Q SendPay 59H Kmin 147 R Gappusin 58I FakeDoc 132 S Imlog 43J Geinimi 92 T SMSreg 41

Table 1: Top 20 malware families in our dataset and thenumber of their occurrences (#). The family names arederived from the labels of the Kaspersky AV scanner.

3.1 Data sets

For all experiments, we consider a dataset of real An-droid applications and real malware. In particular, wehave acquired an initial dataset of 131,398 applicationscomprising benign as well as malicious software. Thesamples have been collected in the period from August2010 to October 2012 from several sources including theGoogle Play Store, Asian third-party markets and mal-ware forums. The dataset also includes all samples fromthe Malgenome project [? ].

To determine malicious and benign applications, wesend each sample to the VirusTotal service and inspectthe output of ten common anti-virus scanners (AntiVir,AVG, BitDefender, ClamAV, ESET, F-Secure, Kasper-sky, McAfee, Panda, Sophos). We flag all applicationsas malicious that are at least detected by one of the scan-ners. This procedure ensures that our data is correctlysplit into benign and malicious samples—leaving aside asmall fraction of applications that might be missed by allten scanners.

Finally, we remove samples labeled as adware fromour dataset, as this type of software is in a twilight zonebetween malware and benign functionality. The finaldataset contains 122,629 benign application and 6,526malware samples. To the best of our knowledge, this isone of the largest malware datasets that has been used toevaluate a malware detection method on Android.


Figure 4. Detection performance of DREBIN and related approaches: (a) ROC curves of the detection methods, (b) detection rate per malware family, (c) top malware families in our dataset.

malware and benign functionality. The final dataset contains 123,453 benign applications and 5,560 malware samples. To the best of our knowledge, this is one of the largest malware datasets that has been used to evaluate a malware detection method on Android.

An overview of the top 20 malware families in our dataset is provided in Table 4(c), including several families that are currently actively distributed in application markets. Note that only the top 20 families are shown; our dataset contains 1,048 further malicious samples.

3.2 Detection Performance

In our first experiment, we evaluate the detection performance of DREBIN and related static detection approaches. For this experiment, we randomly split the dataset into a known partition (66%) and an unknown partition (33%). The detection model and respective parameters of DREBIN are determined on the known partition, whereas the unknown partition is only used for measuring the final detection performance. We repeat this procedure 10 times and average the results. The partitioning ensures that the reported results only refer to malicious applications unknown during the learning phase of DREBIN. For the related approaches, such as Kirin [11] and RCP [26], this experimental procedure differs slightly, since not all methods require a separate training step.
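The split-and-average protocol above can be sketched as follows. This is a minimal illustration with invented names; the actual DREBIN implementation learns a linear SVM, which is not shown here:

```python
import random

def average_detection(samples, train_and_score, runs=10,
                      known_frac=0.66, seed=0):
    """Repeat a random known/unknown split and average the score.

    `samples` is a list of (features, label) pairs and
    `train_and_score(known, unknown)` is any routine that learns a
    model on `known` and returns a detection rate on `unknown`.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        data = samples[:]
        rng.shuffle(data)
        cut = int(len(data) * known_frac)
        known, unknown = data[:cut], data[cut:]
        scores.append(train_and_score(known, unknown))
    return sum(scores) / len(scores)
```

Because the unknown partition never influences training, every reported detection refers to malware unseen during the learning phase.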

Comparison with related approaches. We first compare the performance of DREBIN against related static approaches for the detection of Android malware. In particular, we consider the methods Kirin [11], RCP [26] and the approach by Peng et al. [22], where we implement the latter using an SVM instead of a Naive Bayes classifier. The results of this experiment are shown in Figure 4(a) as a ROC curve, that is, the detection rate of malware (true-positive rate) is plotted against the number of false alarms (false-positive rate) for different thresholds of the detection methods.

DREBIN significantly outperforms the other approachesand detects 94% of the malware samples at a false-positiverate of 1%, corresponding to one false alarm when installing100 applications. The other approaches provide a detectionrate between 10%–50% at this false-positive rate. As Kirinand RCP both consider only a subset of the requested per-missions, they have obvious limitations in detecting mali-cious applications. Even the method by Peng et al. whichconsiders all permissions is unable to detect malware withsufficient accuracy in this experiment. The superior perfor-mance of DREBIN results from the different feature sets thatare used to model malicious activity. These sets include therequested permissions but also contain other relevant char-acteristics of applications, such as suspicious API calls, fil-tered intents and network addresses.
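A ROC point like the one reported above can be computed by fixing the false-positive budget and reading off the true-positive rate. The following helper is our own sketch, not the paper's code:

```python
def detection_rate_at_fpr(scores, labels, target_fpr=0.01):
    """True-positive rate at the largest threshold that keeps the
    false-positive rate at or below `target_fpr` (labels: 1 = malware,
    0 = benign; higher score = more suspicious)."""
    benign = sorted((s for s, y in zip(scores, labels) if y == 0),
                    reverse=True)
    allowed = int(len(benign) * target_fpr)   # false alarms we tolerate
    # everything strictly above this threshold is flagged as malware
    threshold = benign[allowed] if allowed < len(benign) else float("-inf")
    flagged = sum(1 for s, y in zip(scores, labels)
                  if y == 1 and s > threshold)
    return flagged / sum(labels)
```

Sweeping `target_fpr` over [0, 1] traces out the full ROC curve shown in Figure 4(a).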

Comparison with AV scanners. Although DREBIN shows a much better performance than the related approaches, in the end it has to compete with common anti-virus products in practice. Consequently, we also compare it against ten anti-virus scanners on our dataset. The detection performance of each scanner is again taken from the VirusTotal service. We run two experiments where we first consider all malware samples of our dataset and then only those samples provided by the Malgenome project [30]. We choose a false-positive rate of 1% for DREBIN, which we think is sufficiently low for practical operation.

The results of the experiments are shown in Table 2. The detection rate of the anti-virus scanners varies considerably. While the best scanners detect over 90% of the malware, some scanners discover less than 10% of the malicious samples, likely because they are not specialized in detecting Android malware. On the full dataset DREBIN provides the second best performance with a detection rate of 93.9% and outperforms 9 of the 10 commercial products. This observation is remarkable since, due to our test setting, at least two scanners should be able to detect each malware sample. Each sample has therefore been known for a certain amount of time and most anti-virus scanners should be equipped with a corresponding signature. Nevertheless, the automatically generated detection model of DREBIN proves to be more effective than the manually crafted signatures of many scanners. On the Malgenome dataset the anti-virus scanners achieve better detection rates, since these samples have been known for a longer period of time; hence, almost all anti-virus scanners provide proper signatures for this dataset.

               DREBIN   AV1     AV2     AV3     AV4     AV5     AV6     AV7     AV8     AV9     AV10
Full dataset   93.90%  96.41%  93.71%  84.66%  84.54%  78.38%  64.16%  48.50%  48.34%   9.84%   3.99%
Malgenome      95.90%  98.63%  98.90%  98.28%  98.07%  98.66%  96.49%  94.67%  84.23%  23.68%   1.12%

Table 2. Detection rates of DREBIN and anti-virus scanners.

The false-positive rates of the anti-virus scanners range from 0% to 0.3% on our dataset of benign applications and thus are slightly below DREBIN's performance. Despite the vast number of available Android applications, the average user only installs a few dozen applications on their device. For example, according to Nielsen¹, a market research company, the average number of installed applications per smartphone in the U.S. was 32 in 2011 and 41 in 2012. Consequently, we consider a false-positive rate of 1% acceptable for operating DREBIN in practice.

Detection of malware families. Another important aspect that should be considered when testing the detection performance of a method is the balance of malware families in the dataset [25]. If the number of samples of certain malware families is much larger than that of other families, the detection result mainly depends on these families. To address this problem one could use the same number of samples for each family. However, this leads to a distribution that significantly differs from reality. Instead, we evaluate the detection performance for each of the 20 largest malware families separately. The family names and the number of samples for each family can be found in Table 4(c), and the detection performance of DREBIN for each family is illustrated in Figure 4(b).

DREBIN is able to reliably detect all families with an average accuracy of 93% at a false-positive rate of 1%. In particular, all families show a detection rate of more than 90%, where three of them can be identified perfectly (H, O, P). There is only one malware family which cannot be reliably detected by DREBIN. This family is Gappusin (R) and we examine its low performance in the next section. It should be pointed out that there seems to be no dependency between the size of a malware family and its detection rate, as long as the number of samples is sufficiently high and allows the SVM to generalize its features.

¹Nielsen Report: "State of the Appnation – A Year of Change and Growth in U.S. Smartphones"

Detection of unknown malware families. DREBIN uses known malware for learning its detection model. It is thus important to assess how many samples of a family need to be known to reliably detect this family. To study this issue, we conduct two additional experiments where we limit the number of samples for a particular family in the training set. In the first experiment we provide no samples of the family, corresponding to a totally unknown malware strain. In the second experiment, we put 10 randomly chosen samples of the family back into the training set, thus simulating the early spread of a new family.

The results of the two experiments are shown in Figure 5, where the detection rate is shown for 0 and 10 available samples in the training set for each family. If no samples are available for learning, it is difficult for DREBIN to detect a family, as no discriminative patterns can be discovered by the SVM. However, only very few samples are necessary to generalize the behavior of most malware families. With only 10 samples in the training set, the average detection performance increases by more than 25 percent. Three families can even be detected perfectly in this setting. The reason for this is that the members of a family are often just repackaged applications with slight modifications. Due to the generalization performed by the SVM, it is therefore possible to detect variations of a family even though only a very small set of samples is known.
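The sampling used in these two experiments can be expressed as a small helper. The naming is ours and assumes each training sample carries a family label:

```python
import random

def limit_family(training_set, family, k, seed=0):
    """Return a copy of the training set with at most k samples of the
    given family, simulating an (almost) unknown strain: k=0 removes
    the family entirely, k=10 mimics the early spread of a new family."""
    rng = random.Random(seed)
    members = [s for s in training_set if s["family"] == family]
    others = [s for s in training_set if s["family"] != family]
    rng.shuffle(members)
    return others + members[:k]
```

Training a fresh model on each limited set and scoring the held-out family yields the per-family detection rates plotted in Figure 5.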

Figure 5. Detection of unknown families: detection rate (%) of DREBIN for each of the malware families A–T, with 0 samples and with 10 samples of the family available in the training set.

In summary, DREBIN provides an effective detection of Android malware and outperforms related detection approaches as well as several anti-virus scanners. While DREBIN cannot spot unknown malware from the very start, only a few samples of each family are required to achieve a reliable detection.

3.3 Explanations

Apart from its detection performance, a strength of DREBIN lies in its ability to explain the obtained results. This allows us to check whether the extracted features which contribute to a detection match common malware characteristics. In this section we first take a look at four popular malware families and analyze how features with high weights allow conclusions to be drawn about their behavior. We then inspect the false positives and false negatives of DREBIN in detail.

Explanations for malware families. To study the explanations provided by DREBIN we consider four well-known malware families, namely FakeInstaller, GoldDream [20], GingerMaster [19] and DroidKungFu [21]. For each sample of these families we determine the features with the highest contribution to the classification decision and average the results over all members of a family. The resulting top five features for each malware family are shown in Table 3. For clarity we present the exact features rather than the explaining sentences introduced in Section 2.4.

• FakeInstaller is currently the most widespread malware. The members of this family hide their malicious code inside repackaged versions of popular applications. During the installation process the malware sends expensive SMS messages to premium services owned by the malware authors. Even at first sight, three of the extracted features indicate that the malware uses SMS functionality, where android.hardware.telephony is implicitly added to the manifest file as soon as an application requests permissions to use SMS functionality.

• DroidKungFu tries to exploit several vulnerabilities in earlier Android versions to gain root access and steal sensitive data from the device. Its intention to gain root access is reflected by the feature system/bin/su. The invocation of getSubscriberId() indicates that the malware tries to access sensitive data. The two intents BATTERY_CHANGED_ACTION and SIG_STR are filtered by a broadcast receiver component which is part of many DroidKungFu samples. Both intents are used to trigger malicious functionality when the application is running in the background.

• GoldDream is a Trojan which monitors an infected device, collects sensitive data and records information from received SMS messages. The feature Telephony.SMS_RECEIVED directly points to the reading of SMS messages. After the malware has collected sufficient data, it sends the data to an external server, whose hostname is ranked second in the feature list. Furthermore, the malware is able to install and delete packages as well as to send SMS messages, which is also reflected in the extracted features.

• GingerMaster is also a Trojan application which is often bundled with benign applications and tries to gain root access, steal sensitive data and send it to a remote server. Similar to the DroidKungFu family, the malware starts its malicious service as soon as it receives a BOOT_COMPLETED or USER_PRESENT intent. Again, a significant part of this behavior can be reconstructed just by looking at the top features.
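Because DREBIN's decision function is linear, rankings like those in Table 3 follow directly from the per-feature contributions w_s · x_s; with binary features this reduces to sorting the weights of the features present in an app. A sketch, where the example weights are taken from Table 3 and the helper itself is our illustration:

```python
def explain(weights, app_features, k=5):
    """Return the k features of an app with the highest learned weight,
    i.e. the largest contribution to the classification decision."""
    present = [(weights.get(f, 0.0), f) for f in app_features]
    present.sort(key=lambda t: (-t[0], t[1]))
    return present[:k]

# Weights of FakeInstaller's top features, as reported in Table 3.
weights = {
    "sendSMS": 1.12,
    "SEND_SMS": 0.84,
    "android.hardware.telephony": 0.57,
    "sendTextMessage": 0.52,
    "READ_PHONE_STATE": 0.50,
}
```

Averaging these per-sample rankings over all members of a family yields the family-level explanations discussed above.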

To study the contribution of the different feature sets to the detection of malware in general, we extract the top 5 features for all of the malware families in our dataset. The results are presented in Table 4. Although the requested permissions occur among the top features of all families, it is evident that this feature set alone is not sufficient to ensure a reliable detection. In particular, each feature set occurs at least once in the table, which clearly indicates that all sets are necessary for the detection of Android malware.

False and missing detections. We finally examine benign applications which are wrongly classified as malware by DREBIN. Similar to malicious applications, most of these samples use SMS functionality and access sensitive data, which is reflected in high weights of the corresponding features. Moreover, these samples often show only very little benign behavior and thus trigger false alarms. Fortunately, the ability of DREBIN to output explainable results can help the user to decide whether a suspicious looking functionality is indeed malicious or needed for the intended purpose of the application.

A similar situation occurs when DREBIN classifies samples of the Gappusin family [17] as benign. Although it is in many cases possible to extract features which match the description of the Gappusin family (amongst others the hostname of the external server), there are too few malicious features to identify the samples as malware. Gappusin mainly acts as a downloader for further malicious applications and thus does not exhibit common malicious functionality, such as the theft of sensitive data.

3.4 Run-time Performance

While the computing power of mobile devices is rapidly increasing, it is still limited compared to regular desktop computers. Consequently, a detection method that is supposed to run directly on these devices has to carry out its task very efficiently.


Malware family   Feature s                                  Feature set               Weight w_s

FakeInstaller    sendSMS                                    S7 Suspicious API calls   1.12
                 SEND_SMS                                   S2 Requested permissions  0.84
                 android.hardware.telephony                 S1 Hardware components    0.57
                 sendTextMessage                            S5 Restricted API calls   0.52
                 READ_PHONE_STATE                           S2 Requested permissions  0.50

DroidKungFu      SIG_STR                                    S4 Filtered intents       2.02
                 system/bin/su                              S7 Suspicious API calls   1.30
                 BATTERY_CHANGED_ACTION                     S4 Filtered intents       1.26
                 READ_PHONE_STATE                           S2 Requested permissions  0.54
                 getSubscriberId                            S7 Suspicious API calls   0.49

GoldDream        sendSMS                                    S7 Suspicious API calls   1.07
                 lebar.gicp.net                             S8 Network addresses      0.93
                 DELETE_PACKAGES                            S2 Requested permissions  0.58
                 android.provider.Telephony.SMS_RECEIVED    S4 Filtered intents       0.56
                 getSubscriberId                            S7 Suspicious API calls   0.53

GingerMaster     USER_PRESENT                               S4 Filtered intents       0.67
                 getSubscriberId                            S7 Suspicious API calls   0.64
                 READ_PHONE_STATE                           S2 Requested permissions  0.55
                 system/bin/su                              S7 Suspicious API calls   0.44
                 HttpPost                                   S7 Suspicious API calls   0.38

Table 3. Top features for the malware families FakeInstaller, DroidKungFu, GoldDream and GingerMaster.

Feature set                  Malware families (of the 20 families A–T)

S1 Hardware components       4 families
S2 Requested permissions     all 20 families
S3 App components            7 families
S4 Filtered intents          5 families
S5 Restricted API calls      2 families
S6 Used permissions          1 family
S7 Suspicious API calls      15 families
S8 Network addresses         10 families

Table 4. Contribution of the feature sets to the detection of malware families.


To analyze the run-time of DREBIN we implement a standalone Android application that receives a learned detection model and is able to perform the detection process directly on the smartphone. The size of the downloaded model is only about 280 kbytes. Using this application, we measure the run-time of DREBIN on different devices using 100 randomly selected popular applications from the Google Play Store. For this experiment, we choose devices which cover various widespread hardware configurations, including four smartphones (Nexus 4, Galaxy S3, Xperia Mini Pro and Nexus 3), a tablet (Nexus 7) and a regular desktop computer (PC).
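On the device, classifying an app with a learned linear model amounts to a sparse dot product between the weight vector and the app's binary feature vector, which is why a model of roughly 280 kbytes suffices. A schematic sketch with invented names, not the app's actual code:

```python
def classify(weights, app_features, bias=0.0, threshold=0.0):
    """Score an app by summing the weights of its present features;
    scores above the threshold are reported as malware."""
    score = sum(weights.get(f, 0.0) for f in app_features) + bias
    return score, score > threshold
```

The expensive part of the on-device analysis is thus extracting the features from the manifest and the dex code, not the classification itself.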

Figure 6. Run-time performance of DREBIN: run-time per application (seconds, logarithmic scale) on the desktop PC, Nexus 4, Galaxy S3, Xperia Mini Pro, Nexus 3 and Nexus 7.

The results are presented in Figure 6. On average, DREBIN is able to analyze a given application in 10 seconds on the five smartphones. Even on older models, such as the Xperia Mini Pro, the method is able to analyze an application in less than 20 seconds on average. Overall, no analysis takes longer than 1 minute on any device. On the desktop computer (2.26 GHz Core 2 Duo with 4 GB RAM) DREBIN achieves a remarkable analysis performance of 750 ms per application, which enables scanning 100,000 applications in less than a day.

A detailed run-time analysis for the desktop computer and the Galaxy S3 smartphone is presented in Figure 7, where the run-time per application is plotted against the size of the analyzed code. Surprisingly, on both devices DREBIN attains a sublinear run-time, that is, its performance increases with O(√m) in the number of analyzed bytes m. Apparently, the number of features does not increase linearly with the code and thus larger applications do not necessarily contain more features to analyze.
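The sublinear growth can be checked by fitting the exponent of a power law t ≈ c · m^a on a log-log scale. An ordinary least-squares sketch (our own helper, not the paper's analysis code):

```python
import math

def fit_exponent(sizes, times):
    """Fit log t = a * log m + b and return the exponent a; a value
    around 0.5 corresponds to the observed O(sqrt(m)) behavior."""
    xs = [math.log(m) for m in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
```

Applied to the measurements in Figure 7, such a fit yields the empirical exponents reported there for the two devices.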

From this evaluation, we conclude that DREBIN not only reliably detects malicious applications but is furthermore capable of performing this task in a time which clearly meets practical requirements.

Figure 7. Detailed run-time analysis of DREBIN: run-time per application (seconds) plotted against the size of the analyzed dexcode (MB), with measurements and estimated complexities of O(n^0.45) (S3) and O(n^0.49) (PC).

4 Limitations

The previous evaluation demonstrates the efficacy of our method in detecting recent malware on the Android platform. However, DREBIN cannot generally prevent infections with malicious applications, as it builds on concepts of static analysis and lacks the capabilities of a run-time analysis. Some strains of malware make use of obfuscation or load code dynamically, which hinders any static inspection. To alleviate the absence of a dynamic analysis, DREBIN thus extracts API calls related to obfuscation and the loading of code, such as DexClassLoader.loadClass() and Cipher.getInstance(). These features enable us to at least spot the execution of hidden code, even if we cannot further analyze it. In combination with other features, DREBIN is able to identify malware despite the use of some obfuscation techniques.
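Such indicator features can be located with a simple scan over the disassembled code. The two API calls are the ones named above; the smali-style signatures and the scanning logic are our illustration, not DREBIN's actual extractor:

```python
# Smali-style signatures of APIs associated with dynamically loading
# or decrypting hidden code (illustrative; real extraction works on
# the full disassembled dex code).
DYNAMIC_CODE_APIS = (
    "Ldalvik/system/DexClassLoader;->loadClass",
    "Ljavax/crypto/Cipher;->getInstance",
)

def dynamic_code_indicators(disassembly_lines):
    """Return the set of suspicious API signatures referenced."""
    return {api
            for line in disassembly_lines
            for api in DYNAMIC_CODE_APIS
            if api in line}
```

Even when the dynamically loaded payload itself stays out of reach, the mere presence of these calls enters the feature vector and can tip the classification.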

To avoid crafting detection patterns manually, we make use of machine learning for generating detection models. While learning techniques provide a powerful tool for automatically inferring models, they require a representative basis of data for training. That is, the quality of the detection model of DREBIN critically depends on the availability of representative malicious and benign applications. While it is straightforward to collect benign applications, gathering recent malware samples requires some technical effort. Fortunately, methods for offline analysis, such as DroidRanger [31], AppsPlayground [24] and RiskRanker [18], might help here to automatically acquire malware and provide the basis for updating and maintaining a representative dataset for DREBIN over time.

5 Related Work

The analysis and detection of Android malware has been a vibrant area of research in the last years. Several concepts and techniques have been proposed to counter the growing amount and sophistication of this malware. An overview of the current malware landscape is provided in the studies of Felt et al. [14] and Zhou & Jiang [30].

Detection using static analysis. The first approaches for detecting Android malware have been inspired by concepts from static program analysis. Several methods have been proposed that statically inspect applications and disassemble their code [e.g., 10, 11, 13, 18]. For example, the method Kirin [11] checks the permissions of applications for indications of malicious activity. Similarly, Stowaway [13] analyzes API calls to detect overprivileged applications, and RiskRanker [18] statically identifies applications with different security risks. Common open-source tools for static analysis are Smali [15] and Androguard [8], which enable dissecting the content of applications with little effort.

Our method DREBIN builds on similar concepts of static analysis. However, it differs in two central aspects from previous work: First, we abstain from crafting detection patterns manually and instead apply machine learning to analyze the information extracted by static analysis. Second, the analysis of DREBIN is optimized for effectiveness and efficiency, which enables us to inspect applications directly on the smartphone.

Detection using dynamic analysis. A second branch of research has studied the detection of Android malware at run-time. Most notable are the analysis systems TaintDroid [9] and DroidScope [29], which enable dynamically monitoring applications in a protected environment, where the former focuses on taint analysis and the latter enables introspection at different layers of the platform. While both systems provide detailed information about the behavior of applications, they are technically too involved to be deployed on smartphones and detect malicious software directly.

As a consequence, dynamic analysis is mainly applied for the offline detection of malware, such as scanning large collections of Android applications. For example, the methods DroidRanger [31] and AppsPlayground [24] have been successfully applied to identify applications with malicious behavior in different Android markets. A similar detection system called Bouncer is currently operated by Google. These dynamic analysis systems are suitable for filtering malicious applications from markets. Due to the openness of the Android platform, however, applications may also be retrieved from other sources, such as web pages, which are not covered by these approaches.

ParanoidAndroid [23] is one of the few detection systems that employs dynamic analysis and can spot malicious activity on the smartphone. To this end, a virtual clone of the smartphone is run in parallel on a dedicated server and synchronized with the activities of the device. This setting allows for monitoring the behavior of applications on the clone without disrupting the functionality of the real device. The duplication of functionality, however, is involved, and with millions of smartphones in practice, operating ParanoidAndroid at large scale is technically not feasible.

Detection using machine learning. The difficulty of manually crafting and updating detection patterns for Android malware has motivated the application of machine learning techniques. Several methods have been proposed that analyze applications automatically using learning methods [e.g., 2, 22, 26]. As an example, the method of Peng et al. [22] applies probabilistic learning methods to the permissions of applications for detecting malware. Similarly, the methods Crowdroid [4], DroidMat [28] and DroidAPIMiner [1] analyze the usage of system and API calls using machine learning techniques.

All of these approaches mainly focus on an accurate detection of malware. Additional aspects, such as the efficiency and the explainability of the detection, are not considered. We address these aspects in this work and propose a method that provides an effective, efficient and explainable detection of malicious applications.

6 Conclusion

Android malware is a new yet fast-growing threat. Classic defenses, such as anti-virus scanners, increasingly fail to cope with the amount and diversity of malware in application markets. While recent approaches, such as DroidRanger [31] and AppsPlayground [24], support filtering such applications off these markets, they induce a run-time overhead that is prohibitive for directly protecting smartphones. As a remedy, we introduce DREBIN, a lightweight method for the detection of Android malware. DREBIN combines concepts from static analysis and machine learning, which enables it to better keep pace with malware development. Our evaluation demonstrates the potential of this approach, where DREBIN outperforms related approaches and identifies malicious applications with few false alarms.

In practice, DREBIN provides two advantages for the security of the Android platform: First, it enables efficiently scanning large amounts of applications, for example from third-party markets. With an average run-time of 750 ms per application on a regular computer, it requires less than a day to analyze 100,000 unknown applications. Second, DREBIN can be applied directly on smartphones, where the analysis can be triggered when new applications are downloaded to the device. Thereby, DREBIN can protect users that install applications from untrusted sources, such as websites and third-party markets.

Although DREBIN effectively identifies malicious software in our evaluation, it exhibits the inherent limitations of static analysis. While it is able to detect indications of obfuscation or dynamic execution, the retrieved code is not accessible by the method. A similar setting has been successfully tackled for the analysis of JavaScript code [see 7], and dynamically triggering the static analysis of DREBIN whenever new code is loaded seems like a promising direction for future work.

References

[1] Y. Aafer, W. Du, and H. Yin. DroidAPIMiner: Mining API-level features for robust malware detection in Android. In Proc. of International Conference on Security and Privacy in Communication Networks (SecureComm), 2013.

[2] D. Barrera, H. G. Kayacik, P. C. van Oorschot, and A. Somayaji. A methodology for empirical analysis of permission-based security models and its application to Android. In Proc. of ACM Conference on Computer and Communications Security (CCS), pages 73–84, 2010.

[3] B. Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422–426, 1970.

[4] I. Burguera, U. Zurutuza, and S. Nadjm-Tehrani. Crowdroid: behavior-based malware detection system for Android. In Proc. of ACM Workshop on Security and Privacy in Smartphones and Mobile Devices (SPSM), pages 15–26, 2011.

[5] T. Cormen, C. Leiserson, and R. Rivest. Introduction to Algorithms. MIT Press, 1989.

[6] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press, Cambridge, UK, 2000.

[7] C. Curtsinger, B. Livshits, B. Zorn, and C. Seifert. Zozzle: Fast and precise in-browser JavaScript malware detection. In Proc. of USENIX Security Symposium, 2011.

[8] A. Desnos and G. Gueguen. Android: From reversing to decompilation. In Presentation at Black Hat Abu Dhabi, 2011.

[9] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. Sheth. TaintDroid: An information-flow tracking system for realtime privacy monitoring on smartphones. In Proc. of USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 393–407, 2010.

[10] W. Enck, D. Octeau, P. McDaniel, and S. Chaudhuri. A study of Android application security. In Proc. of USENIX Security Symposium, 2011.

[11] W. Enck, M. Ongtang, and P. D. McDaniel. On lightweight mobile phone application certification. In Proc. of ACM Conference on Computer and Communications Security (CCS), pages 235–245, 2009.

[12] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research (JMLR), 9:1871–1874, 2008.

[13] A. P. Felt, E. Chin, S. Hanna, D. Song, and D. Wagner. Android permissions demystified. In Proc. of ACM Conference on Computer and Communications Security (CCS), pages 627–638, 2011.

[14] A. P. Felt, M. Finifter, E. Chin, S. Hanna, and D. Wagner. A survey of mobile malware in the wild. In Proc. of ACM Workshop on Security and Privacy in Smartphones and Mobile Devices (SPSM), pages 3–14, 2011.

[15] J. Freke. An assembler/disassembler for Android's dex format. Google Code, http://code.google.com/p/smali/, visited February, 2013.

[16] Mobile Threat Report Q3 2012. F-Secure Response Labs, 2012.

[17] Android.Gappusin. http://www.naked-security.com/malware/Android.Gappusin/, 2012.

[18] M. Grace, Y. Zhou, Q. Zhang, S. Zou, and X. Jiang. RiskRanker: scalable and accurate zero-day Android malware detection. In Proc. of International Conference on Mobile Systems, Applications, and Services (MOBISYS), pages 281–294, 2012.

[19] X. Jiang. Security alert: GingerMaster, 2011. http://www.csc.ncsu.edu/faculty/jiang/GingerMaster/.

[20] X. Jiang. Security alert: GoldDream, 2011. http://www.csc.ncsu.edu/faculty/jiang/GoldDream/.

[21] X. Jiang. Security alert: New DroidKungFu variant, 2011. http://www.csc.ncsu.edu/faculty/jiang/DroidKungFu3/.

[22] H. Peng, C. S. Gates, B. P. Sarma, N. Li, Y. Qi, R. Potharaju, C. Nita-Rotaru, and I. Molloy. Using probabilistic generative models for ranking risks of Android apps. In Proc. of ACM Conference on Computer and Communications Security (CCS), pages 241–252, 2012.


[23] G. Portokalidis, P. Homburg, K. Anagnostakis, and H. Bos. Paranoid Android: Versatile protection for smartphones. In Proc. of Annual Computer Security Applications Conference (ACSAC), 2010.

[24] V. Rastogi, Y. Chen, and W. Enck. AppsPlayground: Automatic security analysis of smartphone applications. In Proc. of ACM Conference on Data and Application Security and Privacy (CODASPY), 2013.

[25] C. Rossow, C. Dietrich, C. Grier, C. Kreibich, V. Paxson, N. Pohlmann, H. Bos, and M. van Steen. Prudent practices for designing malware experiments: Status quo and outlook. In Proc. of IEEE Symposium on Security and Privacy, 2012.

[26] B. P. Sarma, N. Li, C. Gates, R. Potharaju, C. Nita-Rotaru, and I. Molloy. Android permissions: a perspective combining risks and benefits. In Proc. of ACM Symposium on Access Control Models and Technologies (SACMAT), pages 13–22, 2012.

[27] R. Sommer and V. Paxson. Outside the closed world: On using machine learning for network intrusion detection. In Proc. of IEEE Symposium on Security and Privacy, pages 305–316, 2010.

[28] D.-J. Wu, C.-H. Mao, T.-E. Wei, H.-M. Lee, and K.-P. Wu. DroidMat: Android malware detection through manifest and API calls tracing. In Proc. of Asia Joint Conference on Information Security (Asia JCIS), pages 62–69, 2012.

[29] L.-K. Yan and H. Yin. DroidScope: Seamlessly reconstructing OS and Dalvik semantic views for dynamic Android malware analysis. In Proc. of USENIX Security Symposium, 2012.

[30] Y. Zhou and X. Jiang. Dissecting Android malware: Characterization and evolution. In Proc. of IEEE Symposium on Security and Privacy, pages 95–109, 2012.

[31] Y. Zhou, Z. Wang, W. Zhou, and X. Jiang. Hey, you, get off of my market: Detecting malicious apps in official and alternative Android markets. In Proc. of Network and Distributed System Security Symposium (NDSS), 2012.
