

Security evaluation of pattern classifiers under attack

Battista Biggio, Member, IEEE, Giorgio Fumera, Member, IEEE, Fabio Roli, Fellow, IEEE

Abstract—Pattern classification systems are commonly used in adversarial applications, like biometric authentication, network intrusion detection, and spam filtering, in which data can be purposely manipulated by humans to undermine their operation. As this adversarial scenario is not taken into account by classical design methods, pattern classification systems may exhibit vulnerabilities, whose exploitation may severely affect their performance, and consequently limit their practical utility. Extending pattern classification theory and design methods to adversarial settings is thus a novel and very relevant research direction, which has not yet been pursued in a systematic way. In this paper, we address one of the main open issues: evaluating at design phase the security of pattern classifiers, namely, the performance degradation under potential attacks they may incur during operation. We propose a framework for empirical evaluation of classifier security that formalizes and generalizes the main ideas proposed in the literature, and give examples of its use in three real applications. Reported results show that security evaluation can provide a more complete understanding of the classifier’s behavior in adversarial environments, and lead to better design choices.

Index Terms—Pattern classification, adversarial classification, performance evaluation, security evaluation, robustness evaluation.


1 INTRODUCTION

Pattern classification systems based on machine learning algorithms are commonly used in security-related applications like biometric authentication, network intrusion detection, and spam filtering, to discriminate between a “legitimate” and a “malicious” pattern class (e.g., legitimate and spam emails). Contrary to traditional ones, these applications have an intrinsic adversarial nature since the input data can be purposely manipulated by an intelligent and adaptive adversary to undermine classifier operation. This often gives rise to an arms race between the adversary and the classifier designer. Well known examples of attacks against pattern classifiers are: submitting a fake biometric trait to a biometric authentication system (spoofing attack) [1], [2]; modifying network packets belonging to intrusive traffic to evade intrusion detection systems [3]; manipulating the content of spam emails to get them past spam filters (e.g., by misspelling common spam words to avoid their detection) [4]–[6]. Adversarial scenarios can also occur in intelligent data analysis [7] and information retrieval [8]; e.g., a malicious webmaster may manipulate search engine rankings to artificially promote her1 web site.

It is now acknowledged that, since pattern classification systems based on classical theory and design methods [9] do not take into account adversarial settings, they exhibit vulnerabilities to several potential attacks, allowing adversaries to undermine their effectiveness [6], [10]–[14]. A systematic and unified treatment of this issue is thus needed to allow the trusted adoption of pattern classifiers in adversarial environments, starting from the theoretical foundations up to novel design methods, extending the classical design cycle of [9]. In particular, three main open issues can be identified: (i) analyzing the vulnerabilities of classification algorithms, and the corresponding attacks [11], [14], [15]; (ii) developing novel methods to assess classifier security against these attacks, which is not possible using classical performance evaluation methods [6], [12], [16], [17]; (iii) developing novel design methods to guarantee classifier security in adversarial environments [1], [6], [10].

Department of Electrical and Electronic Engineering, University of Cagliari, Piazza d’Armi, 09123 Cagliari, Italy
Battista Biggio: e-mail [email protected], phone +39 070 675 5776
Giorgio Fumera: e-mail [email protected], phone +39 070 675 5754
Fabio Roli (corresponding author): e-mail [email protected], phone +39 070 675 5779, fax (shared) +39 070 675 5782

1. The adversary is hereafter referred to as feminine due to the popular interpretation as “Eve” or “Carol” in cryptography and computer security.

Although this emerging field is attracting growing interest [13], [18], [19], the above issues have only been sparsely addressed under different perspectives and to a limited extent. Most of the work has focused on application-specific issues related to spam filtering and network intrusion detection, e.g., [3]–[6], [10], while only a few theoretical models of adversarial classification problems have been proposed in the machine learning literature [10], [14], [15]; however, they do not yet provide practical guidelines and tools for designers of pattern recognition systems.

Besides introducing these issues to the pattern recognition research community, in this work we address issues (i) and (ii) above by developing a framework for the empirical evaluation of classifier security at design phase that extends the model selection and performance evaluation steps of the classical design cycle of [9].

In Sect. 2 we summarize previous work, and point out three main ideas that emerge from it. We then formalize and generalize them in our framework (Sect. 3). First, to pursue security in the context of an arms race it is not sufficient to react to observed attacks, but it is also necessary to proactively anticipate the adversary by predicting the most relevant, potential attacks through a what-if analysis; this allows one to develop suitable countermeasures before the attack actually occurs, according to the principle of security by design. Second, to provide practical guidelines for simulating realistic attack scenarios, we define a general model of the adversary, in terms of her goal, knowledge, and capability, which encompasses and generalizes models proposed in previous work. Third, since the presence of carefully targeted attacks may affect the distribution of training and testing data separately, we propose a model of the data distribution that can formally characterize this behavior, and that allows us to take into account a large number of potential attacks; we also propose an algorithm for the generation of training and testing sets to be used for security evaluation, which can naturally accommodate application-specific and heuristic techniques for simulating attacks.

In Sect. 4 we give three concrete examples of applications of our framework in spam filtering, biometric authentication, and network intrusion detection. In Sect. 5, we discuss how the classical design cycle of pattern classifiers should be revised to take security into account. Finally, in Sect. 6, we summarize our contributions, the limitations of our framework, and some open issues.

2 BACKGROUND AND PREVIOUS WORK

Here we review previous work, highlighting the concepts that will be exploited in our framework.

2.1 A taxonomy of attacks against pattern classifiers

A taxonomy of potential attacks against pattern classifiers was proposed in [11], [15], and subsequently extended in [14]. We will exploit it in our framework, as part of the definition of attack scenarios. The taxonomy is based on two main features: the kind of influence of attacks on the classifier, and the kind of security violation they cause. The influence can be either causative, if it undermines the learning algorithm to cause subsequent misclassifications; or exploratory, if it exploits knowledge of the trained classifier to cause misclassifications, without affecting the learning algorithm. Thus, causative attacks may influence both training and testing data, or only training data, whereas exploratory attacks affect only testing data. The security violation can be an integrity violation, if it allows the adversary to access the service or resource protected by the classifier; an availability violation, if it denies legitimate users access to it; or a privacy violation, if it allows the adversary to obtain confidential information from the classifier. Integrity violations result in misclassifying malicious samples as legitimate, while availability violations can also cause legitimate samples to be misclassified as malicious. A third feature of the taxonomy is the specificity of an attack, that ranges from targeted to indiscriminate, depending on whether the attack focuses on a single or few specific samples (e.g., a specific spam email misclassified as legitimate), or on a wider set of samples.
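For illustration only, the three axes of the taxonomy can be encoded as simple Python types; the class and field names below are ours, not notation from [11], [14], [15], and the example instance anticipates the spam attack analyzed in Sect. 4.1.

from dataclasses import dataclass
from enum import Enum

class Influence(Enum):
    CAUSATIVE = "causative"        # may affect training (and possibly testing) data
    EXPLORATORY = "exploratory"    # affects testing data only

class Violation(Enum):
    INTEGRITY = "integrity"        # malicious samples accepted as legitimate
    AVAILABILITY = "availability"  # legitimate samples denied access (misclassified as malicious)
    PRIVACY = "privacy"            # confidential information obtained from the classifier

class Specificity(Enum):
    TARGETED = "targeted"              # a single or few specific samples
    INDISCRIMINATE = "indiscriminate"  # a wider set of samples

@dataclass
class AttackType:
    influence: Influence
    violation: Violation
    specificity: Specificity

# Example: the GWI/BWO spam attack of Sect. 4.1
spam_attack = AttackType(Influence.EXPLORATORY, Violation.INTEGRITY, Specificity.INDISCRIMINATE)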

2.2 Limitations of classical performance evaluation methods in adversarial classification

Classical performance evaluation methods, like k-fold cross validation and bootstrapping, aim to estimate the performance that a classifier will exhibit during operation, by using data D collected during classifier design.2 These methods are based on the stationarity assumption that the data seen during operation follow the same distribution as D. Accordingly, they resample D to construct one or more pairs of training and testing sets that ideally follow the same distribution as D [9]. However, the presence of an intelligent and adaptive adversary makes the classification problem highly non-stationary, and makes it difficult to predict how many and which kinds of attacks a classifier will be subject to during operation, that is, how the data distribution will change. In particular, the testing data processed by the trained classifier can be affected by both exploratory and causative attacks, while the training data can only be affected by causative attacks, if the classifier is retrained online [11], [14], [15].3 In both cases, during operation, testing data may follow a different distribution than that of training data, when the classifier is under attack. Therefore, security evaluation can not be carried out according to the classical paradigm of performance evaluation.4

2.3 Arms race and security by design

Security problems often lead to a “reactive” arms race between the adversary and the classifier designer. At each step, the adversary analyzes the classifier defenses, and develops an attack strategy to overcome them. The designer reacts by analyzing the novel attack samples, and, if required, updates the classifier; typically, by retraining it on the new collected samples, and/or adding features that can detect the novel attacks (see Fig. 1, left). Examples of this arms race can be observed in spam filtering and malware detection, where it has led to a considerable increase in the variability and sophistication of attacks and countermeasures.

2. By design we refer to the classical steps of [9], which include feature extraction, model selection, classifier training and performance evaluation (testing). Operation denotes instead the phase which starts when the deployed classifier is set to operate in a real environment.

3. “Training” data refers both to the data used by the learning algorithm during classifier design, coming from D, and to the data collected during operation to retrain the classifier through online learning algorithms. “Testing” data refers both to the data drawn from D to evaluate classifier performance during design, and to the data classified during operation. The meaning will be clear from the context.

4. Classical performance evaluation is generally not suitable for non-stationary, time-varying environments, where changes in the data distribution can hardly be predicted; e.g., in the case of concept drift [20]. Further, the evaluation process has a different goal in this case, i.e., to assess whether the classifier can “recover” quickly after a change has occurred in the data distribution. Instead, security evaluation aims at identifying the most relevant attacks and threats that should be countered before deploying the classifier (see Sect. 2.3).


Fig. 1. A conceptual representation of the arms race in adversarial classification. Left: the classical “reactive” arms race. The designer reacts to the attack by analyzing the attack’s effects and developing countermeasures. Right: the “proactive” arms race advocated in this paper. The designer tries to anticipate the adversary by simulating potential attacks, evaluating their effects, and developing countermeasures if necessary.

To secure a system, a common approach used in engineering and cryptography is security by obscurity, that relies on keeping secret some of the system details to the adversary. In contrast, the paradigm of security by design advocates that systems should be designed from the ground-up to be secure, without assuming that the adversary may ever find out some important system details. Accordingly, the system designer should anticipate the adversary by simulating a “proactive” arms race to (i) figure out the most relevant threats and attacks, and (ii) devise proper countermeasures, before deploying the classifier (see Fig. 1, right). This paradigm typically improves security by delaying each step of the “reactive” arms race, as it requires the adversary to spend a greater effort (time, skills, and resources) to find and exploit vulnerabilities. System security should thus be guaranteed for a longer time, with less frequent supervision or human intervention.

The goal of security evaluation is to address issue (i) above, i.e., to simulate a number of realistic attack scenarios that may be incurred during operation, and to assess the impact of the corresponding attacks on the targeted classifier to highlight the most critical vulnerabilities. This amounts to performing a what-if analysis [21], which is a common practice in security. This approach has been implicitly followed in several previous works (see Sect. 2.4), but never formalized within a general framework for the empirical evaluation of classifier security. Although security evaluation may also suggest specific countermeasures, the design of secure classifiers, i.e., issue (ii) above, remains a distinct open problem.

2.4 Previous work on security evaluation

Many authors implicitly performed security evaluation as a what-if analysis, based on empirical simulation methods; however, they mainly focused on a specific application, classifier and attack, and devised ad hoc security evaluation procedures based on the exploitation of problem knowledge and heuristic techniques [1]–[6], [15], [22]–[36]. Their goal was either to point out a previously unknown vulnerability, or to evaluate security against a known attack. In some cases, specific countermeasures were also proposed, according to a proactive/security-by-design approach. Attacks were simulated by manipulating training and testing samples according to application-specific criteria only, without reference to more general guidelines; consequently, such techniques can not be directly exploited by a system designer in more general cases.

A few works proposed analytical methods to evaluate the security of learning algorithms or of some classes of decision functions (e.g., linear ones), based on more general, application-independent criteria to model the adversary’s behavior (including PAC learning and game theory) [10], [12], [14], [16], [17], [37]–[40]. Some of these criteria will be exploited in our framework for empirical security evaluation; in particular, in the definition of the adversary model described in Sect. 3.1, as high-level guidelines for simulating attacks.

2.5 Building on previous work

We summarize here the three main concepts that emerged, more or less explicitly, from previous work and that will be exploited in our framework for security evaluation.

1) Arms race and security by design: since it is not possible to predict how many and which kinds of attacks a classifier will incur during operation, classifier security should be proactively evaluated using a what-if analysis, by simulating potential attack scenarios.

2) Adversary modeling: effective simulation of attack scenarios requires a formal model of the adversary.

3) Data distribution under attack: the distribution of testing data may differ from that of training data, when the classifier is under attack.

3 A FRAMEWORK FOR EMPIRICAL EVALUATION OF CLASSIFIER SECURITY

We propose here a framework for the empirical evaluation of classifier security in adversarial environments, that unifies and builds on the three concepts highlighted in Sect. 2.5. Our main goal is to provide a quantitative and general-purpose basis for the application of the what-if analysis to classifier security evaluation, based on the definition of potential attack scenarios. To this end, we propose: (i) a model of the adversary, that allows us to define any attack scenario; (ii) a corresponding model of the data distribution; and (iii) a method for generating training and testing sets that are representative of the data distribution, and are used for empirical performance evaluation.


3.1 Attack scenario and model of the adversary

Although the definition of attack scenarios is ultimately an application-specific issue, it is possible to give general guidelines that can help the designer of a pattern recognition system. Here we propose to specify the attack scenario in terms of a conceptual model of the adversary that encompasses, unifies, and extends different ideas from previous work. Our model is based on the assumption that the adversary acts rationally to attain a given goal, according to her knowledge of the classifier, and her capability of manipulating data. This allows one to derive the corresponding optimal attack strategy.

Adversary’s goal. As in [17], it is formulated as the optimization of an objective function. We propose to define this function based on the desired security violation (integrity, availability, or privacy), and on the attack specificity (from targeted to indiscriminate), according to the taxonomy in [11], [14] (see Sect. 2.1). For instance, the goal of an indiscriminate integrity violation may be to maximize the fraction of misclassified malicious samples [6], [10], [14]; the goal of a targeted privacy violation may be to obtain some specific, confidential information from the classifier (e.g., the biometric trait of a given user enrolled in a biometric system) by exploiting the class labels assigned to some “query” samples, while minimizing the number of query samples that the adversary has to issue to violate privacy [14], [16], [41].

Adversary’s knowledge. Assumptions on the adversary’s knowledge have only been qualitatively discussed in previous work, mainly depending on the application at hand. Here we propose a more systematic scheme for their definition, with respect to the knowledge of the single components of a pattern classifier: (k.i) the training data; (k.ii) the feature set; (k.iii) the learning algorithm and the kind of decision function (e.g., a linear SVM); (k.iv) the classifier’s decision function and its parameters (e.g., the feature weights of a linear classifier); (k.v) the feedback available from the classifier, if any (e.g., the class labels assigned to some “query” samples that the adversary issues to get feedback [14], [16], [41]). It is worth noting that realistic and minimal assumptions should be made about what can be kept fully secret from the adversary, as discussed in [14]. Examples of adversary’s knowledge are given in Sect. 4.

Adversary’s capability. It refers to the control that the adversary has on training and testing data. We propose to define it in terms of: (c.i) the attack influence (either causative or exploratory), as defined in [11], [14]; (c.ii) whether and to what extent the attack affects the class priors; (c.iii) how many and which training and testing samples can be controlled by the adversary in each class; (c.iv) which features can be manipulated, and to what extent, taking into account application-specific constraints (e.g., correlated features can not be modified independently, and the functionality of malicious samples can not be compromised [1], [3], [6], [10]).

Attack strategy. One can finally define the optimal attack strategy, namely, how training and testing data should be quantitatively modified to optimize the objective function characterizing the adversary’s goal. Such modifications are defined in terms of: (a.i) how the class priors are modified; (a.ii) what fraction of samples of each class is affected by the attack; and (a.iii) how features are manipulated by the attack. Detailed examples are given in Sect. 4.
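A minimal, purely illustrative container for the four elements above might look as follows; the field names and example values are ours, not part of the paper's notation, and the instance mirrors the spam scenario of Sect. 4.1.

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class AdversaryModel:
    goal: str                        # desired security violation and attack specificity
    knowledge: Dict[str, bool]       # which of (k.i)-(k.v) the adversary is assumed to know
    capability: Dict[str, object]    # assumptions (c.i)-(c.iv)
    attack_strategy: Callable        # maps a sample to its manipulated version, per (a.i)-(a.iii)

spam_adversary = AdversaryModel(
    goal="indiscriminate integrity violation: maximize misclassified spam",
    knowledge={"training_data": False, "feature_set": True,
               "learning_algorithm": True, "parameters": True, "feedback": False},
    capability={"influence": "exploratory", "class_priors": "unchanged",
                "controlled_samples": "all spam, no legitimate", "n_max": 20},
    attack_strategy=lambda x: x,     # placeholder; see the function A(x) of Sect. 4.1
)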

Once the attack scenario is defined in terms of the adversary model and the resulting attack strategy, our framework proceeds with the definition of the corresponding data distribution, that is used to construct training and testing sets for security evaluation.

3.2 A model of the data distribution

We consider the standard setting for classifier design in a problem which consists of discriminating between legitimate (L) and malicious (M) samples: a learning algorithm and a performance measure have been chosen, a set D of n labelled samples has been collected, and a set of d features have been extracted, so that D = {(xi, yi), i = 1, . . . , n}, where xi denotes a d-dimensional feature vector, and yi ∈ {L, M} a class label. The pairs (xi, yi) are assumed to be i.i.d. samples of some unknown distribution pD(X, Y). Since the adversary model in Sect. 3.1 requires us to specify how the attack affects the class priors and the features of each class, we consider the classical generative model pD(X, Y) = pD(Y) pD(X|Y).5

To account for the presence of attacks during operation, which may affect either the training or the testing data, or both, we denote the corresponding training and testing distributions as ptr and pts, respectively. We will just write p when we want to refer to either of them, or both, and the meaning is clear from the context.

When a classifier is not under attack, according to the classical stationarity assumption we have ptr(Y) = pts(Y) = pD(Y), and ptr(X|Y) = pts(X|Y) = pD(X|Y). We extend this assumption to the components of ptr and pts that are not affected by the attack (if any), by assuming that they remain identical to the corresponding distribution pD (e.g., if the attack does not affect the class priors, the above equality also holds under attack).

The distributions p(Y) and p(X|Y) that are affected by the attack can be defined as follows, according to the definition of the attack strategy, (a.i-iii).

Class priors. ptr(Y) and pts(Y) can be immediately defined based on assumption (a.i).

Class-conditional distributions. ptr(X|Y) and pts(X|Y) can be defined based on assumptions (a.ii-iii). First, to account for the fact that the attack may not modify all training or testing samples, according to (a.ii), we model p(X|Y) as a mixture controlled by a Boolean random variable A, which denotes whether a given sample has been manipulated (A = T) or not (A = F):

p(X|Y) = p(X|Y, A = T) p(A = T|Y) + p(X|Y, A = F) p(A = F|Y).   (1)

5. In this paper, for the sake of simplicity, we use the lowercase notation p(·) to denote any probability density function, with both continuous and discrete random variables.

Fig. 2. Generative model of the distributions ptr and pts.

We name the samples of the component p(X|Y, A = T) attack samples, to emphasize that their distribution is different from that of samples which have not been manipulated by the adversary, p(X|Y, A = F). Note that p(A = T|Y = y) is the probability that an attack sample belongs to class y, i.e., the percentage of samples of class y controlled by the adversary. It is thus defined by (a.ii).
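As a concrete instance of Eq. (1) (anticipating the exploratory spam attack of Sect. 4.1 and the first column of Table 1): if every malicious sample is manipulated and no legitimate one is, i.e., p(A = T|Y = M) = 1 and p(A = T|Y = L) = 0, the mixture reduces to

pts(X|Y = M) = pts(X|Y = M, A = T),    pts(X|Y = L) = pts(X|Y = L, A = F) = pD(X|Y = L),

so the whole malicious class follows the attack distribution, while the legitimate class keeps its original distribution.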

For samples which are unaffected by the attack, the stationarity assumption holds:

p(X|Y, A = F) = pD(X|Y).   (2)

The distribution p(X|Y, A = T) depends on assumption (a.iii). Depending on the problem at hand, it may not be possible to analytically define it. Nevertheless, for the purpose of security evaluation, we will show in Sect. 3.3 that it can be defined as the empirical distribution of a set of fictitious attack samples.

Finally, p(X|Y, A = F) can be defined similarly, if its analytical definition is not possible. In particular, according to Eq. (2), it can be defined as the empirical distribution of D.

The above generative model of the training and testing distributions ptr and pts is represented by the Bayesian network in Fig. 2, which corresponds to factorizing p(X, Y, A) as follows:

p(X, Y, A) = p(Y) p(A|Y) p(X|Y, A).   (3)

Our model can be easily extended to take into account concurrent attacks involving m > 1 different kinds of sample manipulations; for example, to model attacks against different classifiers in multimodal biometric systems [1], [2]. To this end, one can define m different Boolean random variables A = (A1, . . . , Am), and the corresponding distributions. The extension of Eq. 3 is then straightforward (e.g., see [32]). The dependence between the different attacks, if any, can be modeled by the distribution p(A|Y).

If one is also interested in evaluating classifier security subject to temporal, non-adversarial variations of the data distribution (e.g., the drift of the content of legitimate emails over time), it can be assumed that the distribution p(X, Y, A = F) changes over time, according to some model p(X, Y, A = F, t) (see, e.g., [20]). Then, classifier security at time t can be evaluated by modeling the attack distribution over time, p(X, Y, A = T, t), as a function of p(X, Y, A = F, t).

The model of the adversary in Sect. 3.1 and the above model of the data distribution provide a quantitative, well-grounded and general-purpose basis for the application of the what-if analysis to classifier security evaluation, which advances previous work.

3.3 Training and testing set generation

Here we propose an algorithm to sample training (TR) and testing (TS) sets of any desired size from the distributions ptr(X, Y) and pts(X, Y).

We assume that k ≥ 1 different pairs of training and testing sets (Di_TR, Di_TS), i = 1, . . . , k, have been obtained from D using a classical resampling technique, like cross-validation or bootstrapping. Accordingly, their samples follow the distribution pD(X, Y). In the following, we describe how to modify each of the sets Di_TR to construct a training set TRi that follows the distribution ptr(X, Y). For the sake of simplicity, we will omit the superscript i. An identical procedure can be followed to construct a testing set TSi from each of the Di_TS. Security evaluation is then carried out with the classical method, by averaging (if k > 1) the performance of the classifier trained on TRi and tested on TSi.

If the attack does not affect the training samples, i.e., ptr(X, Y) = pD(X, Y), TR is simply set equal to DTR. Otherwise, two alternatives are possible. (i) If ptr(X|Y, A) is analytically defined for each Y ∈ {L, M} and A ∈ {T, F}, then TR can be obtained by sampling the generative model of p(X, Y, A) of Eq. (3): first, a class label y is sampled from ptr(Y), then a value a from ptr(A|Y = y), and, finally, a feature vector x from ptr(X|Y = y, A = a). (ii) If ptr(X|Y = y, A = a) is not analytically defined for some y and a, but a set of its samples is available, denoted in the following as Dy,a_TR, it can be approximated as the empirical distribution of Dy,a_TR. Accordingly, we can sample with replacement from Dy,a_TR [42]. An identical procedure can be used to construct the testing set TS. The procedure to obtain TR or TS is formally described as Algorithm 1.6

Algorithm 1  Construction of TR or TS.
Input: the number n of desired samples; the distributions p(Y) and p(A|Y); for each y ∈ {L, M}, a ∈ {T, F}, the distribution p(X|Y = y, A = a), if analytically defined, or the set of samples Dy,a, otherwise.
Output: a data set S (either TR or TS) drawn from p(Y) p(A|Y) p(X|Y, A).

1: S ← ∅
2: for i = 1, . . . , n do
3:     sample y from p(Y)
4:     sample a from p(A|Y = y)
5:     draw a sample x from p(X|Y = y, A = a), if analytically defined; otherwise, sample with replacement from Dy,a
6:     S ← S ∪ {(x, y)}
7: end for
8: return S

6. Since the proposed algorithm is based on classical resampling techniques such as cross-validation and bootstrapping, it is reasonable to expect that the bias and the variance of the estimated classification error (or of any other performance measure) will enjoy similar statistical properties to those exhibited by classical performance evaluation methods based on the same techniques. In practice, these error components are typically negligible with respect to errors introduced by the use of limited training data, biased learning/classification algorithms, and noisy or corrupted data.

Let us now discuss how to construct the sets Dy,a_TR, when ptr(X|Y = y, A = a) is not analytically defined. The same discussion holds for the sets Dy,a_TS. First, the two sets DL,F_TR and DM,F_TR can be respectively set equal to the legitimate and malicious samples in DTR, since the distribution of such samples is assumed to be just ptr(X|Y = L, A = F) and ptr(X|Y = M, A = F) (see Eq. 2): DL,F_TR = {(x, y) ∈ DTR : y = L}, DM,F_TR = {(x, y) ∈ DTR : y = M}. The two sets of attack samples Dy,T_TR, for y = L, M, must come instead from ptr(X|Y = y, A = T). They can thus be constructed according to point (a.iii) of the attack strategy, using any technique for simulating attack samples. Therefore, all the ad hoc techniques used in previous work for constructing fictitious attack samples can be used as methods to define or empirically approximate the distribution p(X|Y = y, A = T), for y = L, M.

Two further considerations must be made on the simulated attack samples Dy,T_TR. First, the number of distinct attack samples that can be obtained depends on the simulation technique, when it requires the use of the data in DTR; for instance, if it consists of modifying deterministically each malicious sample in DTR, no more than |{(x, y) ∈ DTR : y = M}| distinct attack samples can be obtained. If any number of attack samples can be generated, instead (e.g., [5], [28], [34], [36]), then the sets Dy,T_TR, for y = L, M, do not need to be constructed beforehand: a single attack sample can be generated online at step 5 of Algorithm 1, when a = T. Second, in some cases it may be necessary to construct the attack samples incrementally; for instance, in the causative attacks considered in [11], [30], [36], the attack samples are added to the training data one at a time, since each of them is a function of the current training set. This corresponds to a non-i.i.d. sampling of the attack distribution. In these cases, Algorithm 1 can be modified by first generating all pairs y, a, then the feature vectors x corresponding to a = F, and, lastly, the attack samples corresponding to a = T, one at a time.
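For concreteness, a minimal Python sketch of the basic (i.i.d.) version of Algorithm 1 follows; the function and argument names are ours, and the empirical sets Dy,a are passed in as plain lists of feature vectors.

import random

def generate_set(n, p_y, p_a_given_y, sample_x, empirical_sets):
    """Draw n labelled samples from p(Y) p(A|Y) p(X|Y, A), as in Algorithm 1.

    p_y: dict mapping class label ('L'/'M') to its prior probability.
    p_a_given_y: dict mapping class label y to p(A = T | Y = y).
    sample_x: dict mapping (y, a) to a sampling function returning one x,
              or to None if p(X|Y = y, A = a) is not analytically defined.
    empirical_sets: dict mapping (y, a) to a list of feature vectors (the
              set Dy,a), used when sample_x[(y, a)] is None.
    """
    S = []
    labels, priors = zip(*p_y.items())
    for _ in range(n):
        # step 3: sample the class label y from p(Y)
        y = random.choices(labels, weights=priors, k=1)[0]
        # step 4: sample the attack indicator a from p(A|Y = y)
        a = random.random() < p_a_given_y[y]
        # step 5: draw x analytically if possible, otherwise resample
        # with replacement from the empirical set Dy,a
        if sample_x.get((y, a)) is not None:
            x = sample_x[(y, a)]()
        else:
            x = random.choice(empirical_sets[(y, a)])
        # step 6: add the labelled sample
        S.append((x, y))
    return S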

3.4 How to use our framework

We summarize here the steps that the designer of a pattern classifier should take to evaluate its security using our framework, for each attack scenario of interest. They extend the performance evaluation step of the classical design cycle of [9], which is used as part of the model selection phase, and to evaluate the final classifier to be deployed.

1) Attack scenario. The attack scenario should be defined at the conceptual level by making specific assumptions on the goal, knowledge (k.i-v), and capability of the adversary (c.i-iv), and defining the corresponding attack strategy (a.i-iii), according to the model of Sect. 3.1.

2) Data model. According to the hypothesized attack scenario, the designer should define the distributions p(Y), p(A|Y), and p(X|Y, A), for Y ∈ {L, M}, A ∈ {F, T}, and for training and testing data. If p(X|Y, A) is not analytically defined for some Y = y and A = a, either for training or testing data, the corresponding set Dy,a_TR or Dy,a_TS must be constructed. The sets Dy,F_TR (Dy,F_TS) are obtained from DTR (DTS). The sets Dy,T_TR and Dy,T_TS can be generated, only if the attack involves sample manipulation, using an attack sample simulation technique according to the attack strategy (a.iii).

3) Construction of TR and TS. Given k ≥ 1 pairs (Di_TR, Di_TS), i = 1, . . . , k, obtained from classical resampling techniques like cross-validation or bootstrapping, the size of TR and TS must be defined, and Algorithm 1 must be run with the corresponding inputs to obtain TRi and TSi. If the attack does not affect the training (testing) data, TRi (TSi) is set to Di_TR (Di_TS).

4) Performance evaluation. The classifier performance under the simulated attack is evaluated using the constructed (TRi, TSi) pairs, as in classical techniques.
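Steps 3 and 4 can be wired together as in the sketch below; train, evaluate and the attack-simulation callables are generic placeholders (all names are ours), and the clean folds are assumed to come from the usual resampling of D.

def security_evaluation(folds, train, evaluate, attack_tr=None, attack_ts=None):
    """folds: list of (D_TR, D_TS) pairs from cross-validation or bootstrapping.
    attack_tr / attack_ts: callables turning a clean set into its attacked
    counterpart (e.g., via Algorithm 1), or None if that set is unaffected.
    train(tr) returns a fitted classifier; evaluate(clf, ts) returns a score."""
    scores = []
    for d_tr, d_ts in folds:
        tr = d_tr if attack_tr is None else attack_tr(d_tr)
        ts = d_ts if attack_ts is None else attack_ts(d_ts)
        clf = train(tr)
        scores.append(evaluate(clf, ts))
    # average the performance over the k pairs, as in classical evaluation
    return sum(scores) / len(scores)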

4 APPLICATION EXAMPLES

While previous work focused on a single application, we consider here three different application examples of our framework in spam filtering, biometric authentication, and network intrusion detection. Our aim is to show how the designer of a pattern classifier can use our framework, and what kind of additional information he can obtain from security evaluation. We will show that a trade-off between classifier accuracy and security sometimes emerges, and that this information can be exploited for several purposes; e.g., to improve the model selection phase by considering both classification accuracy and security.

4.1 Spam filtering

Assume that a classifier has to discriminate between legitimate and spam emails on the basis of their textual content, and that the bag-of-words feature representation has been chosen, with binary features denoting the occurrence of a given set of words. This kind of classifier has been considered by several authors [6], [28], [43], and it is included in several real spam filters.7

                       Sect. 4.1                 Sect. 4.2                 Sect. 4.3
Attack scenario        Indiscriminate            Targeted                  Indiscriminate
                       Exploratory Integrity     Exploratory Integrity     Causative Integrity
ptr(Y)                 pD(Y)                     pD(Y)                     ptr(M) = pmax
ptr(A = T|Y = L)       0                         0                         0
ptr(A = T|Y = M)       0                         0                         1
ptr(X|Y = L, A = T)    -                         -                         -
ptr(X|Y = M, A = T)    -                         -                         empirical
ptr(X|Y = L, A = F)    pD(X|L)                   pD(X|L)                   pD(X|L)
ptr(X|Y = M, A = F)    pD(X|M)                   pD(X|M)                   -
pts(Y)                 pD(Y)                     pD(Y)                     pD(Y)
pts(A = T|Y = L)       0                         0                         0
pts(A = T|Y = M)       1                         1                         0
pts(X|Y = L, A = T)    -                         -                         -
pts(X|Y = M, A = T)    empirical                 empirical                 -
pts(X|Y = L, A = F)    pD(X|L)                   pD(X|L)                   pD(X|L)
pts(X|Y = M, A = F)    -                         -                         pD(X|M)

TABLE 1
Parameters of the attack scenario and of the data model for each application example. ‘Empirical’ means that p(X|Y = y, A = a) was approximated as the empirical distribution of a set of samples Dy,a; and ‘-’ means that no samples from this distribution were needed to simulate the attack.

In this example, we focus on model selection. We assume that the designer wants to choose between a support vector machine (SVM) with a linear kernel, and a logistic regression (LR) linear classifier. He also wants to choose a feature subset, among all the words occurring in training emails. A set D of legitimate and spam emails is available for this purpose. We assume that the designer wants to evaluate not only classifier accuracy in the absence of attacks, as in the classical design scenario, but also its security against the well-known bad word obfuscation (BWO) and good word insertion (GWI) attacks. They consist of modifying spam emails by inserting “good words” that are likely to appear in legitimate emails, and by obfuscating “bad words” that are typically present in spam [6]. The attack scenario can be modeled as follows.

1) Attack scenario. Goal. The adversary aims at maximizing the percentage of spam emails misclassified as legitimate, which is an indiscriminate integrity violation.

Knowledge. As in [6], [10], the adversary is assumed to have perfect knowledge of the classifier, i.e.: (k.ii) the feature set, (k.iii) the kind of decision function, and (k.iv) its parameters (the weight assigned to each feature, and the decision threshold). Assumptions on the knowledge of (k.i) the training data and (k.v) feedback from the classifier are not relevant in this case, as they do not provide any additional information.

Capability. We assume that the adversary: (c.i) is only able to influence testing data (exploratory attack); (c.ii) can not modify the class priors; (c.iii) can manipulate each malicious sample, but no legitimate ones; (c.iv) can manipulate any feature value (i.e., she can insert or obfuscate any word), but up to a maximum number nmax of features in each spam email [6], [10]. This allows us to evaluate how gracefully the classifier performance degrades as an increasing number of features is modified, by repeating the evaluation for increasing values of nmax.

7. SpamAssassin, http://spamassassin.apache.org/; Bogofilter, http://bogofilter.sourceforge.net/; SpamBayes, http://spambayes.sourceforge.net/

Attack strategy. Without loss of generality, let us further assume that x is classified as legitimate if g(x) = ∑_{i=1}^{n} wi xi + w0 < 0, where g(·) is the discriminant function of the classifier, n is the feature set size, xi ∈ {0, 1} are the feature values (1 and 0 denote respectively the presence and the absence of the corresponding term), wi are the feature weights, and w0 is the bias.

Under the above assumptions, the optimal attack strategy can be attained by: (a.i) leaving the class priors unchanged; (a.ii) manipulating all testing spam emails; and (a.iii) modifying up to nmax words in each spam to minimize the discriminant function of the classifier.

Each attack sample (i.e., modified spam) can thus be obtained by solving a constrained optimization problem. As in [10], the generation of attack samples can be represented by a function A : X → X, and the number of modified words can be evaluated by the Hamming distance. Accordingly, for any given x ∈ X, the optimal attack strategy amounts to finding the attack sample A(x) which minimizes g(A(x)), subject to the constraint that the Hamming distance between x and A(x) is no greater than nmax, that is:

A(x) = argmin_{x'} ∑_{i=1}^{n} wi x'i    s.t.    ∑_{i=1}^{n} |x'i − xi| ≤ nmax.   (4)

The solution to problem (4) is straightforward. First, note that inserting (obfuscating) a word results in switching the corresponding feature value from 0 to 1 (1 to 0), and that the minimum of g(·) is attained when all features that have been assigned a negative (positive) weight are equal to 1 (0). Accordingly, for a given x, the largest decrease of g(A(x)) subject to the above constraint is obtained by analyzing all features for decreasing values of |wi|, and switching from 0 to 1 (from 1 to 0) the ones corresponding to wi < 0 (wi > 0), until nmax changes have been made, or all features have been analyzed. Note that this attack can be simulated by directly manipulating the feature vectors of spam emails instead of the emails themselves.
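A compact sketch of this greedy procedure, operating directly on a binary feature vector x and the weight vector w of the linear classifier (NumPy assumed; the function name is ours):

import numpy as np

def gwi_bwo_attack(x, w, n_max):
    """Greedy solution of problem (4): switch at most n_max binary features
    of a spam vector x so as to minimize g(x) = w . x + w0 (the bias w0 is
    irrelevant to which features are switched)."""
    x_adv = x.copy()
    changes = 0
    # scan features by decreasing |w_i|: each switch gives the largest
    # remaining decrease of the discriminant function
    for i in np.argsort(-np.abs(w)):
        if changes >= n_max:
            break
        if w[i] < 0 and x_adv[i] == 0:
            x_adv[i] = 1            # good word insertion
            changes += 1
        elif w[i] > 0 and x_adv[i] == 1:
            x_adv[i] = 0            # bad word obfuscation
            changes += 1
    return x_adv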

2) Data model. Since the adversary can only manipulate testing data, we set ptr(Y) = pD(Y), and ptr(X|Y) = pD(X|Y). Assumptions (a.i-ii) are directly encoded as: (a.i) pts(Y) = pD(Y); and (a.ii) pts(A = T|Y = M) = 1, and pts(A = T|Y = L) = 0. The latter implies that pts(X|Y = L) = pD(X|Y = L). Assumption (a.iii) is encoded into the above function A(x), which empirically defines the distribution of the attack samples pts(X|Y = M, A = T). Similarly, p(X|Y = y, A = F) = pD(X|y), for y = L, M, will be approximated as the empirical distribution of the set Dy,F_TS obtained from DTS, as described below; the same approximation will be made in all the considered application examples.

The definition of the attack scenario and the data model is summarized in Table 1 (first column).

3) Construction of TR and TS. We use a publicly available email corpus, TREC 2007. It consists of 25,220 legitimate and 50,199 real spam emails.8 We select the first 20,000 emails in chronological order to create the data set D. We then split D into two subsets DTR and DTS, respectively made up of the first 10,000 and the next 10,000 emails, in chronological order. We use DTR to construct TR and DTS to construct TS, as in [5], [6]. Since the considered attack does not affect training samples, we set TR = DTR. Then, we define DL,F_TS and DM,F_TS respectively as the legitimate and malicious samples in DTS, and construct the set of attack samples DM,T_TS by modifying all the samples in DM,F_TS according to A(x), for any fixed nmax value. Since the attack does not affect the legitimate samples, we set DL,T_TS = ∅. Finally, we set the size of TS to 10,000, and generate TS by running Algorithm 1 on DL,F_TS, DM,F_TS and DM,T_TS.

The features (words) are extracted from TR using the SpamAssassin tokenization method. Four feature subsets with size 1,000, 2,000, 10,000 and 20,000 have been selected using the information gain criterion [44].
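This feature selection step can be reproduced, for instance, with scikit-learn: for binary term-occurrence features the information gain criterion of [44] coincides with the mutual information between each term and the class label, so mutual_info_classif can serve as a stand-in (the binary document-term matrix X and labels y below are assumed to come from the tokenized TR emails).

from functools import partial
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def select_features(X, y, k):
    # information gain for binary features = discrete mutual information with the label
    score_func = partial(mutual_info_classif, discrete_features=True)
    selector = SelectKBest(score_func=score_func, k=k)
    X_sel = selector.fit_transform(X, y)
    return X_sel, selector.get_support(indices=True)

# e.g., the four subsets used in this section:
# for k in (1000, 2000, 10000, 20000): select_features(X, y, k)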

4) Performance evaluation. The performance measure we use is the area under the receiver operating characteristic curve (AUC) corresponding to false positive (FP) error rates in the range [0, 0.1] [6]: AUC10% = ∫_0^{0.1} TP(FP) dFP ∈ [0, 0.1], where TP is the true positive rate. It is suited to classification tasks like spam filtering, where FP errors (i.e., legitimate emails misclassified as spam) are much more harmful than false negative (FN) ones.
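AUC10% can be computed by trapezoidal integration of the ROC curve restricted to FP ≤ 0.1; a sketch, assuming the ROC is available as arrays of increasing FP values with the corresponding TP values:

import numpy as np

def auc_10(fp, tp, fp_max=0.1):
    """Partial AUC over the FP range [0, fp_max]; returns a value in [0, fp_max]."""
    fp = np.asarray(fp)
    tp = np.asarray(tp)
    # interpolate the TP rate exactly at fp_max and keep only points below it
    tp_at_max = np.interp(fp_max, fp, tp)
    mask = fp <= fp_max
    fp_part = np.concatenate([fp[mask], [fp_max]])
    tp_part = np.concatenate([tp[mask], [tp_at_max]])
    return np.trapz(tp_part, fp_part)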

Under the above model selection setting (two classifiers, and four feature subsets) eight different classifier models must be evaluated. Each model is trained on TR. SVMs are implemented with the LibSVM software [45]. The C parameter of their learning algorithm is chosen by maximizing the AUC10% through a 5-fold cross-validation on TR. An online gradient descent algorithm is used for LR. After classifier training, the AUC10% value is assessed on TS, for different values of nmax. In this case, it is a monotonically decreasing function of nmax. The more graceful its decrease, the more robust the classifier is to the considered attack. Note that, for nmax = 0, no attack samples are included in the testing set: the corresponding AUC10% value equals that attained by classical performance evaluation methods.

The results are reported in Fig. 3. As expected, the AUC10% of each model decreases as nmax increases. It drops to zero for nmax values between 30 and 50 (depending on the classifier): this means that all testing spam emails got misclassified as legitimate, after adding or obfuscating from 30 to 50 words.

Fig. 3. AUC10% attained on TS as a function of nmax, for the LR (top) and SVM (bottom) classifier, with 1,000 (1K), 2,000 (2K), 10,000 (10K) and 20,000 (20K) features. The AUC10% value for nmax = 0, corresponding to classical performance evaluation, is also reported in the legend between square brackets.

8. http://plg.uwaterloo.ca/~gvcormac/treccorpus07

The SVM and LR classifiers perform very similarly when they are not under attack (i.e., for nmax = 0), regardless of the feature set size; therefore, according to the viewpoint of classical performance evaluation, the designer could choose any of the eight models. However, security evaluation highlights that they exhibit a very different robustness to the considered attack, since their AUC10% value decreases at very different rates as nmax increases; in particular, the LR classifier with 20,000 features clearly outperforms all the other ones, for all nmax values. This result suggests to the designer a very different choice than the one coming from classical performance evaluation: the LR classifier with 20,000 features should be selected, given that it exhibits the same accuracy as the other ones in the absence of attacks, and a higher security under the considered attack.

4.2 Biometric authentication

Multimodal biometric systems for personal identity recognition have received great interest in the past few years. It has been shown that combining information coming from different biometric traits can overcome the limits and the weaknesses inherent in every individual biometric, resulting in a higher accuracy. Moreover, it is commonly believed that multimodal systems also improve security against spoofing attacks, which consist of claiming a false identity and submitting at least one fake biometric trait to the system (e.g., a “gummy” fingerprint or a photograph of a user’s face). The reason is that, to evade a multimodal system, one expects that the adversary should spoof all the corresponding biometric traits. In this application example, we show how the designer of a multimodal system can verify if this hypothesis holds, before deploying the system, by simulating spoofing attacks against each of the matchers. To this end, we partially exploit the analysis in [1], [2].

We consider a typical multimodal system, made up of a fingerprint and a face matcher, which operates as follows. The design phase includes the enrollment of authorized users (clients): reference templates of their biometric traits are stored into a database, together with the corresponding identities. During operation, each user provides the requested biometric traits to the sensors, and claims the identity of a client. Then, each matcher compares the submitted trait with the template of the claimed identity, and provides a real-valued matching score: the higher the score, the higher the similarity. We denote the score of the fingerprint and the face matcher respectively as xfing and xface. Finally, the matching scores are combined through a proper fusion rule to decide whether the claimed identity is the user’s identity (genuine user) or not (impostor).

As in [1], [2], we consider the widely used likelihood ratio (LLR) score fusion rule [46], and the matching scores as its features. Denoting as x the feature vector (xfing, xface), the LLR can be written as:

LLR(x) = L, if p(x|Y = L) / p(x|Y = M) ≥ t;
LLR(x) = M, if p(x|Y = L) / p(x|Y = M) < t,   (5)

where L and M denote respectively the “genuine” and “impostor” class, and t is a decision threshold set according to application requirements. The distributions p(xfing, xface|Y) are usually estimated from training data, while t is estimated from a validation set.
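A sketch of this fusion rule, with the two class-conditional densities estimated as products of Gamma distributions as described below in this section (SciPy assumed; array shapes, names and the use of log-likelihoods are illustrative choices of ours):

import numpy as np
from scipy.stats import gamma

def fit_llr(genuine_scores, impostor_scores):
    """genuine_scores, impostor_scores: arrays of shape (m, 2) with columns
    (x_fing, x_face), assumed strictly positive after min-max normalization.
    Returns a decision function implementing Eq. (5)."""
    params = {
        'L': [gamma.fit(genuine_scores[:, j], floc=0) for j in range(2)],
        'M': [gamma.fit(impostor_scores[:, j], floc=0) for j in range(2)],
    }

    def log_lik(x, y):
        # conditional independence: p(x_fing, x_face | Y) = product of two Gammas
        return sum(gamma.logpdf(x[j], *params[y][j]) for j in range(2))

    def decide(x, t):
        # Eq. (5): accept as genuine (L) iff p(x|L) / p(x|M) >= t
        return 'L' if log_lik(x, 'L') - log_lik(x, 'M') >= np.log(t) else 'M'

    return decide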

1) Attack scenario. Goal. In this case, each malicious user (impostor) aims at being accepted as a legitimate (genuine) one. This corresponds to a targeted integrity violation, where the adversary’s goal is to maximize the matching score.

Knowledge. As in [1], [2], we assume that each impostor knows: (k.i) the identity of the targeted client; and (k.ii) the biometric traits used by the system. No knowledge of (k.iii) the decision function and (k.iv) its parameters is assumed, and (k.v) no feedback is available from the classifier.

Capability. We assume that: (c.i) spoofing attacks affect only testing data (exploratory attack); (c.ii) they do not affect the class priors;9 and, according to (k.i), (c.iii) each adversary (impostor) controls her testing malicious samples. We limit the capability of each impostor by assuming that (c.iv) only one specific trait can be spoofed at a time (i.e., only one feature can be manipulated), and that it is the same for all impostors. We will thus repeat the evaluation twice, considering fingerprint and face spoofing separately. Defining how impostors can manipulate the features (in this case, the matching scores) in a spoofing attack is not straightforward. The standard way of evaluating the security of biometric systems against spoofing attacks consists of fabricating fake biometric traits and assessing performance when they are submitted to the system. However, this is a cumbersome task. Here we follow the approach of [1], [2], in which the impostor is assumed to fabricate “perfect” fake traits, i.e., fakes that produce the same matching score as the “live” trait of the targeted client. This allows one to use the available genuine scores to simulate the fake scores, avoiding the fabrication of fake traits. On the other hand, this assumption may be too pessimistic. This limitation could be overcome by developing more realistic models of the distribution of the fake traits; however, this is a challenging research issue on its own, that is out of the scope of this work, and part of the authors’ ongoing work [47], [48].

9. Further, since the LLR considers only the ratio between the class-conditional distributions, changing the class priors is irrelevant.

Attack strategy. The above attack strategy modifies only the testing data, and: (a.i) it does not modify the class priors; (a.ii) it does not affect the genuine class, but all impostors; and (a.iii) any impostor spoofs the considered biometric trait (either the face or fingerprint) of the known identity. According to the above assumption, as in [1], [2], any spoofing attack is simulated by replacing the corresponding impostor score (either the face or fingerprint) with the score of the targeted genuine user (chosen at random from the legitimate samples in the testing set). This can be represented with a function A(x) that, given an impostor x = (xfing, xface), returns either a vector (x'fing, xface) (for fingerprint spoofing) or (xfing, x'face) (for face spoofing), where x'fing and x'face are the matching scores of the targeted genuine user.

2) Data model. Since training data is not affected, we set ptr(X, Y) = pD(X, Y). Then, we set (a.i) pts(Y) = pD(Y); and (a.ii) pts(A = T|Y = L) = 0 and pts(A = T|Y = M) = 1. Thus, pts(X|Y = L) = pD(X|Y = L), while all the malicious samples in TS have to be manipulated by A(x), according to (a.iii). This amounts to empirically defining pts(X|Y = M, A = T). As in the spam filtering case, pts(X|Y = y, A = F), for y = L, M, will be empirically approximated from the available data in D.

The definition of the above attack scenario and data model is summarized in Table 1 (second column).
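Under the “perfect replica” assumption, simulating A(x) amounts to copying one column of matching scores from randomly chosen genuine users; a short sketch (names and shapes are illustrative):

import numpy as np

def simulate_spoof(impostor_scores, genuine_scores, trait, rng=None):
    """impostor_scores: array (m, 2), genuine_scores: array (g, 2), with
    columns (x_fing, x_face); trait = 0 for fingerprint spoofing, 1 for face.
    Each impostor's chosen score is replaced by that of a random genuine user."""
    rng = np.random.default_rng() if rng is None else rng
    attacked = impostor_scores.copy()
    idx = rng.integers(0, len(genuine_scores), size=len(attacked))
    attacked[:, trait] = genuine_scores[idx, trait]
    return attacked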

3) Construction of TR and TS. We use the NISTBiometric Score Set, Release 1.10 It contains raw simi-larity scores obtained on a set of 517 users from twodifferent face matchers (named ‘G’ and ‘C’), and fromone fingerprint matcher using the left and right indexfinger. For each user, one genuine score and 516 impostorscores are available for each matcher and each modality,for a total of 517 genuine and 266,772 impostor samples.We use the scores of the ‘G’ face matcher and the ones

10. http://www.itl.nist.gov/iad/894.03/biometricscores/


Fig. 4. ROC curves of the considered multimodal biometric system, under a simulated spoof attack against the fingerprint or the face matcher.

The data set D is made up of pairs of face and fingerprint scores, each belonging to the same user. We first randomly subdivide D into a disjoint training and testing set, D_TR and D_TS, containing respectively 80% and 20% of the samples. As the attack does not affect the training samples, we set TR = D_TR. The sets D_TS^{L,F} and D_TS^{M,F} are constructed using D_TS, while D_TS^{L,T} = ∅. The set of attack samples D_TS^{M,T} is obtained by modifying each sample of D_TS^{M,F} with A(x). We finally set the size of TS as |D_TS|, and run Algorithm 1 to obtain it.

4) Performance evaluation. In biometric authentication tasks, the performance is usually measured in terms of the genuine acceptance rate (GAR) and the false acceptance rate (FAR), respectively the fraction of genuine and impostor attempts that are accepted as genuine by the system. We use here the complete ROC curve, which shows the GAR as a function of the FAR for all values of the decision threshold t (see Eq. 5).

To estimate p(X|Y = y), for y = L, M, in the LLR score fusion rule (Eq. 5), we assume that x_fing and x_face are conditionally independent given Y, as usually done in biometrics. We thus compute a maximum likelihood estimate of p(x_fing, x_face|Y) from TR using a product of two Gamma distributions, as in [1].
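As an illustration of this estimation step, the sketch below fits the two Gamma densities per class with SciPy and computes the LLR fusion score and the (FAR, GAR) pairs of the ROC curve; the array names and the use of SciPy/scikit-learn are our own assumptions, and the fit details (e.g., fixing the location parameter at zero) may differ from the original implementation.

    import numpy as np
    from scipy.stats import gamma
    from sklearn.metrics import roc_curve

    def fit_gamma(scores, eps=1e-6):
        # Maximum likelihood fit of a Gamma density to normalized scores in [0, 1];
        # the location is fixed at 0 and zeros are clipped to keep the fit well defined.
        return gamma.fit(np.clip(scores, eps, None), floc=0)  # (shape, loc, scale)

    def llr(x_fing, x_face, genuine_params, impostor_params):
        # LLR fusion under conditional independence of the two matchers given Y:
        # log p(x|Y=L) - log p(x|Y=M), each term being a product of two Gammas.
        log_num = (gamma.logpdf(x_fing, *genuine_params["fing"])
                   + gamma.logpdf(x_face, *genuine_params["face"]))
        log_den = (gamma.logpdf(x_fing, *impostor_params["fing"])
                   + gamma.logpdf(x_face, *impostor_params["face"]))
        return log_num - log_den

    # genuine_params / impostor_params are fitted on the TR scores, e.g.
    #   genuine_params = {"fing": fit_gamma(tr_fing_L), "face": fit_gamma(tr_face_L)}
    # The ROC (FAR on the x axis, GAR on the y axis) is then obtained as
    #   scores = llr(ts_fing, ts_face, genuine_params, impostor_params)
    #   far, gar, thresholds = roc_curve(y_true, scores)   # y_true: 1 = genuine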

Fig. 4 shows the ROC curve evaluated with the standard approach (without spoofing attacks), and the ones corresponding to the simulated spoofing attacks. The FAR axis is in logarithmic scale to focus on low FAR values, which are the most relevant ones in security applications. For any operational point on the ROC curve (namely, for any value of the decision threshold t), the effect of the considered spoofing attack is to increase the FAR, while the GAR does not change. This corresponds to a shift of the ROC curve to the right. Accordingly, the FAR under attack must be compared with the original one, for the same GAR.

Fig. 4 clearly shows that the FAR of the biometric system significantly increases under the considered attack scenario, especially for fingerprint spoofing. As far as this attack scenario is deemed to be realistic, the system designer should conclude that the considered system can be evaded by spoofing only one biometric trait, and thus does not exhibit a higher security than each of its individual classifiers. For instance, in applications requiring relatively high security, a reasonable choice may be to choose the operational point with FAR = 10⁻³ and GAR = 0.90, using classical performance evaluation, i.e., without taking into account spoofing attacks. This means that the probability that the deployed system wrongly accepts an impostor as a genuine user (the FAR) is expected to be 0.001. However, under face spoofing, the corresponding FAR increases to 0.10, while it jumps to about 0.70 under fingerprint spoofing. In other words, an impostor who submits a perfect replica of the face of a client has a probability of 10% of being accepted as genuine, while this probability is as high as 70% if she is able to perfectly replicate the fingerprint. This is unacceptable for security applications, and provides further support to the conclusions of [1], [2] against the common belief that multimodal biometric systems are intrinsically more robust than unimodal systems.

4.3 Network intrusion detection

Intrusion detection systems (IDSs) analyze network traffic to prevent and detect malicious activities like intrusion attempts, port scans, and denial-of-service attacks.¹¹ When suspected malicious traffic is detected, an alarm is raised by the IDS and subsequently handled by the system administrator. Two main kinds of IDSs exist: misuse detectors and anomaly-based ones. Misuse detectors match the analyzed network traffic against a database of signatures of known malicious activities (e.g., Snort).¹² The main drawback is that they are not able to detect never-before-seen malicious activities, or even variants of known ones. To overcome this issue, anomaly-based detectors have been proposed. They build a statistical model of the normal traffic using machine learning techniques, usually one-class classifiers (e.g., PAYL [49]), and raise an alarm when anomalous traffic is detected. Their training set is constructed, and periodically updated to follow the changes of normal traffic, by collecting unsupervised network traffic during operation, assuming that it is normal (it can be filtered by a misuse detector, and should be discarded if some system malfunctioning occurs during its collection). This kind of IDS is vulnerable to causative attacks, since an attacker may inject carefully designed malicious traffic during the collection of training samples to force the IDS to learn a wrong model of the normal traffic [11], [17], [29], [30], [36].

11. The term "attack" is used in this field to denote a malicious activity, even when there is no deliberate attempt to mislead an IDS. In adversarial classification, as in this paper, this term is instead used specifically to denote the attempt to mislead a classifier. To avoid any confusion, in the following we will refrain from using the term "attack" with the former meaning, using paraphrases instead.

12. http://www.snort.org/


Here we assume that an anomaly-based IDS is being designed, using a one-class ν-SVM classifier with a radial basis function (RBF) kernel and the feature vector representation proposed in [49]. Each network packet is considered as an individual sample to be labeled as normal (legitimate) or anomalous (malicious), and is represented as a 256-dimensional feature vector, defined as the histogram of byte frequencies in its payload (this is known as the "1-gram" representation in the IDS literature). We then focus on the model selection stage. In the above setting, it amounts to choosing the values of the ν parameter of the learning algorithm (which is an upper bound on the false positive error rate on training data [50]) and the γ value of the RBF kernel. For the sake of simplicity, we assume that ν is set to 0.01, as suggested in [24], so that only γ has to be chosen.
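A minimal sketch of this setup is given below, assuming the 1-gram feature is the normalized 256-bin histogram of payload byte values and using scikit-learn's one-class SVM; the toy payloads and variable names are ours, and the actual experiments rely on the data and implementation described in the text.

    import numpy as np
    from sklearn.svm import OneClassSVM

    def one_gram(payload: bytes) -> np.ndarray:
        """256-dimensional '1-gram' feature vector: normalized histogram of the
        byte values occurring in the packet payload."""
        counts = np.bincount(np.frombuffer(payload, dtype=np.uint8), minlength=256)
        return counts / max(len(payload), 1)

    # One row per (legitimate) training packet; toy payloads for illustration only.
    X_train = np.stack([one_gram(p) for p in
                        (b"GET /index.html HTTP/1.1", b"GET /images/logo.png HTTP/1.1")])

    # One-class nu-SVM with RBF kernel; nu = 0.01 as in the text, gamma to be selected.
    clf = OneClassSVM(kernel="rbf", nu=0.01, gamma=0.5).fit(X_train)
    # clf.predict(X) returns +1 for packets deemed normal and -1 for anomalous ones.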

We show how the IDS designer can select a model (the value of γ) based also on the evaluation of classifier security. We focus on a causative attack similar to the ones considered in [38], aimed at forcing the learned model of normal traffic to include samples of intrusions to be attempted during operation. To this end, the attack samples should be carefully designed such that they include some features of the desired intrusive traffic, but do not perform any real intrusion (otherwise, the collected traffic may be discarded, as explained above).

1) Attack scenario. Goal. This attack aims to cause an indiscriminate integrity violation by maximizing the fraction of malicious testing samples misclassified as legitimate.

Knowledge. The adversary is assumed to know: (k.ii) the feature set; and (k.iii) that a one-class classifier is used. No knowledge of (k.i) the training data or (k.iv) the classifier's parameters is available to the adversary, nor is (k.v) any feedback from the classifier.

Capability. The attack consists of injecting malicious samples into the training set. Accordingly, we assume that: (c.i) the adversary can inject malicious samples into the training data, without manipulating testing data (causative attack); (c.ii) she can modify the class priors by injecting a maximum fraction p_max of malicious samples into the training data; (c.iii) all the injected malicious samples can be manipulated; and (c.iv) the adversary is able to completely control the feature values of the malicious attack samples. As in [17], [28]–[30], we repeat the security evaluation for p_max ∈ [0, 0.5], since it is unrealistic that the adversary can control the majority of the training data.

Attack strategy. The goal of maximizing the percentage of malicious testing samples misclassified as legitimate, for any p_max value, can be attained by manipulating the training data with the following attack strategy: (a.i) the maximum percentage of malicious samples p_max will be injected into the training data; (a.ii) all the injected malicious samples will be attack samples, while no legitimate samples will be affected; and (a.iii) the feature values of the attack samples will be equal to those of the malicious testing samples. To understand the latter assumption, note that: first, the best case for the adversary is when the attack samples exhibit the same histogram of payload byte values as the malicious testing samples, as this intuitively forces the model of the normal traffic to be as similar as possible to the distribution of the testing malicious samples; and, second, this does not imply that the attack samples perform any intrusion, which allows them to avoid detection. In fact, one may construct an attack packet which exhibits the same feature values as an intrusive network packet (i.e., a malicious testing sample), but loses its intrusive functionality, by simply shuffling its payload bytes. Accordingly, for our purposes, we directly set the feature values of the attack samples equal to those of the malicious testing samples, without constructing real network packets.
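The byte-shuffling argument can be verified directly: permuting the payload changes the packet (plausibly destroying the exploit) but leaves the byte histogram, and hence the 1-gram feature vector, unchanged. A toy check, reusing the hypothetical one_gram() helper from the previous sketch:

    import numpy as np

    rng = np.random.default_rng(42)

    # A fictitious intrusive payload and a byte-shuffled copy of it.
    intrusive = b"GET /cgi-bin/vuln.cgi?cmd=%2Fbin%2Fsh HTTP/1.0"
    shuffled = bytes(rng.permutation(np.frombuffer(intrusive, dtype=np.uint8)))

    # Identical byte frequencies, hence identical 1-gram feature vectors.
    assert np.allclose(one_gram(intrusive), one_gram(shuffled))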

2) Data model. Since testing data is not affected by the considered causative attack, we set p_ts(X, Y) = p_D(X, Y). We then encode the attack strategy as follows. According to (a.ii), p_tr(A = T|Y = L) = 0 and p_tr(A = T|Y = M) = 1. The former assumption implies that p_tr(X|Y = L) = p_D(X|Y = L). Since the training data of one-class classifiers does not contain any malicious samples (see the explanation at the beginning of this section), i.e., p_tr(A = F|Y = M) = 0, the latter assumption implies that the class prior p_tr(Y = M) corresponds exactly to the fraction of attack samples in the training set. Therefore, assumption (a.i) amounts to setting p_tr(Y = M) = p_max. Finally, according to (a.iii), the distribution of attack samples p_tr(X|Y = M, A = T) will be defined as the empirical distribution of the malicious testing samples, namely, D_TR^{M,T} = D_TS^{M,F}. The distribution p_tr(X|Y = L, A = F) = p_D(X|Y = L) will also be empirically defined, as the set D_TR^{L,F}.

The definition of the attack scenario and data model is summarized in Table 1 (third column).

3) Construction of TR and TS. We use the data set of [24]. It consists of 1,699,822 legitimate samples (network packets) collected by a web server during five days in 2006, and a publicly available set of 205 malicious samples coming from intrusions which exploit the HTTP protocol [51].¹³ To construct TR and TS, we take into account the chronological order of network packets, as in Sect. 4.1, and the fact that in this application all the malicious samples in D must be inserted in the testing set [27], [51]. Accordingly, we set D_TR as the first 20,000 legitimate packets of day one, and D_TS as the first 20,000 legitimate samples of day two, plus all the malicious samples. Since this attack does not affect testing samples, we set TS = D_TS. We then set D_TR^{L,F} = D_TR, and D_TR^{M,F} = D_TR^{L,T} = ∅. Since the feature vectors of the attack samples are identical to those of the malicious samples, as mentioned above, we set D_TR^{M,T} = D_TS^{M,F} (where the latter set clearly includes the malicious samples in D_TS). The size of TR is initially set to 20,000. We then consider different attack scenarios by increasing the number of attack samples in the training data, up to 40,000 samples in total, which corresponds to p_max = 0.5.

13. http://www.i-pi.com/HTTP-attacks-JoCN-2006/


Fig. 5. AUC10% as a function of p_max, for the ν-SVMs with RBF kernel. The AUC10% value for p_max = 0, corresponding to classical performance evaluation, is also reported in the legend between square brackets. The standard deviation (dashed lines) is reported only for γ = 0.5, since it was negligible for γ = 0.01.

For each value of p_max, TR is obtained by running Algorithm 1 with the proper inputs.
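For a given p_max, the poisoned training set can be assembled as in the sketch below, where X_legit holds the 20,000 legitimate training vectors and X_mal the feature vectors of the malicious testing samples; the function name and the choice of sampling the attack points with replacement are our own assumptions about how Algorithm 1 is instantiated here.

    import numpy as np

    def poisoned_training_set(X_legit, X_mal, p_max, rng):
        """Append attack samples (copies of malicious testing vectors) to the
        legitimate training data so that they make up a fraction p_max of TR."""
        n_legit = len(X_legit)
        n_attack = int(round(p_max / (1.0 - p_max) * n_legit))  # 20,000 when p_max = 0.5
        idx = rng.integers(len(X_mal), size=n_attack)            # sample with replacement
        return np.vstack([X_legit, X_mal[idx]])

    # Example sweep over the evaluated range of p_max:
    # for p_max in np.linspace(0.0, 0.5, 11):
    #     TR = poisoned_training_set(X_legit, X_mal, p_max, np.random.default_rng(0))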

4) Performance evaluation. Classifier performance is assessed using the AUC10% measure, for the same reasons as in Sect. 4.1. The performance under attack is evaluated as a function of p_max, as in [17], [28], which reduces to the classical performance evaluation when p_max = 0. For the sake of simplicity, we consider only two values of the parameter γ, which clearly point out how design choices based only on classical performance evaluation methods can be unsuitable for adversarial environments.
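For reference, the sketch below computes such a truncated AUC from the ROC curve, assuming AUC10% denotes the area under the ROC curve restricted to false positive rates up to 10% and normalized to [0, 1]; the exact definition and normalization used in the experiments are the ones given in Sect. 4.1, and the interpolation below is only an approximation of the empirical step-function ROC.

    import numpy as np
    from sklearn.metrics import auc, roc_curve

    def auc_10(y_true, scores, max_fpr=0.1):
        """Area under the ROC curve for FPR in [0, max_fpr], normalized by max_fpr.
        y_true uses 1 for malicious (positive) and 0 for legitimate samples;
        scores must be higher for more anomalous samples (e.g., the negated
        decision function of a one-class SVM)."""
        fpr, tpr, _ = roc_curve(y_true, scores)
        grid = np.linspace(0.0, max_fpr, 1000)
        tpr_grid = np.interp(grid, fpr, tpr)   # piecewise-linear ROC on [0, max_fpr]
        return auc(grid, tpr_grid) / max_fpr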

The results are reported in Fig. 5. In the absence of attacks (p_max = 0), the choice γ = 0.5 appears slightly better than γ = 0.01. Under attack, the performance for γ = 0.01 remains almost identical to the one without attack, and starts decreasing very slightly only when the percentage of attack samples in the training set exceeds 30%. On the contrary, for γ = 0.5 the performance suddenly drops as p_max increases, becoming lower than the one for γ = 0.01 when p_max is as small as about 1%. The reason for this behavior is the following. The attack samples can be considered outliers with respect to the legitimate training samples, and, for large γ values of the RBF kernel, the SVM discriminant function tends to overfit, forming a "peak" around each individual training sample. Thus, it exhibits relatively high values also in the region of the feature space where the attack samples lie, and this allows many of the corresponding testing intrusions to be misclassified as legitimate. Conversely, this is not true for lower γ values, where the higher spread of the RBF kernel leads to a smoother discriminant function, which exhibits much lower values for the attack samples.

According to the above results, the choice of γ = 0.5, suggested by classical performance evaluation for p_max = 0, is clearly unsuitable from the viewpoint of classifier security under the considered attack, unless it is deemed unrealistic that an attacker can inject more than 1% of attack samples into the training set. To summarize, the designer should select γ = 0.01 and trade a small decrease of classification accuracy in the absence of attacks for a significant security improvement.

5 SECURE DESIGN CYCLE: NEXT STEPS

The classical design cycle of a pattern classifier [9] consists of: data collection, data pre-processing, feature extraction and selection, model selection (including the choice of the learning and classification algorithms, and the tuning of their parameters), and performance evaluation. We pointed out that this design cycle disregards the threats that may arise in adversarial settings, and extended the performance evaluation step to such settings. Revising the remaining steps under a security viewpoint remains a very interesting issue for future work. Here we briefly outline how this open issue can be addressed.

If the adversary is assumed to have some control over the data collected for classifier training and parameter tuning, a filtering step to detect and remove attack samples should also be performed (see, e.g., the data sanitization method of [27]).

Feature extraction algorithms should be designed to be robust to sample manipulation. Alternatively, features which are more difficult to manipulate should be used. For instance, in [52] inexact string matching was proposed to counteract word obfuscation attacks in spam filtering. In biometric recognition, it is very common to use additional input features to detect the presence ("liveness detection") of attack samples coming from spoofing attacks, i.e., fake biometric traits [53]. The adversary could also undermine feature selection, e.g., to force the choice of a set of features that are easier to manipulate, or that are not discriminant enough with respect to future attacks. Therefore, feature selection algorithms should be designed by taking into account not only the discriminant capability, but also the robustness of features to adversarial manipulation.

Model selection is clearly the design step that is most subject to attacks. The selected algorithms should be robust to causative and exploratory attacks. In particular, robust learning algorithms should be adopted, if no data sanitization can be performed. The use of robust statistics has already been proposed to this aim; in particular, to devise learning algorithms robust to a limited amount of data contamination [14], [29], and classification algorithms robust to specific exploratory attacks [6], [23], [26], [32].

Finally, a secure system should also guarantee the privacy of its users against attacks aimed at stealing confidential information [14]. For instance, privacy-preserving methods have been proposed in biometric recognition systems to protect users against so-called hill-climbing attacks, whose goal is to get information about the users' biometric traits [41], [53]. Randomization of some classifier parameters has also been proposed to preserve privacy in [14], [54].


6 CONTRIBUTIONS, LIMITATIONS AND OPEN ISSUES

In this paper we focused on empirical security evaluation of pattern classifiers that have to be deployed in adversarial environments, and proposed how to revise the classical performance evaluation design step, which is not suitable for this purpose.

Our main contribution is a framework for empirical security evaluation that formalizes and generalizes ideas from previous work, and can be applied to different classifiers, learning algorithms, and classification tasks. It is grounded on a formal model of the adversary, and on a model of data distribution that can represent all the attacks considered in previous work; provides a systematic method for the generation of training and testing sets that enables security evaluation; and can accommodate application-specific techniques for attack simulation. This is a clear advancement with respect to previous work, since without a general framework most of the proposed techniques (often tailored to a given classifier model, attack, and application) could not be directly applied to other problems.

An intrinsic limitation of our work is that security evaluation is carried out empirically, and it is thus data-dependent; on the other hand, model-driven analyses [12], [17], [38] require a full analytical model of the problem and of the adversary's behavior, which may be very difficult to develop for real-world applications. Another intrinsic limitation is due to the fact that our method is not application-specific and, therefore, provides only high-level guidelines for simulating attacks. Indeed, detailed guidelines require one to take into account application-specific constraints and adversary models. Our future work will be devoted to developing techniques for simulating attacks for different applications.

Although the design of secure classifiers is a problem distinct from security evaluation, our framework could also be exploited to this end. For instance, simulated attack samples can be included into the training data to improve the security of discriminative classifiers (e.g., SVMs), while the proposed data model can be exploited to design more secure generative classifiers. We obtained encouraging preliminary results on this topic [32].

ACKNOWLEDGMENTS

The authors are grateful to Davide Ariu, Gavin Brown, Pavel Laskov, and Blaine Nelson for discussions and comments on an earlier version of this paper. This work was partly supported by a grant awarded to Battista Biggio by Regione Autonoma della Sardegna, PO Sardegna FSE 2007-2013, L.R. 7/2007 "Promotion of the scientific research and technological innovation in Sardinia", by the project CRP-18293 funded by Regione Autonoma della Sardegna, L.R. 7/2007, Bando 2009, and by the TABULA RASA project, funded within the 7th Framework Research Programme of the European Union.

REFERENCES

[1] R. N. Rodrigues, L. L. Ling, and V. Govindaraju, "Robustness of multimodal biometric fusion methods against spoof attacks," J. Vis. Lang. Comput., vol. 20, no. 3, pp. 169–179, 2009.

[2] P. Johnson, B. Tan, and S. Schuckers, "Multimodal fusion vulnerability to non-zero effort (spoof) imposters," in IEEE Int'l Workshop on Inf. Forensics and Security, 2010, pp. 1–5.

[3] P. Fogla, M. Sharif, R. Perdisci, O. Kolesnikov, and W. Lee, "Polymorphic blending attacks," in Proc. 15th Conf. on USENIX Security Symp. CA, USA: USENIX Association, 2006.

[4] G. L. Wittel and S. F. Wu, "On attacking statistical spam filters," in 1st Conf. on Email and Anti-Spam, CA, USA, 2004.

[5] D. Lowd and C. Meek, "Good word attacks on statistical spam filters," in 2nd Conf. on Email and Anti-Spam, CA, USA, 2005.

[6] A. Kolcz and C. H. Teo, "Feature weighting for improved classifier robustness," in 6th Conf. on Email and Anti-Spam, CA, USA, 2009.

[7] D. B. Skillicorn, "Adversarial knowledge discovery," IEEE Intell. Syst., vol. 24, pp. 54–61, 2009.

[8] D. Fetterly, "Adversarial information retrieval: The manipulation of web content," ACM Computing Reviews, 2007.

[9] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification. Wiley-Interscience, 2000.

[10] N. Dalvi, P. Domingos, Mausam, S. Sanghai, and D. Verma, "Adversarial classification," in 10th ACM SIGKDD Int'l Conf. on Knowl. Discovery and Data Mining, WA, USA, 2004, pp. 99–108.

[11] M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar, "Can machine learning be secure?" in Proc. Symp. Inf., Computer and Commun. Sec. (ASIACCS). NY, USA: ACM, 2006, pp. 16–25.

[12] A. A. Cardenas and J. S. Baras, "Evaluation of classifiers: Practical considerations for security applications," in AAAI Workshop on Evaluation Methods for Machine Learning, MA, USA, 2006.

[13] P. Laskov and R. Lippmann, "Machine learning in adversarial environments," Machine Learning, vol. 81, pp. 115–119, 2010.

[14] L. Huang, A. D. Joseph, B. Nelson, B. Rubinstein, and J. D. Tygar, "Adversarial machine learning," in 4th ACM Workshop on Artificial Intelligence and Security, IL, USA, 2011, pp. 43–57.

[15] M. Barreno, B. Nelson, A. Joseph, and J. Tygar, "The security of machine learning," Machine Learning, vol. 81, pp. 121–148, 2010.

[16] D. Lowd and C. Meek, "Adversarial learning," in Proc. 11th ACM SIGKDD Int'l Conf. on Knowl. Discovery and Data Mining, IL, USA, 2005, pp. 641–647.

[17] P. Laskov and M. Kloft, "A framework for quantitative security analysis of machine learning," in Proc. 2nd ACM Workshop on Security and Artificial Intelligence. NY, USA: ACM, 2009, pp. 1–4.

[18] P. Laskov and R. Lippmann, Eds., NIPS Workshop on Machine Learning in Adversarial Environments for Computer Security, 2007. [Online]. Available: http://mls-nips07.first.fraunhofer.de/

[19] A. D. Joseph, P. Laskov, F. Roli, and D. Tygar, Eds., Dagstuhl Perspectives Workshop on Mach. Learning Methods for Computer Sec., 2012. [Online]. Available: http://www.dagstuhl.de/12371/

[20] A. M. Narasimhamurthy and L. I. Kuncheva, "A framework for generating data to simulate changing environments," in Artificial Intell. and Applications. IASTED/ACTA Press, 2007, pp. 415–420.

[21] S. Rizzi, "What-if analysis," Enc. of Database Systems, pp. 3525–3529, 2009.

[22] J. Newsome, B. Karp, and D. Song, "Paragraph: Thwarting signature learning by training maliciously," in Recent Advances in Intrusion Detection, ser. LNCS. Springer, 2006, pp. 81–105.

[23] A. Globerson and S. T. Roweis, "Nightmare at test time: robust learning by feature deletion," in Proc. 23rd Int'l Conf. on Machine Learning. ACM, 2006, pp. 353–360.

[24] R. Perdisci, G. Gu, and W. Lee, "Using an ensemble of one-class SVM classifiers to harden payload-based anomaly detection systems," in Int'l Conf. Data Mining. IEEE CS, 2006, pp. 488–498.

[25] S. P. Chung and A. K. Mok, "Advanced allergy attacks: does a corpus really help," in Recent Advances in Intrusion Detection, ser. RAID '07. Berlin, Heidelberg: Springer-Verlag, 2007, pp. 236–255.

[26] Z. Jorgensen, Y. Zhou, and M. Inge, "A multiple instance learning strategy for combating good word attacks on spam filters," Journal of Machine Learning Research, vol. 9, pp. 1115–1146, 2008.

[27] G. F. Cretu, A. Stavrou, M. E. Locasto, S. J. Stolfo, and A. D. Keromytis, "Casting out demons: Sanitizing training data for anomaly sensors," in IEEE Symp. on Security and Privacy. CA, USA: IEEE CS, 2008, pp. 81–95.


[28] B. Nelson, M. Barreno, F. J. Chi, A. D. Joseph, B. I. P. Rubinstein, U. Saini, C. Sutton, J. D. Tygar, and K. Xia, "Exploiting machine learning to subvert your spam filter," in Proc. 1st Workshop on Large-Scale Exploits and Emergent Threats. CA, USA: USENIX Association, 2008, pp. 1–9.

[29] B. I. Rubinstein, B. Nelson, L. Huang, A. D. Joseph, S.-h. Lau, S. Rao, N. Taft, and J. D. Tygar, "Antidote: understanding and defending against poisoning of anomaly detectors," in Proc. 9th ACM SIGCOMM Internet Measurement Conf., ser. IMC '09. NY, USA: ACM, 2009, pp. 1–14.

[30] M. Kloft and P. Laskov, "Online anomaly detection under adversarial impact," in Proc. 13th Int'l Conf. on Artificial Intell. and Statistics, 2010, pp. 405–412.

[31] O. Dekel, O. Shamir, and L. Xiao, "Learning to classify with missing and corrupted features," Machine Learning, vol. 81, pp. 149–178, 2010.

[32] B. Biggio, G. Fumera, and F. Roli, "Design of robust classifiers for adversarial environments," in IEEE Int'l Conf. on Systems, Man, and Cybernetics, 2011, pp. 977–982.

[33] ——, "Multiple classifier systems for robust classifier design in adversarial environments," Int'l Journal of Machine Learning and Cybernetics, vol. 1, no. 1, pp. 27–41, 2010.

[34] B. Biggio, I. Corona, G. Fumera, G. Giacinto, and F. Roli, "Bagging classifiers for fighting poisoning attacks in adversarial environments," in Proc. 10th Int'l Workshop on Multiple Classifier Systems, ser. LNCS, vol. 6713. Springer-Verlag, 2011, pp. 350–359.

[35] B. Biggio, G. Fumera, F. Roli, and L. Didaci, "Poisoning adaptive biometric systems," in Structural, Syntactic, and Statistical Pattern Recognition, ser. LNCS, vol. 7626. Springer, 2012, pp. 417–425.

[36] B. Biggio, B. Nelson, and P. Laskov, "Poisoning attacks against support vector machines," in Proc. 29th Int'l Conf. on Machine Learning, 2012.

[37] M. Kearns and M. Li, "Learning in the presence of malicious errors," SIAM J. Comput., vol. 22, no. 4, pp. 807–837, 1993.

[38] A. A. Cardenas, J. S. Baras, and K. Seamon, "A framework for the evaluation of intrusion detection systems," in Proc. IEEE Symp. on Security and Privacy. DC, USA: IEEE CS, 2006, pp. 63–77.

[39] B. Biggio, G. Fumera, and F. Roli, "Multiple classifier systems for adversarial classification tasks," in Proc. 8th Int'l Workshop on Multiple Classifier Systems, ser. LNCS, vol. 5519. Springer, 2009, pp. 132–141.

[40] M. Bruckner, C. Kanzow, and T. Scheffer, "Static prediction games for adversarial learning problems," J. Mach. Learn. Res., vol. 13, pp. 2617–2654, 2012.

[41] A. Adler, "Vulnerabilities in biometric encryption systems," in 5th Int'l Conf. on Audio- and Video-Based Biometric Person Authentication, ser. LNCS, vol. 3546. NY, USA: Springer, 2005, pp. 1100–1109.

[42] B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.

[43] H. Drucker, D. Wu, and V. N. Vapnik, "Support vector machines for spam categorization," IEEE Trans. on Neural Networks, vol. 10, no. 5, pp. 1048–1054, 1999.

[44] F. Sebastiani, "Machine learning in automated text categorization," ACM Comput. Surv., vol. 34, pp. 1–47, 2002.

[45] C.-C. Chang and C.-J. Lin, "LibSVM: a library for support vector machines," 2001. [Online]. Available: http://www.csie.ntu.edu.tw/∼cjlin/libsvm/

[46] K. Nandakumar, Y. Chen, S. C. Dass, and A. Jain, "Likelihood ratio-based biometric score fusion," IEEE Trans. on Pattern Analysis and Machine Intell., vol. 30, pp. 342–347, February 2008.

[47] B. Biggio, Z. Akhtar, G. Fumera, G. Marcialis, and F. Roli, "Robustness of multi-modal biometric verification systems under realistic spoofing attacks," in Int'l Joint Conf. on Biometrics, 2011, pp. 1–6.

[48] B. Biggio, Z. Akhtar, G. Fumera, G. L. Marcialis, and F. Roli, "Security evaluation of biometric authentication systems under real spoofing attacks," IET Biometrics, vol. 1, no. 1, pp. 11–24, 2012.

[49] K. Wang and S. J. Stolfo, "Anomalous payload-based network intrusion detection," in RAID, ser. LNCS, vol. 3224. Springer, 2004, pp. 203–222.

[50] B. Scholkopf, A. J. Smola, R. C. Williamson, and P. L. Bartlett, "New support vector algorithms," Neural Comput., vol. 12, no. 5, pp. 1207–1245, 2000.

[51] K. Ingham and H. Inoue, "Comparing anomaly detection techniques for HTTP," in Recent Advances in Intrusion Detection, ser. LNCS. Springer, 2007, pp. 42–62.

[52] D. Sculley, G. Wachman, and C. E. Brodley, "Spam filtering using inexact string matching in explicit feature space with on-line linear classifiers," in 15th Text Retrieval Conf. NIST, 2006.

[53] S. Z. Li and A. K. Jain, Eds., Enc. of Biometrics. Springer US, 2009.

[54] B. Biggio, G. Fumera, and F. Roli, "Adversarial pattern classification using multiple classifiers and randomisation," in Structural, Syntactic, and Statistical Pattern Recognition, ser. LNCS, vol. 5342. FL, USA: Springer-Verlag, 2008, pp. 500–509.

Battista Biggio received the M.Sc. degree in Electronic Eng., with honors, and the Ph.D. in Electronic Eng. and Computer Science, respectively in 2006 and 2010, from the University of Cagliari, Italy. Since 2007 he has been working for the Dept. of Electrical and Electronic Eng. of the same University, where he now holds a postdoctoral position. From May 12th, 2011 to November 12th, 2011, he visited the University of Tübingen, Germany, and worked on the security of machine learning algorithms to contamination of training data. His research interests currently include: secure/robust machine learning and pattern recognition methods, multiple classifier systems, kernel methods, biometric authentication, spam filtering, and computer security. He serves as a reviewer for several international conferences and journals, including Pattern Recognition and Pattern Recognition Letters. Dr. Biggio is a member of the IEEE (Computer Society), and of the Group of Italian Researchers in Pattern Recognition (GIRPR), affiliated to the International Association for Pattern Recognition.

Giorgio Fumera received the M.Sc. degree in Electronic Eng., with honors, and the Ph.D. degree in Electronic Eng. and Computer Science, respectively in 1997 and 2002, from the University of Cagliari, Italy. Since February 2010 he has been Associate Professor of Computer Eng. at the Dept. of Electrical and Electronic Eng. of the same University. His research interests are related to methodologies and applications of statistical pattern recognition, and include multiple classifier systems, classification with the reject option, adversarial classification and document categorization. On these topics he has published more than sixty papers in international journals and conferences. He acts as a reviewer for the main international journals in this field, including IEEE Trans. on Pattern Analysis and Machine Intelligence, Journal of Machine Learning Research, and Pattern Recognition. Dr. Fumera is a member of the IEEE (Computer Society), of the Italian Association for Artificial Intelligence (AI*IA), and of the Group of Italian Researchers in Pattern Recognition (GIRPR), affiliated to the International Association for Pattern Recognition.

Fabio Roli received his M.Sc. degree, with honors, and Ph.D. degree in Electronic Eng. from the University of Genoa, Italy. He was a member of the research group on Image Processing and Understanding of the University of Genoa, Italy, from 1988 to 1994. He was adjunct professor at the University of Trento, Italy, in 1993 and 1994. In 1995, he joined the Dept. of Electrical and Electronic Eng. of the University of Cagliari, Italy, where he is now professor of computer engineering and head of the research group on pattern recognition and applications. His research activity is focused on the design of pattern recognition systems and their applications to biometric personal identification, multimedia text categorization, and computer security. On these topics, he has published more than two hundred papers in conferences and journals. He was a very active organizer of international conferences and workshops, and established the popular workshop series on multiple classifier systems. Dr. Roli is a member of the governing boards of the International Association for Pattern Recognition and of the IEEE Systems, Man and Cybernetics Society. He is a Fellow of the IEEE, and a Fellow of the International Association for Pattern Recognition.