

This article was downloaded by: [128.125.124.17] On: 14 August 2015, At: 11:59
Publisher: Institute for Operations Research and the Management Sciences (INFORMS)
INFORMS is located in Maryland, USA

Operations Research

Publication details, including instructions for authors and subscription information: http://pubsonline.informs.org

Expert Elicitation of Adversary Preferences Using Ordinal Judgments
Chen Wang, Vicki M. Bier

To cite this article: Chen Wang, Vicki M. Bier (2013) Expert Elicitation of Adversary Preferences Using Ordinal Judgments. Operations Research 61(2):372–385. http://dx.doi.org/10.1287/opre.2013.1159

Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions

This article may be used only for the purposes of research, teaching, and/or private study. Commercial use or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher approval, unless otherwise noted. For more information, contact [email protected].

The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitness for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or support of claims made of that product, publication, or service.

Copyright © 2013, INFORMS

Please scroll down for article—it is on subsequent pages

INFORMS is the largest professional society in the world for professionals in the fields of operations research, management science, and analytics. For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org


OPERATIONS RESEARCH
Vol. 61, No. 2, March–April 2013, pp. 372–385
ISSN 0030-364X (print) | ISSN 1526-5463 (online)
http://dx.doi.org/10.1287/opre.2013.1159

© 2013 INFORMS

Expert Elicitation of Adversary Preferences Using Ordinal Judgments

Chen Wang, Vicki M. Bier
Department of Industrial and Systems Engineering, University of Wisconsin–Madison, Madison, Wisconsin 53706

{[email protected], [email protected]}

We introduce a simple elicitation process in which subject-matter experts provide only ordinal judgments of the attractiveness of potential targets, and the adversary utility of each target is assumed to involve multiple attributes. Probability distributions over the various attribute weights are then mathematically derived (using either probabilistic inversion or Bayesian density estimation). This elicitation process reduces the burden of time-consuming orientation and training in traditional methods of attribute-weight elicitation, and explicitly captures the existing uncertainty and disagreement among experts, rather than attempting to achieve consensus by eliminating them. We identify the relationship between the two methods and conduct sensitivity analysis to elucidate how these methods handle expert consensus or disagreement. We also present a real-world application on elicitation of adversarial preferences over various attack scenarios to show the applicability of our proposed methods.

Subject classifications: expert elicitation; adversarial preference; ordinal judgment; probabilistic inversion; Bayesian density estimation.

Area of review: Decision Analysis.
History: Received March 2012; revision received September 2012; accepted December 2012.

1. Introduction

Many adversarial decision-making problems, based on either probabilistic risk assessment or game-theoretic models, require the quantification of adversary objectives. This is a difficult task for numerous reasons. First of all, terrorist attacks (at least those of greatest concern) have occurred relatively rarely, and pure statistical analysis of this issue is far from ready for use in practice. Some exploratory work exists, including Enders and Sandler (2000), Barros and Proença (2005), and Mohtadi and Murshid (2009), but none of these studies explicitly attempts to use historical data to quantify adversary utility functions. It is therefore necessary to turn to subject-matter experts (e.g., intelligence experts, policy makers, and security observers) for such inputs.

However, many intelligence experts are not quantitatively trained and may be reluctant to express their knowledge in probabilistic form. Furthermore, they often place great weight on achieving consensus, which is not conducive to accurately characterizing the level of uncertainty that may exist. As a result, risk analysts sometimes have unrealistic expectations concerning the ability and willingness of intelligence experts to provide quantitative risk estimates (Baker et al. 2009). The objective of this paper is to bridge this gap by providing a simple expert-elicitation process in which intelligence experts are asked to give only ordinal judgments (e.g., to rank the attractiveness of selected potential targets or attack strategies), but these ordinal rankings can be used to mathematically derive cardinal estimates for modeling adversary preferences.

An adversary’s objective could be a univariate function, such as maximizing the dollar-equivalent damage from an attack (e.g., Bier and Abhichandani 2003, Bier et al. 2005). However, there is reason to believe that more complex multivariate measures of target attractiveness will be more realistic (Paté-Cornell and Guikema 2002, Beitel et al. 2004, Rosoff and John 2009, Bier et al. 2013), taking into account both the resources required for an attack, and the return on those investments (e.g., fatalities, property damage, and the symbolic values of the targets). One crucial elicitation task is then to use expert judgments to derive estimates for the various adversary attribute weights.

In this paper, we focus on two mathematical methods that can infer probability distributions over the various adversary attribute weights from the experts’ ordinal judgments: probabilistic inversion (Cooke 1994, Bedford and Cooke 2001, Kraan and Bedford 2005, Kurowicka et al. 2010, Neslo et al. 2011); and Bayesian density estimation (Ferguson 1973, 1974, 1983; Escobar and West 1995). In particular, we extend the work on probabilistic inversion by Neslo et al. to include an unobserved attribute that is not known to the defender but may be important to the adversary, which solves the infeasibility problem encountered in many previous applications of probabilistic inversion.

We also extend Bayesian density estimation to the caseof ordinal data in a rigorous manner. Erkanli et al. (1993)


apply Bayesian density estimation to estimation using ordinal inputs. However, they first convert the rank orderings to cardinal values (a process that may introduce additional information and biases), and then treat those cardinal values as if they were independent (even though in fact they cannot be, since the underlying ordinal rankings are of course not independent). By contrast, our use of Bayesian density estimation avoids these pitfalls by treating the entire set of rank orderings from a given expert as a single observation (thus inherently accounting for the lack of independence of rank orderings), and using the rank orderings directly (rather than converting them into cardinal values first).

Although the motivation for our work was the need for methods of estimating adversary preferences from ordinal data, our work also makes methodological contributions to the field of expert elicitation in general, especially through the use of unobserved attributes to ensure the feasibility of probabilistic inversion, and by elucidating the relationship between probabilistic inversion and Bayesian density estimation. Moreover, our work can also be applied to other multiattribute decision-making problems, e.g., as an alternative to conjoint analysis in marketing, which is widely used to quantify how customers value different product features based on their ordinal preferences over products (Shocker and Srinivasan 1979, Green and Srinivasan 1990).

The rest of this paper is structured as follows. We first review the literature on elicitation of attribute weights in §2, and provide some background on the two methods explored in this paper. We then introduce our basic model in §3, and discuss the use of probabilistic inversion (PI) and Bayesian density estimation (BDE) to elicit adversary preferences using ordinal judgments in §§4 and 5, respectively, followed by a discussion of the relationship between the two methods in §6. We also present a real-world application of PI to elicitation of adversarial preferences in §7. Then, §8 exhibits how PI and BDE handle unobserved attributes. We provide sensitivity analysis on how PI and BDE behave in the face of expert consensus or disagreement in §9. Finally, §10 concludes the paper and describes some directions for future work.

2. Literature Review

Traditional methods for direct elicitation of the various attribute weights in decision analysis include the ratio method (Edwards 1977), the swing-weighting method (von Winterfeldt and Edwards 1986), and the trade-off and pricing-out methods (Keeney and Raiffa 1976). However, as noted by Edwards (1977), direct elicitation methods are often expensive and time consuming. Moreover, assessing uncertainty over the attribute weights would require the estimation of subjective probability distributions. Although this approach has a long history of successful application (Edwards 1961, Cooke 1991, Hora and Jensen 2002), it generally requires extensive training and orientation, especially for elicitees with relatively nonquantitative backgrounds (Rosoff and John 2009).

Providing rankings rather than precise cardinal assessments is widely believed to be easier and more reliable (see, for example, Eckenrode 1965). Therefore, Edwards and Barron (1994) developed a simple rank-based weighting scheme, SMARTER, to reduce the elicitation burden by asking for only rank orderings of the attribute weights. This method has been found to perform reasonably well in a wide variety of cases (Barron and Barrett 1996), but yields only point estimates of attribute weights. A related method by Abbas (2004, 2006) uses maximum entropy to convert rank orderings of alternatives into utility assessments, but does not provide explicit estimates of attribute weights, and therefore cannot be applied to additional alternatives that have not been ranked.

Conjoint analysis in marketing is another example of a method that asks respondents only for ordinal judgments. Here, surveyed customers compare products in a factorial design; logistic regression is then used to estimate the relative importance of each product attribute (Shocker and Srinivasan 1979, Green and Srinivasan 1990). Similarly, contingent valuation has been used for eliciting public opinion about nonmarket items, such as environmental goods. This method typically asks dichotomous questions, such as “Are you willing to pay $X to keep the status quo unchanged?” and uses logistic regression to estimate the relative importance of various environmental attributes (Hanemann 1984, McFadden 1994). Unfortunately, like SMARTER, the above ranking-based elicitation methods yield only point estimates for the attribute weights. Mixed-logit models have been developed to incorporate uncertainty into conjoint analysis, but such models generally assume that the covariate coefficients (i.e., attribute weights) follow the normal distribution (Revelt and Train 1998, McFadden and Train 2000, Sándor and Wedel 2002).

In this paper, we investigate two methods, probabilistic inversion and Bayesian density estimation, to generate probability distributions over the attribute weights instead of just point estimates, but without distributional assumptions such as normality.

The goal of probabilistic inversion is to find a probability distribution over the input quantities of interest (e.g., attribute weights, in our case) that can reproduce the stated (theoretical or empirical) marginal distributions over the model outputs (e.g., experts’ rank orderings of target attractiveness); see Cooke (1994), and Kraan and Bedford (2005). The idea of using probabilistic inversion to elicit attribute weights from ordinal judgments comes from Neslo et al. (2011). However, because Neslo et al. do not include unobserved attributes in their model, their approach frequently yields no feasible solution, suggesting that the available set of attribute(s) is not adequate to explain the given expert judgments. By contrast, we explicitly account for any unobserved attributes, which should in theory make it possible to obtain a perfect match between the distribution over uncertain adversary preferences and the empirical distribution of expert rankings.


It is worth noting that the logic of probabilistic inversion is analogous to earlier work by Kadane et al. (1980), who elicited subjective conjugate distributions for the covariate coefficients in a multiple linear regression model using quantile estimates of the response variable. However, we believe that probabilistic inversion can be applied to a broader range of problems, because it does not require the use of conjugate priors.

Bayesian density estimation (Ferguson 1973, 1974) allows the decision maker (e.g., the defender, in our case) to assign prior probability distributions to the quantities of interest, and use observations of these quantities to update the prior distributions. In addition, the defender can also specify a degree of reliance on his or her own judgment, with higher reliance on the defender’s prior knowledge corresponding to less trust in the data (i.e., expert judgments). Unlike in traditional Bayesian density estimation (e.g., Ferguson 1983, Escobar and West 1995), however, our data consist of expert rank orderings, so a particular rank ordering corresponds not to a single point in the parameter space, but rather to a truncated region of it. Note also that the posterior distributions in our case are often too complicated to be expressed in closed form, but can be easily simulated using Gibbs sampling (Geman and Geman 1984, Gelfand et al. 1992).

3. Basic Model

In this paper, we assume that the adversary’s target valuations are represented by a multiattribute utility function, which may in particular include unobserved attributes that are important to the adversary, but have not been identified by the defender (Jenelius et al. 2010, Wang and Bier 2011). For simplicity, we assume that the adversary’s utility is linear in each of the various attributes, and these attributes are additively independent of each other. In particular, the adversary’s target valuation U_n for a particular target n (n = 1, …, N) is given by

U_n = Σ_{m=1}^{M−1} a_nm W_m + Y_n W_M,    (1)

where
N = number of potential targets (or attack strategies);
M = number of adversary attributes;
a_nm = adversary utility of target n on attribute m (m = 1, …, M − 1), where a_nm takes on values in [0, 1], with 1 representing the best possible value of the mth attribute and 0 representing the worst possible value;
W_m = weight on attribute m, where W_m ≥ 0 for m = 1, …, M, and Σ_{m=1}^{M} W_m = 1; and
Y_n = utility of the unobserved attribute for target n, also taking on values in [0, 1].

The attribute weights, W = (W_1, …, W_M), and the utilities of the unobserved attribute for the various targets, Y = (Y_1, …, Y_N), are assumed to be uncertain (as they will be to the defender). We use lower-case letters w and y for realizations of the vector random variables W and Y, respectively. Note that the Y_n are introduced to represent the effects of any additional attributes that are unobserved by the defender, but could nonetheless be important to the adversary. There may of course be dependence among the Y_n. However, because we generally do not know what the unobserved attributes are, we assume a priori that the Y_n are independent and identically distributed.

Let Ω be the space of all possible values of (W, Y), as given by

Ω = Δ_M(1) × [0, 1]^N,    (2)

where Δ_M(1) is the simplex defined by {w ∈ ℝ^M_+ | Σ_{m=1}^{M} w_m = 1}. The task of expert elicitation is then to mathematically derive probability distributions over Ω that can match the rank orderings of target valuations U_n provided by the experts.

Note that the linear-additive form in (1) is often adopted as a reasonable approximation of the true preference function; see, for example, Gigerenzer and Todd (1999). In fact, our methods can also be extended in a straightforward way to accommodate multilinear utility functions (Keeney and Raiffa 1976). However, they are not directly applicable to more complex preference structures such as utility dependence (e.g., Keeney and Raiffa 1976, Abbas and Bell 2011).
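To make the model concrete, here is a minimal numerical sketch of the valuation in Equation (1); the function name and the attribute scores are hypothetical, chosen only for illustration:

```python
import numpy as np

def target_valuations(a, w, y):
    """Linear-additive adversary valuations U_n from Equation (1).

    a: (N, M-1) array of observed attribute utilities a_nm in [0, 1]
    w: (M,) weight vector on the simplex; w[-1] = W_M is the weight
       on the unobserved attribute
    y: (N,) utilities Y_n of the unobserved attribute, in [0, 1]
    """
    return a @ w[:-1] + y * w[-1]

# Hypothetical example: 3 targets, 2 observed attributes + 1 unobserved.
a = np.array([[0.9, 0.2],
              [0.5, 0.8],
              [0.1, 0.6]])
w = np.array([0.5, 0.3, 0.2])   # weights sum to 1
y = np.array([0.4, 0.1, 0.9])
u = target_valuations(a, w, y)  # one valuation U_n per target
```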

4. Probabilistic Inversion

In this section, we first formalize the method used by Neslo et al. (2011) to convert ordinal judgments into cardinal estimates using probabilistic inversion (PI), and extend that method to include unobserved attributes that are not known to the defender, but may be important to the adversary. We then present a Monte Carlo-based approximation of the PI problem, and present two possible approaches to solving it.

4.1. Mathematical Basis of PI

Suppose that we ask K experts to rank the top R out of N targets based on their attractiveness to the adversary, no ties among targets being allowed. Note that when R = N, experts are asked to give a complete rank ordering of all targets. (The methods presented in this paper can be extended in a straightforward manner to the case where the experts also provide rankings for some number R′ of the least attractive targets.) We then specify an R-by-N empirical distribution matrix of expert rankings, P, where element P_rn represents the probability that target n is ranked in the rth place by a randomly chosen one of the K experts (as in Neslo et al. 2011). For example, suppose that three experts are asked to compare two targets. One of these experts thinks that target 1 is more attractive to the adversary, whereas the other two experts both think that target 2 is more attractive. In this case, the empirical distribution matrix for the three experts is

P = [ 1/3  2/3
      2/3  1/3 ].


Note that because ties are not allowed, each expert can rank exactly one target in the rth place, so the row sums of P satisfy Σ_{n=1}^{N} P_rn = 1 for all r = 1, …, R.
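The construction of P can be sketched as follows, reproducing the three-expert, two-target example above; the function name and the 0-indexed target labels are our own conventions, not from the paper:

```python
import numpy as np

def empirical_ranking_matrix(rankings, n_targets):
    """Build the R-by-N matrix P, where P[r, n] is the fraction of
    experts who rank target n (0-indexed) in the (r+1)th place."""
    R = len(rankings[0])
    P = np.zeros((R, n_targets))
    for ranking in rankings:
        for r, n in enumerate(ranking):
            P[r, n] += 1.0 / len(rankings)
    return P

# Three-expert, two-target example from the text: expert 1 prefers
# target 1 (label 0); experts 2 and 3 prefer target 2 (label 1).
rankings = [(0, 1), (1, 0), (1, 0)]
P = empirical_ranking_matrix(rankings, 2)
assert np.allclose(P.sum(axis=1), 1)  # row sums are 1 (no ties)
```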

To illustrate the PI approach, we consider the uncertain adversary parameters (W, Y) as “input,” and treat the expert rank orderings as an “output” that depends on the values of (W, Y). PI aims to find the distribution Q over Ω, the space of all possible values of (W, Y), that can match the empirical distribution matrix of expert rankings P, and has the smallest Kullback-Leibler (K-L) divergence to a predetermined (e.g., noninformative) starting probability measure, Q0. The use of K-L divergence to measure the closeness of two probability distributions in PI is recommended by Cooke (1994). In particular, the optimization problem is given by

min_Q ∫_Ω (dQ/dQ0) ln(dQ/dQ0) dQ0

s.t. ∫_Ω J_rn(w, y) dQ = P_rn, for r = 1, …, R; n = 1, …, N,    (3)

where J(w, y) is an R-by-N indicator matrix. For a given set of attribute weights w and utilities y of the unobserved attribute, J_rn(w, y) equals 1 if target n ranks in the rth place and 0 otherwise. In addition, dQ/dQ0 is the Radon-Nikodym derivative of the probability measure Q with respect to the starting measure Q0 (Seppäläinen 2010). Note that to have finite K-L divergence in (3), we must assume that Q is absolutely continuous with respect to Q0. Moreover, we use the convention that (dQ/dQ0) ln(dQ/dQ0) = 0 whenever dQ/dQ0 = 0, following Csiszár (1975).

We could choose the starting measure Q0 to be noninformative if the defender had little or no prior knowledge about the adversary preferences before any expert judgment became available. An easy choice for Q0 in that case would be to adopt a “flat” starting measure, i.e., to assign equal probability to every possible value in the adversary parameter space Ω. In particular, the attribute weights W would be assumed to follow the Dirichlet distribution with all parameters equal to 1; moreover, the utilities of the unobserved attribute for the various targets Y would be independently uniformly distributed in [0, 1], and also independent of the attribute weights W. If desired, of course, we could also consider other types of noninformative starting measures (e.g., U-shaped instead of uniform), or an informative starting measure.
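Sampling from this flat starting measure can be sketched as follows, assuming NumPy’s Dirichlet and uniform generators (the function name is hypothetical):

```python
import numpy as np

def sample_q0(n_samples, M, N, rng=None):
    """Draw (w, y) samples from the flat starting measure Q0:
    W ~ Dirichlet(1, ..., 1) on the M-simplex, and
    Y_1, ..., Y_N i.i.d. Uniform[0, 1], independent of W."""
    rng = np.random.default_rng(rng)
    w = rng.dirichlet(np.ones(M), size=n_samples)  # (n_samples, M)
    y = rng.uniform(size=(n_samples, N))           # (n_samples, N)
    return w, y

w, y = sample_q0(1000, M=3, N=5, rng=42)
assert np.allclose(w.sum(axis=1), 1)  # each weight vector is on the simplex
```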

4.2. Monte Carlo-Based Approximation

The probability distribution Q* that solves Equation (3), which is a convex optimization problem, has density satisfying

(dQ*/dQ0)(w, y) = c · exp{ −Σ_{r,n} λ_rn J_rn(w, y) },    (4)

where the λ_rn are Lagrange multipliers for the constraints in (3), and c is a normalizing constant. However, it is difficult to obtain the Lagrange multipliers λ_rn analytically in the important case of multiple experts. Therefore, we instead resort to Monte Carlo simulation. In particular, we randomly generate S independent samples for the adversary parameters (W, Y) from the starting measure Q0, and let Ω̄ be the set of all simulated values (w^(s), y^(s)), s = 1, …, S. We construct the discretized starting measure Q̄0 by placing equal mass on every element (w^(s), y^(s)) of Ω̄. The approximate PI problem is then to find a discrete distribution q = (q_1, …, q_S) over the elements of Ω̄ that yields the smallest K-L divergence from the discretized starting measure Q̄0, given that the mapping of q to the space of target rank orderings matches the empirical distribution matrix P. In particular, the approximate PI problem is given by

min_{q ∈ Δ_S(1)} Σ_{s=1}^{S} q_s ln(S·q_s)

s.t. Σ_{s=1}^{S} q_s J_rn(w^(s), y^(s)) = P_rn, for r = 1, …, R; n = 1, …, N,    (5)

where Δ_S(1) is the simplex defined by {q ∈ ℝ^S_+ | Σ_{s=1}^{S} q_s = 1}. Proposition 1 provides a sufficient condition to guarantee the feasibility of the approximate PI problem in (5).

Proposition 1. Suppose that a set of independent random samples (w^(s), y^(s)), s = 1, …, S, has been drawn from the starting measure Q0. If, for each expert, there exists an s such that (w^(s), y^(s)) yields the rank ordering of targets specified by that expert, then the optimization program in (5) is feasible and has a unique optimal solution.

Proof. See the online appendix (available as supplemental material at http://dx.doi.org/10.1287/opre.2013.1159). □

When the approximate PI problem in (5) is feasible, we may employ iterative proportional fitting (IPF; Csiszár 1975), which is a general algorithm to find the smallest K-L divergence between two discrete probability measures subject to linear constraints. In particular, the procedure begins with q_s = 1/S for s = 1, …, S, and iteratively adjusts q to satisfy exactly one of the linear equations in (5) at a time. If we have a sufficiently large number of samples S, then for each expert, we are ensured to have at least one sample (w^(s), y^(s)) that can match that expert’s rank ordering of targets. However, when it takes too many samples to ensure the feasibility of (5) (e.g., for large numbers of targets), we could also use the iterative PARFUM algorithm developed by Du et al. (2006) to get a probability distribution for (W, Y) corresponding to a marginal ranking distribution that is “close” to the empirical expert ranking matrix P.
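The IPF iteration described above can be sketched as follows. This is our own minimal implementation, not the authors’ code, and it assumes the indicator values J_rn(w^(s), y^(s)) have been precomputed for each Monte Carlo sample; each constraint is enforced in turn by proportionally rescaling the samples that rank target n in place r:

```python
import numpy as np

def ipf(J, P, n_iter=200):
    """Iterative proportional fitting for problem (5).

    J: (S, R, N) 0/1 array; J[s, r, n] = 1 if sample s puts target n
       in the rth place.
    P: (R, N) empirical ranking matrix to match.
    Returns the fitted distribution q over the S samples.
    """
    S, R, N = J.shape
    q = np.full(S, 1.0 / S)
    for _ in range(n_iter):
        for r in range(R):
            for n in range(N):
                mask = J[:, r, n] == 1
                c = q[mask].sum()
                # Rescale so the mass on {J_rn = 1} equals P[r, n],
                # and the complementary mass equals 1 - P[r, n].
                if 0 < c < 1:
                    q[mask] *= P[r, n] / c
                    q[~mask] *= (1 - P[r, n]) / (1 - c)
    return q

# Two-target example: 4 Monte Carlo samples, two possible orderings.
orders = [(0, 1), (0, 1), (1, 0), (1, 0)]
J = np.zeros((4, 2, 2))
for s, order in enumerate(orders):
    for r, n in enumerate(order):
        J[s, r, n] = 1
P = np.array([[1/3, 2/3], [2/3, 1/3]])
q = ipf(J, P)
# Mass 1/3 spreads over the samples ranking target 0 first,
# and 2/3 over those ranking target 1 first.
```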


5. Bayesian Density Estimation

We now discuss another elicitation method, Bayesian density estimation (BDE). First, we explain how BDE can be applied to our elicitation process. We then describe how Gibbs sampling could be used to obtain the (expected) posterior distribution over the parameters of the adversary’s utility function, given ordinal judgments of adversary preferences by experts.

5.1. Mathematical Basis of BDE

BDE allows the defender to specify a prior distribution Q_p over Ω, the space of adversary parameters (W, Y), and then treat expert judgments as observations to update that prior, leading to a posterior distribution Q. In particular, we assume that the defender’s prior Q_p is chosen in accordance with a Dirichlet process whose expectation is the starting measure Q0. This simplifies the Bayesian updating, because the posterior Q will still be a Dirichlet process, but with different parameters. See Ferguson (1973) for the definition of a Dirichlet process. Moreover, the defender can specify a self-trust degree α (α > 0) to reflect the level of reliance on his or her own knowledge about the adversary preferences, as opposed to the experts’.

Suppose again that we ask K experts to rank the top R out of N targets, no ties among targets being allowed. We can associate the (partial) rank ordering provided by expert k (k = 1, …, K) with a subset of Ω by excluding all values of (W, Y) that are inconsistent with that expert’s judgment. In particular, we denote the rank ordering of expert k by an ordered set of target indices RO^(k) = {n_k^1, …, n_k^R}, where n_k^r is the index of the rth most attractive target according to expert k, and define the active region AR^(k) for expert k as given by

AR^(k) = { (w, y) ∈ Ω such that
    U_{n_k^r}(w, y) ≥ U_{n_k^{r+1}}(w, y), r = 1, …, R − 1;
    U_{n_k^R}(w, y) ≥ U_n(w, y), for all n ∉ RO^(k) }.    (6)

We randomly sample an observation O^(k) ∈ AR^(k) ⊆ Ω under the starting measure Q0 to represent expert k’s rank ordering RO^(k). We then condition on O^(k) instead of RO^(k); in other words, we treat the rank ordering RO^(k) as if it were equivalent to a single random point O^(k) ∈ Ω that generates rank ordering RO^(k). The distribution Q0^(k) for random point O^(k) is therefore proportional to

Q0 · 1{(w, y) ∈ AR^(k)},    (7)

where 1{(w, y) ∈ AR^(k)} equals 1 if (w, y) ∈ AR^(k) and 0 otherwise; i.e., Q0^(k) is a truncated version of Q0 that puts nonzero mass on only those values (w, y) that are in the active region AR^(k). In what follows, we use the lower-case letter o^(k) to denote a realization of O^(k).

Suppose that the random observations O^(k) (k = 1, …, K) for the various experts are independent of each other, and also independent of the prior Dirichlet process Q_p. Then the posterior distribution Q conditional on the observations O^(k) (k = 1, …, K) is a mixture of Dirichlet processes (Antoniak 1974), as given by

Q | O^(1), …, O^(K) ~ ∫ ⋯ ∫ DP{ α·Q0 + Σ_{k=1}^{K} δ(o^(k)) } dQ0^(1) ⋯ dQ0^(K),

where δ(o^(k)) is the probability distribution giving unit mass to the point o^(k).

Moreover, the expectation E[Q] of the posterior distribution is a weighted sum of probability distributions, as given by

E[Q] = (α/(α + K))·Q0 + (1/(α + K))·Σ_{k=1}^{K} Q0^(k).    (8)

In traditional BDE, each of the Q0^(k) in (8) would reduce to a unit probability mass at a single point; however, in our application, Q0^(k) is instead a truncated version of the starting measure, maintaining nonzero mass over that portion of the domain consistent with the rank orderings given by expert k. Note that larger values of α correspond to higher emphasis on the defender’s prior guess Q0 and lower trust in the expert judgments. In particular, one can interpret this as the defender weighting his or her judgment as equivalent to the judgments of some number α of experts.

5.2. Gibbs Sampling for BDE

Clearly, (8) is a linear pool of K + 1 probability distributions; i.e.,

(W, Y) ~ Q0 with probability α/(α + K),
(W, Y) ~ Q0^(k) with probability 1/(α + K), for k = 1, …, K,

which can be simulated by drawing random samples from the starting measure Q0 with probability α/(α + K), and from each of its truncated variants Q0^(k) with probability 1/(α + K).

Note that for a given vector of utilities y of the unobserved attribute, the active region AR^(k) for expert k is generally a polyhedron of the attribute weights w (and vice versa). Therefore, we propose to use Gibbs sampling, which is simple to implement, and is popular in the field of Bayesian analysis with constrained parameters (Gelfand et al. 1992).

In particular, Gibbs sampling generates random sam-ples for the adversary parameters (W11 0 0 0 1WM ; Y11 0 0 0 1 YN )cyclically, from the univariate conditional distribution forone parameter at a time, while keeping the values of other

Downloaded from informs.org by [128.125.124.17] on 14 August 2015, at 11:59. For personal use only, all rights reserved.


Wang and Bier: Elicitation Using Ordinal Preference Rankings. Operations Research 61(2), pp. 372–385, © 2013 INFORMS. Page 377.

parameters fixed. It is generally much easier to sample from the univariate conditionals than from the joint distribution. For example, take the utility Y_n of the unobserved attribute for target n. With the values of all other parameters fixed, the possible values for Y_n consistent with the active region AR^(k) are constrained to a bounded interval. Sampling of the attribute weights W_m is similar, except that we need to take into account the constraint \(\sum_{m=1}^{M} W_m = 1\).

Gibbs sampling has been shown to produce random samples that converge in distribution to the target joint probability distribution (Tierney 1994). However, the convergence rate can be slow, and it is not always clear when to stop the procedure in practice. In this paper, we simply predefine a large number of iterations (10^6) and discard the first 10% of the samples as burn-in, to remove the influence of the starting point.
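The kind of constrained Gibbs sweep described above can be illustrated on a toy region (this is not the authors' sampler): drawing from the uniform distribution on the triangle {(x1, x2) ∈ [0, 1]² : x1 > x2}, where each coordinate's univariate conditional is uniform on a bounded interval, with a burn-in discard as in the text.

```python
import random

def gibbs_triangle(n_iter=10_000, burn_frac=0.1, seed=1):
    """Gibbs sampler for the uniform distribution on
    {(x1, x2) in [0,1]^2 : x1 > x2}.

    Each sweep resamples one coordinate at a time from its univariate
    conditional, which is uniform on a bounded interval determined by
    the current value of the other coordinate.
    """
    rng = random.Random(seed)
    x1, x2 = 0.9, 0.1            # feasible starting point
    samples = []
    for _ in range(n_iter):
        x1 = rng.uniform(x2, 1.0)   # conditional of x1 given x2: U(x2, 1)
        x2 = rng.uniform(0.0, x1)   # conditional of x2 given x1: U(0, x1)
        samples.append((x1, x2))
    # Discard the burn-in portion to reduce dependence on the start.
    return samples[int(burn_frac * n_iter):]
```

For this triangle the true means are E[x1] = 2/3 and E[x2] = 1/3, which the retained samples approximate.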

6. Relationship Between PI and BDE

In this section, we explore the relationship between PI and BDE when applied to ordinal preference rankings. In particular, PI exploits only marginal rank orderings, whereas BDE captures correlations among the judgments provided by the experts.

Most of our results assume zero weight on the defender's judgment in BDE. Note that this is done only for comparison purposes, because PI does not allow for an equivalent self-trust parameter. However, this does not mean that we would advocate putting zero weight on the defender's judgment in practice.

Hereafter, we choose the starting measure Q_0 to guarantee a feasible solution to (3).

Proposition 2. Assume that the defender's self-trust degree α → 0. If there is only one expert, or the experts all give the same rank ordering, then PI and BDE will yield the same probability distributions for all adversary parameters.

Proof. See the online appendix. □

However, if the experts give different rank orderings, PI and BDE generally produce different results, even if we assume that the defender's self-trust degree α → 0. To investigate this discrepancy, we define the "composition of expert rank orderings" as a vector of the proportions of experts giving each possible (partial or full) rank ordering. For example, if one expert thinks that the top three out of five targets are targets 1, 2, and 3 (in that order), whereas three other experts all rank targets 1, 4, and 5 as the top three (in that order), then the composition of expert rank orderings is 25% for the rank ordering 1 > 2 > 3 > {4, 5} and 75% for the rank ordering 1 > 4 > 5 > {2, 3}.

It is trivial to see that if multiple compositions of expert rank orderings yield the same empirical distribution matrix P, then PI will give the same results for all such compositions. By contrast, BDE can generate different results for different compositions of expert rank orderings, even when they correspond to the same empirical distribution matrix P. We show this by an example.

Table 1. Values of adversary attributes for Example 1.

            Attribute 1    Attribute 2
Target 1        1              0
Target 2        0              1
Target 3        0.5            0.5

Example 1. Suppose that two groups of experts are asked to give full rank orderings of three targets described by two known adversary attributes. Assumed attribute values and hypothetical expert rank orderings are given in Tables 1 and 2, respectively. We assume that the starting measure Q_0 is flat. For comparison purposes, we also let the defender's self-trust degree α → 0.

Figure 1 presents the resulting probability distributions for the first known attribute weight W_1 for both methods (PI and BDE) and both expert groups (A and B), along with the corresponding mean values. PI gives identical distributions for both groups (as shown in the upper panel of Figure 1), because they have identical empirical distribution matrices; i.e.,

\[ P_A = P_B = \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \end{pmatrix}. \]

By contrast, BDE provides different results for the two groups (as shown in the lower panel of Figure 1). Note that the first expert in group B gives a rank ordering of targets that is perfectly consistent with their values on attribute 1, reflecting a high weight on that attribute, and thus resulting in a broader distribution over W_1 for group B than for group A. The following proposition describes the fundamental relationship between PI and BDE.

Proposition 3. Assume that the defender's self-trust degree α → 0. Consider all possible compositions of expert rank orderings that can yield the given empirical distribution matrix P. If BDE gives multiple probability distributions for those compositions, then among those BDE distributions, the one that has the smallest K-L divergence from the starting measure Q_0 coincides with the result given by PI using the same Q_0.

Proof. See the online appendix. □

Table 2. Hypothetical expert rank orderings for Example 1.

Group      Expert    Rank ordering of targets
Group A      1         1 > 2 > 3
             2         2 > 3 > 1
             3         3 > 1 > 2
Group B      1         1 > 3 > 2
             2         2 > 1 > 3
             3         3 > 2 > 1
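A quick computation shows why PI cannot distinguish the two groups in Table 2. The sketch below assumes one common reading of the empirical distribution matrix P, namely that entry (n, r) is the fraction of experts placing target n at rank r (the paper's formal definition accompanies the PI problem (3)); under that reading, both groups induce the same P.

```python
from fractions import Fraction

def empirical_matrix(orderings, n_targets):
    """P[n][r] = fraction of experts who place target n+1 at rank r+1."""
    K = len(orderings)
    P = [[Fraction(0)] * n_targets for _ in range(n_targets)]
    for ordering in orderings:              # e.g. (1, 2, 3) means 1 > 2 > 3
        for rank, target in enumerate(ordering):
            P[target - 1][rank] += Fraction(1, K)
    return P

group_a = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]   # Table 2, group A
group_b = [(1, 3, 2), (2, 1, 3), (3, 2, 1)]   # Table 2, group B
```

Both groups produce the matrix with every entry equal to 1/3, so PI, which sees only P, must treat them identically, even though BDE does not.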


Figure 1. Elicited probability distributions for W_1 in Example 1. [Four histogram panels over W_1 ∈ [0, 1]: PI, Group A (mean 0.29); PI, Group B (mean 0.29); BDE, Group A (mean 0.25); BDE, Group B (mean 0.32).]

BDE essentially just uses a linear opinion pool based on the chosen starting measure to aggregate the probability distributions that are elicited from the individual experts. By contrast, PI is more complicated. If we find every possible composition of expert rank orderings that satisfies the given empirical distribution matrix P (of which there can be infinitely many!) and apply BDE to all of them, then PI will pick the resulting BDE probability distribution that is closest to the chosen starting measure over the adversary parameters (e.g., by choosing the maximum-entropy distribution if we adopt a flat starting measure). In Example 1, the distribution for W_1 generated by PI (see the upper panel of Figure 1) in fact coincides with the BDE result for a composition that matches the given P but differs from that of either group A or B: equal proportions of experts giving all 3! = 6 possible rank orderings of three targets. (In other words, with six experts, BDE and PI will give the same results if each expert gives a different rank ordering.)

Thus, we can see that PI exploits only marginal information about target rankings and fails to take into account correlations among subgroups of experts (e.g., whether those experts ranking target 1 higher than target 2 also tend to rank target 3 higher than target 4). By contrast, BDE is able to utilize that correlational information. However, one should note that using more information may not necessarily make BDE perform more sensibly if, for example, the results seem overly sensitive to minor changes in expert rank orderings.

We would ideally like an elicitation method that is sensitive to the absolute amount of information provided by experts. For example, with only a small number of experts, we may want a method that yields a flatter distribution than when a large number of experts is available. Unfortunately, neither PI nor BDE is able to explicitly capture that idea (at least if we set the defender's self-trust degree α → 0). However, when applying BDE, the defender could assign a relatively high self-trust degree (i.e., large α) to at least qualitatively account for the lack of reliability in expert judgments when only a small number of experts is available. We use the following example to illustrate this.

Example 2. Suppose that two experts are asked to give rank orderings of targets described by just one known adversary attribute. Suppose also that the elicited densities


Figure 2. BDE densities of W_1 for different levels of defender self-trust. [Six density panels over W_1 ∈ [0, 1]: Expert 1, Expert 2, α → 0, α = 2, α = 10, α = 20.]

for the known attribute weight (W_1) using the judgments of the two individual experts are given by Beta(20, 2) and Beta(2, 20), respectively. Figure 2 then shows the BDE densities under a flat starting measure, considering various levels of self-trust α for the defender. As α increases, the aggregated probability density places less reliance on the expert judgments and becomes less informative.
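For Example 2, Equation (8) reduces to a three-component linear pool whose density can be written down directly. A minimal sketch, assuming the flat starting measure contributes the constant density 1 on [0, 1] and the two expert components enter with weight 1/(α + 2) each:

```python
from math import gamma

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * x**(a - 1) * (1 - x)**(b - 1)

def pooled_density(x, alpha, expert_params=((20, 2), (2, 20))):
    """Linear opinion pool of a flat starting measure and K expert densities,
    with weights alpha/(alpha+K) and 1/(alpha+K) as in Equation (8)."""
    K = len(expert_params)
    prior_part = alpha / (alpha + K) * 1.0          # flat density on [0, 1]
    expert_part = sum(beta_pdf(x, a, b) for a, b in expert_params) / (alpha + K)
    return prior_part + expert_part
```

As α grows, the pooled density at any x approaches the flat density 1, reproducing the flattening visible in Figure 2.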

The computational complexity of BDE based on Gibbs sampling grows only linearly with the number of experts, the number of targets, and the number of uncertain adversary parameters. Therefore, one advantage of using BDE is that we can always anticipate obtaining a good solution within a controllable time constraint.

As for PI, if the experts do not deviate too much from the available set of known adversary attributes when giving their rank orderings, then the run time will be roughly equal for any number of experts. This favorable feature constitutes an advantage for PI when handling large numbers of experts. However, it sometimes requires too many samples from Q_0 to ensure a perfect match between the expert judgments and the distribution of rankings produced by the PI method. This is especially likely to occur if some expert judgments cannot possibly be explained by the known attributes. This makes the computational behavior of PI somewhat difficult to analyze, because it may vary from case to case.

7. Real-World Application

We have successfully applied probabilistic inversion to adversary preference elicitation in Center for Risk and Economic Analysis of Terrorism Events (CREATE) (2011). In that project, "proxy" experts (graduate students knowledgeable about terrorism, from countries where support for terrorism is relatively common) were asked to rank eight attack scenarios based on their attractiveness to the adversary, where attack scenarios are characterized by seven

known attributes plus an unknown attribute. Figure 3 shows how the expected utilities of the various scenarios differ depending on whether a proxy's judgments were elicited using PI, or by direct elicitation using the random utility method of Rosoff and John (2009). Both methods identified the same three least attractive scenarios (Pneumonic Plague, Dirty Bomb, and Blister Agent) and assigned relatively high utilities to another three scenarios (Chlorine Tank Explosion, Improvised Explosive Device, and Food Contamination). The only discrepancies are in Nerve Agent and Aerosol Anthrax (which were rated high using PI, but much lower using direct elicitation). However, we could take advantage of such discrepancies as input for convergent validation, e.g., by asking the proxy expert whether he puts more credence in his scenario rankings or his assessed attribute weights.

Moreover, the results in CREATE (2011) also suggest that applying PI to partial rather than full rank orderings can give reasonably reliable results. Figure 4 compares the results of PI using the complete set of eight scenario rankings given by one proxy expert, versus only four rankings (the top three most attractive scenarios and the least attractive scenario). The two sets of expected utilities are quite close. If similar results are obtained in more extensive studies, this would support the idea that attribute weights can be estimated by asking experts to rank only a modest subset of alternatives, with no need to rank all alternatives.

8. Treatment of Unobserved Attributes

The elicited weight for the unobserved attribute (from either PI or BDE) can be used as a measure of how well the given expert judgments can be explained by the assumed adversary attributes. In particular, for a given set of targets and their attribute values, the larger the weight we get for the unobserved attribute, the less capable the known attributes are of matching the expert opinions, and the more


Figure 3. Expected utilities of attack scenarios obtained by PI vs. direct elicitation. [Two bar panels of the attacker's expected utility (0.0–0.8), one for probabilistic inversion and one for the random utility method, over eight scenarios: Aerosol anthrax, Blister agent, Chlorine tank explosion, Dirty bomb, Food contamination, Improvised explosive device, Nerve agent, Pneumonic plague.]

we need to investigate the nature of any possible unobserved attributes.

One caveat of our model is that even if the rank orderings of targets are perfectly consistent with their values on the known attributes, we could still get a nonzero weight for the unobserved attribute. Nonetheless, the mean of that weight generally decays as the number of targets N grows, and may become arbitrarily small when N gets sufficiently large. To illustrate this, consider the following example.

Example 3. Suppose that an expert is asked to rank N targets described by just one known adversary attribute, where we assume a flat starting measure. Note that when there is only one expert, PI and BDE will yield the same result (if we set the defender's self-trust degree α → 0). We also assume that the target values on the known attribute (a_{·1}) form an arithmetic series with maximum 1 and minimum 0. For example, when we have N = 4 targets, the sorted attribute values are (a_{11}, a_{21}, a_{31}, a_{41}) = (1, 2/3, 1/3, 0). If the expert gives a rank ordering of targets that is perfectly consistent with their values on the known attribute, then the mean elicited weight E[W_2] for the unobserved attribute will be strictly decreasing in the number of targets N (at least for N ≤ 1,000). Moreover, E[W_2] gets

Figure 4. Expected utilities based on full vs. partial ranks. [Two bar panels of the attacker's expected utility (0.0–0.8), one based on full ranks and one on partial ranks, over the same eight attack scenarios as in Figure 3.]

arbitrarily close to zero as N → ∞ (see proof in the online appendix).
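The decay of E[W_2] in Example 3 can be reproduced with a crude rejection sampler. This is an illustrative sketch, not the paper's PI/BDE implementation: it assumes a linear utility w_1 a_{n1} + w_2 y_n with a flat prior over (w_2, y_1, …, y_N), w_1 = 1 − w_2, and keeps only draws whose implied utilities reproduce the perfectly consistent ordering.

```python
import random

def mean_w2(N, n_draws=200_000, seed=7):
    """Monte Carlo estimate of E[W2 | consistent ordering] for Example 3.

    One known attribute with arithmetic values 1, ..., 0 over N targets,
    one unobserved attribute with utilities y_n ~ U(0, 1), and weights
    (w1, w2) = (1 - w2, w2) with w2 ~ U(0, 1) a priori.
    """
    rng = random.Random(seed)
    a = [1 - n / (N - 1) for n in range(N)]      # sorted attribute values
    total, kept = 0.0, 0
    for _ in range(n_draws):
        w2 = rng.random()
        y = [rng.random() for _ in range(N)]
        u = [(1 - w2) * a[n] + w2 * y[n] for n in range(N)]
        # Accept only if the ordering matches the known attribute exactly.
        if all(u[n] > u[n + 1] for n in range(N - 1)):
            total += w2
            kept += 1
    return total / kept
```

Under these assumptions the estimates fall as N grows, mirroring the trend reported in Figure 5.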

However, it is important to know how quickly the weight for the unobserved attribute declines in practice. To investigate this, we now randomly generate 500 sets of values for the known attribute for each given number of targets N. Figure 5 reports the 90% confidence interval of E[W_2] as a function of N, assuming that the expert bases his judgment entirely on the simulated attribute values. As a rule of thumb, we may regard the weight for the unobserved attribute as being negligible in the case of perfect consistency when the number of targets is sufficiently large (e.g., N ≥ 15). (Of course, if the rank orderings of target attractiveness provided by experts cannot be well explained by the known adversary attributes, we will get a high weight for the unobserved attribute, no matter whether the judgments are based on a small or large number of targets.)

Note that assigning a moderate weight to the unobserved attribute in the case of perfect consistency for small N may not actually be a flaw of our method. In fact, a given set of known attributes that can well explain the rank orderings of 10 targets should essentially be more reliable than if those same attributes can explain the relative


Figure 5. Confidence interval for the mean elicited weight of the unobserved attribute in the case of perfect consistency. [E[W_2] (0.0–1.0) vs. number of targets N (5–20), showing the mean and 90% C.I.]

attractiveness of only 2 targets. In particular, when there are 2 targets to compare, the known attributes could easily give a perfect match just by coincidence, something that is less likely to happen for larger N. Therefore, the results of our method are conservative in the sense that they avoid placing too much weight on the known attributes when there are only a small number of targets whose rankings are being explained.

9. Sensitivity Analysis: Expert Consensus and Disagreement

In this section, we conduct sensitivity analysis to explore the behavior of both PI and BDE in the face of expert consensus or disagreement. In particular, we investigate the following questions: (1) If there are different "schools of thought" among experts (i.e., subgroups of experts who hold similar views), do the elicitation methods tend to generate multimodal probability distributions, or do they generate distributions that assign most of their mass in the middle between the different expert views? (2) Do the elicited probability distributions given by these methods adequately reflect the level of consensus or disagreement among the rank orderings given by the experts?

9.1. Tendency to Generate Multimodal Distributions

Hypothetically, one might speculate that whether PI and BDE will tend to produce multimodal distributions for the adversary parameters would depend on how far apart the differing expert views are from each other. Moreover, PI and BDE are anticipated to behave most differently when subsets of the ordinal judgments provided by the experts are correlated (e.g., experts ranking target 1 higher than target 2 would also rank target 3 higher than target 4, and vice versa), because in that case, PI will not be able to

Table 3. Values of adversary attributes (a_{nm}) for the sensitivity analysis.

            Attribute 1    Attribute 2
Target 1        1              0
Target 2        0              1
Target 3        0.75           0.25
Target 4        0.25           0.75
Target 5        0.55           0.45
Target 6        0.45           0.55

capture such correlations. We now conduct a Monte Carlo-based sensitivity analysis to test these hypotheses.

Consider a case where four experts are asked to rank six targets described by two known adversary attributes (with values of the known attributes given in Table 3). This choice of problem scale is complicated enough to illustrate the question of interest reasonably well, and yet is computationally inexpensive. In addition, we choose a flat starting measure for both PI and BDE.

We now randomly simulate expert rank orderings as input for our analysis. We first introduce a set of random variables u_{kn} ranging between 0 and 1 to reflect the utility of target n according to expert k (k = 1, …, 4; n = 1, …, 6), and then derive rank orderings from the randomly generated target utilities u_{kn}. In this way, we can induce a dependency structure among the u_{kn} by adopting the Gaussian copula with desired levels of pairwise correlations (Bier and Yi 1995, Clemen and Reilly 1999, Hora 2010). In particular, we assume that experts 1 and 2 and experts 3 and 4 form two different schools of thought, and set the Pearson correlation coefficients according to

\[ \mathrm{cor}[u_{1n}, u_{2n}] = \mathrm{cor}[u_{3n}, u_{4n}] = 0.95; \quad \text{and} \]
\[ \mathrm{cor}[u_{1n}, u_{3n}] = \mathrm{cor}[u_{1n}, u_{4n}] = \mathrm{cor}[u_{2n}, u_{3n}] = \mathrm{cor}[u_{2n}, u_{4n}] = \rho, \quad \text{for } n = 1, \ldots, 6, \]

where ρ ∈ (−1, 1) controls the "similarity" of the two differing schools of expert judgments. We also set the Pearson correlation coefficients between the utilities of targets 1 and 3 and of targets 2 and 4 for a given expert to both equal λ ∈ (0, 1); i.e.,

\[ \mathrm{cor}[u_{k1}, u_{k3}] = \mathrm{cor}[u_{k2}, u_{k4}] = \lambda, \quad \text{for } k = 1, \ldots, 4, \]

where higher λ means that experts who rank target 1 higher than 2 are likely to rank target 3 higher than 4, and vice versa. We can then construct a valid correlation matrix (i.e., symmetric and positive definite with all diagonal elements equal to 1) for the target utilities u_{kn} that satisfies both of the above conditions (correlations between experts and between targets).
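One way to complete the unconstrained entries of this correlation matrix is as the Kronecker product of a between-expert matrix A and a between-target matrix B, which is positive definite whenever A and B are. This completion is an assumption of the sketch below, not necessarily the authors' construction, as is the within-school correlation value of 0.95; the Cholesky test at the end is a standard positive-definiteness check.

```python
import math

def kron(A, B):
    """Kronecker product of two square matrices (lists of lists)."""
    na, nb = len(A), len(B)
    return [[A[i // nb][j // nb] * B[i % nb][j % nb]
             for j in range(na * nb)] for i in range(na * nb)]

def expert_target_corr(rho, lam, school=0.95):
    """24x24 correlation matrix for u_kn (4 experts x 6 targets),
    with index = 6*(k-1) + (n-1).

    A: between-expert correlations (pairs (1,2) and (3,4) share a school);
    B: between-target correlations (pairs (1,3) and (2,4) equal lam).
    """
    A = [[1.0, school, rho, rho],
         [school, 1.0, rho, rho],
         [rho, rho, 1.0, school],
         [rho, rho, school, 1.0]]
    B = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    for i, j in [(0, 2), (1, 3)]:          # targets (1,3) and (2,4)
        B[i][j] = B[j][i] = lam
    return kron(A, B)

def is_positive_definite(C):
    """Cholesky test: returns True iff C is symmetric positive definite."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = C[i][j] - sum(L[i][t] * L[j][t] for t in range(j))
            if i == j:
                if s <= 1e-12:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True
```

Note that the 4x4 expert block is a principal submatrix of any completion, so its smallest eigenvalue (1 + school + 2ρ for negative ρ) already bounds how extreme ρ can be.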

For each level of expert similarity ρ and target correlation λ, we randomly generate 500 sets of values for the u_{kn} and derive the rank orderings accordingly. Applying either PI or BDE, we obtain 500 elicited distributions for the first known attribute weight W_1, and count the occurrence


Figure 6. Proportions of multimodal distributions resulting from PI vs. BDE. [Two panels, (a) probabilistic inversion and (b) Bayesian density estimation: proportion of multimodal distributions (0–35%) vs. similarity of the two schools of thought ρ (−1.0 to 1.0), with curves for λ = 0.95, 0.5, and 0.05.]

of multimodal distributions. Figure 6 shows the proportions of multimodal distributions (out of 500) for different choices of ρ and λ.

As we expected, the elicited distributions are more likely to have multiple peaks when the two subgroups of expert judgments are farther apart (corresponding to more negative values of ρ), using either PI or BDE on a flat starting measure. However, there is a less than 30% chance of multimodal distributions using either PI or BDE, even when the two differing schools of expert views are almost the opposite of each other (ρ = −0.95).

Moreover, for a fixed level of expert similarity ρ, more highly correlated rank orderings (corresponding to larger values of λ) generally lead to more frequent occurrence of multimodal distributions. This effect is more pronounced for BDE than for PI, as expected. In general, BDE tends to give more multimodal distributions than PI. This tendency is especially significant when the target rankings given by each expert are highly correlated (i.e., λ = 0.95).

Whether we actually want to see a multimodal distribution from elicitation may depend on the problem under investigation. For example, we may prefer multimodal distributions when a large number of experts seem to form differing subgroups (in which similar views are held), because we might be fairly confident that any new expert would then give judgments that fall into some existing school of thought. By contrast, if a small number of experts disagree with each other, we may not want the elicited distributions to be too sensitive to differences between their judgments.

9.2. Expert Disagreement and Dispersion of Elicited Distributions

We now discuss another important issue. Ideally, probability distributions provided by a good elicitation method should adequately reflect the level of consensus or disagreement among the rank orderings given by experts. We therefore conduct another Monte Carlo-based sensitivity analysis to explore whether higher levels of disagreement between the experts lead to broader probability distributions for the attribute weights.

In particular, we consider a case where two experts are asked to give full rank orderings of 10 targets described by one known adversary attribute. We assume that the target values on the known attribute (a_{·1}) form an arithmetic series with maximum 1 and minimum 0. Target utilities u_{kn} according to the two experts (k = 1, 2; n = 1, …, 10) are then randomly generated, from which rank orderings are derived. Dependency between the judgments of the two experts is induced by setting the Pearson correlation coefficients between the target utilities as

\[ \mathrm{cor}[u_{1n}, u_{2n}] = \rho, \quad \text{for } n = 1, \ldots, 10, \]

where ρ ∈ (−1, 1) again controls the level of agreement between the two experts. The unrestricted correlations are then properly set to ensure that the correlation matrix for the u_{kn} is valid.

We use the normalized variance to measure the dispersion of the elicited distributions over the adversary attribute weights. Note that the normalized variance for a random variable X ∈ [0, 1] is defined as NV[X] = Var[X]/(E[X](1 − E[X])) (Bier and Yi 1995). In particular, we have Var[X] ≤ E[X](1 − E[X]) for X ∈ [0, 1], with equality achieved when E[X] = 0 or 1, so NV[X] gives the variance as a fraction of its maximum possible value.
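The definition above is a one-liner, and it has a convenient closed form for Beta distributions: for X ~ Beta(a, b), NV[X] = 1/(a + b + 1), since Var[X] = ab/((a+b)²(a+b+1)) and E[X](1 − E[X]) = ab/(a+b)². A minimal sketch:

```python
def normalized_variance(mean, var):
    """NV[X] = Var[X] / (E[X] (1 - E[X])) for a random variable on [0, 1]."""
    return var / (mean * (1.0 - mean))

def beta_moments(a, b):
    """Mean and variance of the Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var
```

For the Beta(20, 2) and Beta(2, 20) densities of Example 2, both give NV = 1/23, illustrating that NV measures dispersion independently of where the mass sits in [0, 1].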

We randomly generate 200 sets of target utilities u_{kn} that satisfy the correlation requirements, with expert rank orderings derived accordingly. This number of simulation runs seems reasonable, because the simulation errors for the quantity of interest (e.g., the normalized variance of the known attribute weight, NV[W_1]) are always less than ±5%. Figure 7 shows the average and the 90% confidence


Figure 7. Average and confidence interval of NV[W_1] resulting from PI vs. BDE. [Two panels, (a) probabilistic inversion and (b) Bayesian density estimation: normalized variance of W_1 (0.0–0.8) vs. expert similarity ρ (−1.0 to 1.0), showing the mean and 90% C.I.]

interval of the normalized variance NV[W_1] from both PI and BDE for varying levels of expert similarity ρ, again adopting a flat starting measure.

In general, both PI and BDE conform to the predicted trend of generating probability distributions with higher normalized variance for higher levels of disagreement between experts. However, this trend is less significant when the experts strongly disagree with each other (i.e., ρ < 0) than when they give similar judgments (i.e., ρ > 0). In other words, the breadth of the elicited probability distributions does decrease with expert similarity, but is less sensitive to the precise degree of similarity when the experts are highly likely to give opposite judgments.

10. Conclusions and Future Work

In this paper, we develop a simple elicitation process to generate probability distributions for uncertain adversary preferences using only rank orderings of target attractiveness provided by experts. To accomplish this task, we discuss two mathematical methods, probabilistic inversion and Bayesian density estimation. One novel feature of our work is the inclusion of unobserved attributes in PI, ensuring the existence of a feasible solution to the inversion problem. Other contributions include the application of BDE to ordinal data in a rigorous manner, and our elucidation of the relationship between PI and BDE.

PI exploits only marginal expert rankings. This feature makes PI suitable to use when expert judgments are less reliable (e.g., when using only a small number of experts representing a much larger population), in which case we may not want to put too much weight on the observed differences between experts. We could also use BDE in that case, by assigning a large weight to the defender's prior knowledge. However, when we wish to explicitly account for correlated rank orderings (e.g., for large numbers of experts), BDE will be more appropriate, because PI is not able to capture such correlations.

The elicitation methods in this paper have been developed under the assumption that experts give partial (or full) rank orderings of targets without ties. However, in practice, experts may find various targets equally attractive to the adversary, and thus give tied rank orderings for them. Therefore, there is a need to investigate how to allow for ties in expert rank orderings when applying both PI and BDE.

Moreover, experts may disagree on whether a higher value of a particular attribute would make a target more or less attractive to the adversary. This could be accommodated by extending the elicitation methods to allow for negative attribute weights, to take such conflicting expert opinions into account in an automated way.

In other work, we have shown that when the adversary attributes are highly correlated, the elicitation results can be unstable in the face of small changes in the attribute values. Thus, another future task is to study the effect of attribute collinearity on the instability of the elicited quantities. The effect of attribute collinearity here is analogous to that in a multiple regression model; i.e., when predictor variables in a regression model are highly correlated, the regression coefficients may change erratically in response to small changes in the data. Ideally, we hope to develop a type of "significance test" to investigate whether removing a particular attribute would significantly change the relative weights of the remaining attributes (and the importance of the unobserved attribute).

Overall, however, the simplicity and computational ease of our proposed elicitation process make it promising for elicitation problems that involve large numbers of elicitees with nonquantitative backgrounds. For example, the approach might be suitable for use in quantifying customer preferences in marketing, using online surveys.


Supplemental Material

Supplemental material to this paper is available at http://dx.doi.org/10.1287/opre.1120.1159.

Acknowledgments

The authors are thankful to the area editor of Operations Research (Kevin McCardle), an associate editor, and three anonymous referees for their insightful comments, which were helpful in improving this paper. This research was supported by the U.S. Department of Homeland Security through the National Center for Risk and Economic Analysis of Terrorism Events [Grant 2010-ST-061-RE0001]. However, any opinions, findings, and conclusions or recommendations in this document are those of the authors and do not necessarily reflect views of the U.S. Department of Homeland Security.




Chen Wang is a Ph.D. candidate in the Department of Industrial and Systems Engineering at the University of Wisconsin–Madison. She works as a research assistant in the Center for Human Performance and Risk Analysis under the supervision of Vicki M. Bier. Her research interests include applications of operations research, decision analysis, and expert elicitation to security problems.

Vicki M. Bier is a full professor in the Department of Industrial and Systems Engineering at the University of Wisconsin–Madison, where she is department chair and also directs the Center for Human Performance and Risk Analysis. Her current research interests focus on the application of decision analysis, risk analysis, game theory, and related methods to problems of security and critical infrastructure protection.
