Selecting Observations against Adversarial Objectives

Andreas Krause, Brendan McMahan, Carlos Guestrin, Anupam Gupta (Carnegie Mellon)

TRANSCRIPT

Page 1: Selecting Observations against Adversarial Objectives

Carnegie Mellon

Selecting Observations against Adversarial Objectives

Andreas Krause, Brendan McMahan, Carlos Guestrin, Anupam Gupta

Page 2: Selecting Observations against Adversarial Objectives

Observation selection problems

Place sensors for building automation

Monitor rivers, lakes using robots

Detect contaminations in water networks

Set V of possible observations (sensor locations, …). Want to pick a subset A* ⊆ V such that

For most interesting utilities F, NP-hard!

A* = argmax_{|A| ≤ k} F(A)

Page 3: Selecting Observations against Adversarial Objectives

Key observation: Diminishing returns

Placement A = {S1, S2}
Placement B = {S1, …, S5}

Formalization: Submodularity. For A ⊆ B, F(A ∪ {S'}) − F(A) ≥ F(B ∪ {S'}) − F(B)

Adding S’ will help a lot! Adding S’ doesn’t

help muchNew sensor S’
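To make diminishing returns concrete, here is a minimal sketch (not from the talk; the coverage sets are made-up illustration data) that checks the submodularity inequality for a simple coverage objective:

# Minimal sketch (illustration only): diminishing returns for a coverage
# objective F(A) = number of locations covered by the sensors in A.
coverage = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5, 6}}  # made-up data

def F(A):
    """Coverage objective: how many locations the placement A covers."""
    covered = set()
    for s in A:
        covered |= coverage[s]
    return len(covered)

A = {"S1"}              # small placement A
B = {"S1", "S2"}        # larger placement B, with A ⊆ B
s_new = "S3"            # new sensor S'

gain_A = F(A | {s_new}) - F(A)   # marginal gain of S' given A
gain_B = F(B | {s_new}) - F(B)   # marginal gain of S' given B
assert gain_A >= gain_B          # submodularity: here 3 >= 2
print(gain_A, gain_B)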

Page 4: Selecting Observations against Adversarial Objectives

Submodularity [with Guestrin, Singh, Leskovec, VanBriesen, Faloutsos, Glance]

We prove submodularity for:
- Mutual information: F(A) = H(unobs) − H(unobs | A) [UAI '05, JMLR '07 (spatial prediction)]
- Outbreak detection: F(A) = impact reduction from sensing A [KDD '07 (water monitoring, …)]

Also submodular:
- Geometric coverage: F(A) = area covered
- Variance reduction: F(A) = Var(Y) − Var(Y | A)
- …

Page 5: Selecting Observations against Adversarial Objectives

Why is submodularity useful?

Theorem [Nemhauser et al. '78]: The greedy algorithm gives a constant-factor approximation: F(A_greedy) ≥ (1 − 1/e) F(A_opt), i.e. about 63% of optimal.

- Can get online (data-dependent) bounds for any algorithm
- Can significantly speed up the greedy algorithm
- Can use MIP / branch & bound for the optimal solution

Greedy algorithm (forward selection):

s_{j+1} = argmax_{s ∈ V \ A_j} F(A_j ∪ {s})
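As a reference point, a minimal sketch (my own illustration; F is assumed to be any monotone submodular set function given as a Python callable over sets) of the forward-selection rule above:

def greedy_select(V, F, k):
    """Greedy forward selection: s_{j+1} = argmax_{s in V \\ A_j} F(A_j ∪ {s}).
    For monotone submodular F this guarantees F(A) >= (1 - 1/e) * F(A_opt)."""
    A = set()
    for _ in range(min(k, len(V))):
        # Add the element with the largest marginal gain.
        A.add(max(set(V) - A, key=lambda s: F(A | {s})))
    return A

One standard way to obtain the speed-up mentioned above is lazy evaluation of marginal gains: keep a priority queue of (possibly stale) gains and, thanks to diminishing returns, only re-evaluate the current top candidate.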

Page 6: Selecting Observations against Adversarial Objectives

Robust observation selection

What if …
- … the parameters θ of the model P(X_V | θ) are unknown / change?
- … sensors fail?
- … an adversary selects the outbreak scenario?

[Figure: sensors at the best placement for the old parameters θ_old; under the new parameters θ_new there is more variability elsewhere ("attack here!").]

Page 7: Selecting Observations against Adversarial Objectives

Robust prediction

Instead: minimize the "width" of the confidence bands.
For every location s ∈ V, define F_s(A) = Var(s) − Var(s | A).
Minimizing the "width" ⟺ simultaneously maximizing all F_s(A).
Each F_s(A) is (often) submodular! [Das & Kempe '07]

Typical objective: minimize the average variance (MSE). But a placement with low average variance (MSE) can still have a high maximum variance (in the most interesting part!).

[Figure: confidence bands; x-axis: horizontal positions V; y-axis: pH value.]

Page 8: Selecting Observations against Adversarial Objectives


Adversarial observation selection

Given: possible observations V, submodular functions F_1, …, F_m

Want to solve

Can model many problems this way:
- Width of confidence bands: F_i is the variance reduction at location i
- Unknown parameters: F_i is the information gain under parameter setting θ_i
- Adversarial outbreak scenarios: F_i is the utility for scenario i
- …

Unfortunately, min_i F_i(A) is not submodular.

(One F_i for each location i.)

A* = argmax_{|A| ≤ k} min_i F_i(A)

Page 9: Selecting Observations against Adversarial Objectives

How does greedy do?

Set A    F1   F2   min_i F_i
{x}      1    0    0
{y}      0    2    0
{z}      ε    ε    ε
{x,y}    1    2    1    ← optimal solution (k = 2)
{x,z}    1    ε    ε
{y,z}    ε    2    ε

Greedy picks z first; then it can only choose x or y.

Theorem: The problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP.

Greedy does arbitrarily badly. Is there something better?
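A minimal sketch (illustration only; it instantiates the table by setting F_i(A) = max_{s ∈ A} w_i(s), a monotone submodular "best single sensor" score, with a concrete ε = 0.01) showing greedy getting stuck at value ε while {x, y} achieves 1:

eps = 0.01                               # concrete stand-in for the table's ε
w1 = {"x": 1.0, "y": 0.0, "z": eps}
w2 = {"x": 0.0, "y": 2.0, "z": eps}

def F1(A): return max((w1[s] for s in A), default=0.0)
def F2(A): return max((w2[s] for s in A), default=0.0)
def worst(A): return min(F1(A), F2(A))   # the adversarial objective min_i F_i

V, A, k = {"x", "y", "z"}, set(), 2
for _ in range(k):
    # Greedy on the (non-submodular) worst-case objective.
    A.add(max(V - A, key=lambda s: worst(A | {s})))

print(sorted(A), worst(A))       # greedy picks z first -> value 0.01
print(worst({"x", "y"}))         # optimal set for k = 2 -> value 1.0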

Page 10: Selecting Observations against Adversarial Objectives

Alternative formulation: If somebody told us the optimal value c*, could we recover the optimal solution A*?

Need to solve dual problem

Is this any easier?

Yes, if we relax the constraint |A| ≤ k

c* = max_{|A| ≤ k} min_i F_i(A)

A* = argmin_A |A| such that min_i F_i(A) ≥ c*

Page 11: Selecting Observations against Adversarial Objectives

Solving the alternative problem

Trick: For each F_i and c, define the truncation F'_i(A) = min{F_i(A), c}.

[Figure: F_i(A) and its truncation F'_i(A) as functions of |A|; F'_i is capped at the level c.]

Set F1 F2 F’1

F’2

F’avg,1 mini Fi

{x} 1 0 1 0 ½ 0{y} 0 2 0 1 ½ 0{z} {x,y}

1 2 1 1 1 1

{x,z}

1 1 (1+)/2

{y,z}

2 1 (1+)/2

min_i F_i(A) ≥ c  ⟺  F'_avg,c(A) = c

Lemma: F'_avg,c(A) is submodular!

F'_i(A) = min{F_i(A), c}

F'_avg,c(A) = (1/m) Σ_i F'_i(A)
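A minimal sketch (my own naming; assumes the F_i are monotone submodular functions given as Python callables over sets) of the truncated average objective from this slide:

def truncated_average(Fs, c):
    """Return F'_avg,c(A) = (1/m) * sum_i min{F_i(A), c}.
    Truncation keeps each F_i monotone submodular, and a nonnegative average
    of submodular functions is submodular, so F'_avg,c is submodular.
    Key property: min_i F_i(A) >= c  <=>  F'_avg,c(A) = c."""
    m = len(Fs)
    return lambda A: sum(min(F(A), c) for F in Fs) / m

For the example above with c = 1 and the F1, F2 from the earlier sketch, truncated_average([F1, F2], 1.0)({"x", "y"}) evaluates to 1, matching the F'_avg,1 column of the table.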

Page 12: Selecting Observations against Adversarial Objectives

Why is this useful? We can use the greedy algorithm to find an (approximate) solution!

Proposition: The greedy algorithm finds A_G with |A_G| ≤ αk and F'_avg,c(A_G) = c, where α = 1 + log max_s Σ_i F_i({s}).

Page 13: Selecting Observations against Adversarial Objectives

Back to our example

Guess c = 1: first pick x, then pick y.

Optimal solution!

How do we find c?

Set      F1   F2   min_i F_i   F'_avg,1
{x}      1    0    0           ½
{y}      0    2    0           ½
{z}      ε    ε    ε           ε
{x,y}    1    2    1           1
{x,z}    1    ε    ε           (1+ε)/2
{y,z}    ε    2    ε           (1+ε)/2

Page 14: Selecting Observations against Adversarial Objectives

Submodular Saturation Algorithm

Given: set V, integer k, and functions F_1, …, F_m

Initialize c_min = 0, c_max = min_i F_i(V)
Do binary search: c = (c_min + c_max)/2
- Use the greedy algorithm to find A_G such that F'_avg,c(A_G) = c
- If |A_G| > αk: decrease c_max (c too high)
- If |A_G| ≤ αk: increase c_min (c too low)
until convergence
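Putting the pieces together, here is a minimal sketch of the Saturate loop (illustrative; the helper names, the numeric tolerances, and the use of the αk test from the accompanying guarantee are my choices, not code from the talk). V is a set of hashable elements, and Fs is a list of monotone submodular callables:

import math

def saturate(V, Fs, k, tol=1e-6):
    """Sketch of the Submodular Saturation Algorithm: binary search on c,
    running greedy partial cover on the (submodular) truncated average F'_avg,c."""
    m = len(Fs)
    alpha = 1 + math.log(max(sum(F({s}) for F in Fs) for s in V))
    c_min, c_max = 0.0, min(F(V) for F in Fs)
    A_best = set()
    while c_max - c_min > tol:
        c = (c_min + c_max) / 2
        def F_avg(A, c=c):
            return sum(min(F(A), c) for F in Fs) / m
        A = set()
        # Greedy partial cover: add elements until F'_avg,c(A) reaches c.
        while F_avg(A) < c - 1e-12 and len(A) < len(V):
            A.add(max(V - A, key=lambda s: F_avg(A | {s})))
        if len(A) > alpha * k:
            c_max = c                 # c too high: needs too many observations
        else:
            c_min, A_best = c, A      # c achievable within the alpha*k budget
    return A_best

On the toy example from the earlier sketch, saturate({"x", "y", "z"}, [F1, F2], 2) returns {"x", "y"}, the optimal robust placement.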

Page 15: Selecting Observations against Adversarial Objectives

Theoretical guarantees

Theorem: Saturate finds a solution A_S such that min_i F_i(A_S) ≥ OPT_k and |A_S| ≤ αk, where OPT_k = max_{|A| ≤ k} min_i F_i(A) and α = 1 + log max_s Σ_i F_i({s}).

Theorem: If there were a polytime algorithm with a better constant β < α, then NP ⊆ DTIME(n^{log log n}).

Theorem: The problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP.

Page 16: Selecting Observations against Adversarial Objectives

Experiments:
- Minimizing maximum variance in GP regression
- Robust biological experimental design
- Outbreak detection against adversarial contaminations

Goals:
- Compare against the state of the art
- Analyze appropriateness of the "worst-case" assumption

Page 17: Selecting Observations against Adversarial Objectives

Spatial prediction

Compare to the state of the art [Sacks et al. '88, Wiens '05, …]: highly tuned simulated annealing heuristics (7 parameters).

Saturate is competitive & faster, and better on larger problems.

[Figures (environmental monitoring; precipitation data): maximum marginal variance vs. number of sensors for Greedy, Simulated Annealing, and Saturate; lower is better.]

Page 18: Selecting Observations against Adversarial Objectives

Maximum vs. average variance

Minimizing the worst-case leads to good average-case score, not vice versa

[Figures (environmental monitoring; precipitation data): marginal variance vs. number of sensors; curves show the maximum and average variance of designs optimized for the average (Greedy) and for the maximum (Saturate); lower is better.]

Page 19: Selecting Observations against Adversarial Objectives

Outbreak detection

Results even more prominent on water network monitoring (12,527 nodes)

[Figures (water networks): detection time in minutes vs. number of sensors; maximum and average detection time for Greedy, Simulated Annealing, and Saturate; lower is better.]

Page 20: Selecting Observations against Adversarial Objectives

Robust experimental design

Learn the parameters θ of a nonlinear function y_i = f(x_i, θ) + w.
Choose stimuli x_i to facilitate maximum-likelihood estimation of θ. Difficult optimization problem!

Common approach: linearization!
y_i ≈ f(x_i, θ_0) + ∇f_{θ_0}(x_i)^T (θ − θ_0) + w
Allows a nice closed-form (fractional) solution!

How should we choose θ_0?
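For context, a short sketch of why the linearized model admits a closed form (standard experimental-design algebra under an i.i.d. Gaussian noise assumption; this is my summary, not copied from the talk): with a fractional design w over the stimuli,

Cov(θ̂) ≈ σ² ( Σ_i w_i ∇f_{θ_0}(x_i) ∇f_{θ_0}(x_i)^T )^{−1}

A-optimality minimizes tr Cov(θ̂); E-optimality minimizes λ_max Cov(θ̂). Both depend on θ_0 only through the gradients ∇f_{θ_0}(x_i), which is exactly why the choice of θ_0 matters.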

Page 21: Selecting Observations against Adversarial Objectives

Robust experimental design

State of the art [Flaherty et al., NIPS '06]:
- Assume a perturbation of the Jacobian ∇f_{θ_0}(x_i)
- Solve a robust SDP against the worst-case perturbation
- Minimize the maximum eigenvalue of the estimation error (E-optimality)

This paper:
- Assume a perturbation of the initial parameter estimate θ_0
- Use Saturate to perform well against all initial parameter estimates
- Minimize the MSE of the parameter estimate (Bayesian A-optimality, typically submodular!)

Page 22: Selecting Observations against Adversarial Objectives

Experimental setup:
- Estimate the parameters of a Michaelis-Menten model (to compare results)
- Evaluate the efficiency of designs: the loss of the optimal design (knowing the true parameter θ_true) relative to the loss of the robust design (assuming a wrong initial parameter θ_0):

efficiency ≡ λ_max[Cov(θ̂ | θ_true, w_opt(θ_true))] / λ_max[Cov(θ̂ | θ_true, w_ρ(θ_0))]

Page 23: Selecting Observations against Adversarial Objectives

Robust design results

Saturate more efficient than SDP if optimizing for high parameter uncertainty

[Figures: efficiency (w.r.t. E-optimality) vs. initial parameter estimate θ_0,2, under low and high uncertainty in θ_0; comparing the classical E-optimal design, SDP (ρ = 10⁻³ and ρ = 16.3), and Saturate; the true θ_2 is marked; higher is better.]

Page 24: Selecting Observations against Adversarial Objectives

Future (current) work:
- Incorporating complex constraints (communication, etc.)
- Dealing with large numbers of objectives (constraint generation)
- Improved guarantees for certain objectives (sensor failures)
- Trading off worst-case and average-case scores

[Figure: adversarial score vs. expected score trade-off curves for k = 5, 10, 15, 20.]

Page 25: Selecting Observations against Adversarial Objectives

Conclusions:
- Many observation selection problems require optimizing an adversarially chosen submodular function:
  A* = argmax_{|A| ≤ k} min_i F_i(A)
- The problem is not approximable to any factor!
- Presented an efficient algorithm: Saturate
  - Achieves the optimal score, with a bounded increase in cost
  - The guarantees are the best possible under reasonable complexity assumptions
- Saturate performs well on real-world problems
  - Outperforms state-of-the-art simulated annealing algorithms for sensor placement, with no parameters to tune
  - Compares favorably with SDP-based solutions for robust experimental design