Raga Gopalakrishnan University of Colorado at Boulder Sean D. Nixon (University of Vermont) Jason R. Marden (University of Colorado at Boulder) Stable Utility Design for Distributed Resource Allocation

Post on 25-Dec-2015

TRANSCRIPT

Page 1:

Stable Utility Design for Distributed Resource Allocation

Raga Gopalakrishnan (University of Colorado at Boulder)

Sean D. Nixon (University of Vermont)

Jason R. Marden (University of Colorado at Boulder)

Page 2:

Resource Allocation

Allocate agents to resources to optimize a system-level objective.

Example: Wireless Frequency Selection

[Figure: frequency selection among F1, F2, F3]

Page 3:

Resource Allocation

Allocate agents to resources to optimize a system-level objective.

Example: Wireless Access Point Assignment

Page 4:

Resource Allocation

Allocate agents to resources to optimize a system-level objective.

Example: Sensor Coverage

Page 5:

Distributed Resource Allocation

Allocate agents to resources to optimize a system-level objective.

Example: Sensor Coverage

Page 6:

Distributed Resource Allocation

Design local control policies for agents that result in desirable global behavior (convergence to an allocation that optimizes the system-level objective).

Example: Sensor Coverage

Page 7:

Distributed Resource Allocation

Design local control policies for agents that result in desirable global behavior (convergence to an allocation that optimizes the system-level objective).

Game Theoretic Control
• Model agents as players in a non-cooperative game
• Equilibria correspond to stable allocations
• Goal is to design the game such that equilibria
  • exist (stability)
  • are efficient
  • are easy to converge to

UTILITY DESIGN (static)

LEARNING DESIGN (dynamic)

Page 8:

Formal Model
• $N = \{1, \dots, n\}$ – set of agents
• $R$ – set of resources
• $A_i \subseteq 2^R$ – action set of player $i$
• $A = A_1 \times \cdots \times A_n$ – joint action (allocation) set
• $U_i : A \to \mathbb{R}$ – utility function of player $i$

A pure Nash equilibrium (PNE) is an action profile $a^* \in A$ such that for each player $i \in N$,

$$U_i(a_i^*, a_{-i}^*) = \max_{a_i \in A_i} U_i(a_i, a_{-i}^*).$$
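The equilibrium condition can be checked mechanically on small instances. The following is a minimal brute-force sketch, not part of the original slides: the function name `pure_nash_equilibria`, the callback signature `utility(i, profile)`, and the two-player toy game are all assumptions chosen for illustration.

```python
from itertools import product


def pure_nash_equilibria(action_sets, utility):
    """Enumerate all pure Nash equilibria of a finite game by brute force.

    action_sets[i] is the list of actions available to player i.
    utility(i, profile) returns the payoff of player i under the joint action.
    """
    equilibria = []
    for profile in product(*action_sets):
        stable = True
        for i, own_actions in enumerate(action_sets):
            current = utility(i, profile)
            # Player i must have no unilateral deviation that strictly improves.
            for alt in own_actions:
                deviated = profile[:i] + (alt,) + profile[i + 1:]
                if utility(i, deviated) > current:
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria


# Hypothetical toy game: each player picks "r1" or "r2"; a resource is worth 1
# and is split equally among the players that picked it.
def toy_utility(i, profile):
    sharers = sum(1 for a in profile if a == profile[i])
    return 1.0 / sharers


toy_actions = [["r1", "r2"], ["r1", "r2"]]
toy_pne = pure_nash_equilibria(toy_actions, toy_utility)
```

In this toy game the two profiles in which the players split up are the only equilibria. Enumeration is exponential in the number of players, so it is only a sanity check for small examples.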

Page 9:

Formal Model (continued)

The DESIGN must be “scalable”: independent of the specific problem instance (resources, action sets).

Page 10:

Formal Model (continued)
• $W : A \to \mathbb{R}$ – global objective function, or “welfare”
• Separability: $W(a) = \sum_{r \in R} W_r(\{a\}_r)$, where $\{a\}_r = \{i \in N : r \in a_i\}$ is the set of players that chose $r$
• $W_r$ – local “welfare” generated at resource $r$
• $f_r$ – local “distribution rule” at resource $r$
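A small sketch of how separable welfare and distribution rules fit together. It assumes that a player's utility is the sum of its shares $f_r(i, \{a\}_r)$ over the resources it selects; this is the usual formalization of the distribution-rule framing, but the formula itself did not survive in the transcript. The coverage-style welfare, the equal-share rule, and all names below are hypothetical.

```python
def players_at(resource, profile):
    """Return {a}_r: the set of players whose action contains `resource`."""
    return frozenset(i for i, action in enumerate(profile) if resource in action)


def welfare(profile, resources, local_welfare):
    """Separable welfare: W(a) = sum over r of W_r({a}_r)."""
    return sum(local_welfare(r, players_at(r, profile)) for r in resources)


def utility(i, profile, distribution_rule):
    """Assumed form: U_i(a) = sum over r in a_i of f_r(i, {a}_r)."""
    return sum(distribution_rule(r, i, players_at(r, profile)) for r in profile[i])


# Hypothetical instance: a resource yields a fixed value once anyone selects it.
VALUE = {"r1": 6.0, "r2": 2.0}


def coverage_welfare(r, S):
    return VALUE[r] if S else 0.0


def equal_share(r, i, S):
    # Illustrative budget-balanced rule: split W_r(S) equally among S.
    return coverage_welfare(r, S) / len(S)


profile = (frozenset({"r1"}), frozenset({"r1", "r2"}))
total = welfare(profile, VALUE, coverage_welfare)
shares = [utility(i, profile, equal_share) for i in range(len(profile))]
```

Here player 0 receives 3 and player 1 receives 5, which sums to the total welfare of 8. That is what budget balance means for a distribution rule.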

Page 12:

Example: Network formation

[Figure: network with nodes S1, S2, D1, D2 and edge labels 6 and 1; an undetermined split is marked “? + ?”]

Page 13:

Example: Network formation

[Figure: network with nodes S1, S2, D1, D2 and edge labels 6 and 1, shown under two splits]
• Split 3 + 3: a Nash equilibrium, and also optimal!
• Split 1 + 5: unique Nash equilibrium, suboptimal

Page 14:

Example: Network formation

Key feature: distribution rules determine the outcome.

[Figure: network with nodes S1, S2, D1, D2 and edge labels 6 and 1; the split is marked “? + ?”]

Page 15:

Formal Model (recap)
• $W_r$ – local “welfare” generated at resource $r$
• $f_r$ – local “distribution rule” at resource $r$

UTILITY DESIGN → DISTRIBUTION RULE DESIGN

Page 16:

Most prior work studies two distribution rules:

Marginal Contribution (MC) [Wolpert and Tumer 1999]: externality experienced by all other players

$$f_r^{MC}(i, S) = W_r(S) - W_r(S \setminus \{i\})$$

Shapley Value (SV) [Shapley 1953]: average marginal contribution over player orderings

$$f_r^{SV}(i, S) = \sum_{T \subseteq S \setminus \{i\}} \frac{|T|!\,(|S| - |T| - 1)!}{|S|!} \left( W_r(T \cup \{i\}) - W_r(T) \right)$$

Extensions: “weighted” versions parameterized by weights

Both guarantee PNE in all games!

Question: Are there other such distribution rules?

Prior Work: NO, for any given welfare function. [G., Marden, Wierman 2013]
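The two rules can be computed directly from their definitions. The sketch below is illustrative only: it covers the unweighted versions, and the concave local welfare `W` (which depends only on how many players share the resource) is a hypothetical choice.

```python
from itertools import combinations
from math import factorial


def marginal_contribution(W, i, S):
    """MC share: W(S) minus W(S without i)."""
    S = frozenset(S)
    return W(S) - W(S - {i})


def shapley_value(W, i, S):
    """SV share: weighted sum, over subsets T of S without i, of W(T + i) - W(T)."""
    S = frozenset(S)
    others = sorted(S - {i})
    n = len(S)
    total = 0.0
    for k in range(len(others) + 1):
        # Fraction of player orderings in which exactly the k players of T precede i.
        coeff = factorial(k) * factorial(n - k - 1) / factorial(n)
        for T in combinations(others, k):
            T = frozenset(T)
            total += coeff * (W(T | {i}) - W(T))
    return total


# Hypothetical local welfare with diminishing returns in the number of players.
LEVELS = {0: 0.0, 1: 6.0, 2: 8.0, 3: 9.0}


def W(S):
    return LEVELS[len(S)]


S = {0, 1, 2}
sv = [shapley_value(W, i, S) for i in sorted(S)]
mc = [marginal_contribution(W, i, S) for i in sorted(S)]
```

For this welfare the Shapley shares add up to $W(S) = 9$ (budget balance), while the marginal contributions add up to only 3. The Shapley sum ranges over $2^{|S|-1}$ subsets, so direct evaluation becomes expensive as more players share a resource.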

Page 17:

Most prior work studies two distribution rules:
• Marginal Contribution (MC) [Wolpert and Tumer 1999]
• Shapley Value (SV) [Shapley 1953]

Both guarantee PNE in all games!

Question: Are there other such distribution rules?

Prior Work: NO, for any given welfare function. [G., Marden, Wierman 2013]

Observation: Many practical problems involve “single-selection”: agents select a single resource.

Question: Are there other such distribution rules if we only require equilibrium existence for all single-selection games?

Our Answer: No, not for all welfare functions.

Page 18:

Single-Selection Scenario

Prior Work:
• The “proportional share” distribution rule guarantees PNE for certain types of coverage problems (certain forms of $W_r$). [Marden and Wierman 2013]

Our Results (characterizations): [G., Nixon, Marden 2013]
• The only linear budget-balanced distribution rules that guarantee PNE in all single-selection games, for all welfare functions, are weighted Shapley values.
• Given any linear welfare function with no dummy players, the only budget-balanced distribution rules that guarantee PNE in all single-selection games are weighted Shapley values.
• Given any welfare function, the only budget-balanced distribution rules that guarantee PNE in all two-player single-selection games are weighted Shapley values.

Page 19:

Concluding Remarks

• Consequences of the restriction to weighted Shapley values:
  • The resulting game is a weighted potential game, for which several learning dynamics converge to PNE.
  • It is hard for agents to compute their utilities.

• Open Problems:
  • Obtaining a tighter characterization of stable distribution rules for a given welfare function.
  • Obtaining the characterization when budget-balance is relaxed.
  • Optimizing the “weights” for efficiency.
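The convergence remark can be illustrated with round-robin best-response dynamics on a small single-selection game. This is a sketch under stated assumptions rather than the authors' procedure: it uses the unweighted Shapley value (all weights equal), and the three-player instance with two resources and the names `shapley_share` and `best_response_dynamics` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_share(W, i, S):
    """Unweighted Shapley share of player i within the set S, for local welfare W."""
    others = sorted(S - {i})
    n = len(S)
    share = 0.0
    for k in range(n):
        coeff = factorial(k) * factorial(n - k - 1) / factorial(n)
        for T in combinations(others, k):
            T = frozenset(T)
            share += coeff * (W(T | {i}) - W(T))
    return share


def best_response_dynamics(n_players, local_welfare, start, max_rounds=100):
    """Round-robin best responses; returns a pure Nash equilibrium or None."""
    profile = list(start)
    resources = list(local_welfare)

    def payoff(i, r):
        # Players other than i currently on r, together with i itself.
        S = frozenset(j for j in range(n_players) if j != i and profile[j] == r) | {i}
        return shapley_share(local_welfare[r], i, S)

    for _ in range(max_rounds):
        changed = False
        for i in range(n_players):
            best = max(resources, key=lambda r: payoff(i, r))
            # Move only on a strict improvement (small tolerance for float error).
            if payoff(i, best) > payoff(i, profile[i]) + 1e-12:
                profile[i] = best
                changed = True
        if not changed:
            return tuple(profile)  # no player can improve unilaterally
    return None


# Hypothetical instance: r1 has diminishing returns, r2 has a fixed value.
CONCAVE = {0: 0.0, 1: 6.0, 2: 8.0, 3: 9.0}
local_welfare = {
    "r1": lambda S: CONCAVE[len(S)],
    "r2": lambda S: 4.0 if S else 0.0,
}
equilibrium = best_response_dynamics(3, local_welfare, start=("r1", "r1", "r1"))
```

Starting with all three players on `r1`, one player moves to `r2` and the dynamics stop after one further round with no changes. Each payoff evaluation enumerates subsets of the co-located players, which gives a concrete sense of why computing utilities is costly for agents.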

Page 20:

Stable Utility Design for Distributed Resource Allocation

Ragavendran Gopalakrishnan (University of Colorado at Boulder)

Sean D. Nixon (University of Vermont)

Jason R. Marden (University of Colorado at Boulder)