seminar on coordinated systems department of systems and information engineering university of...
Post on 04-Jan-2016
216 Views
Preview:
TRANSCRIPT
Seminar onCoordinated Systems
Department of Systems and Information Engineering
University of Virginia
SYS 793Fall 2004
2
Motivation• We are increasingly reliant on large-scale,
distributed engineering systems: Internet, national power grid, etc.– coordination is often achieved in engineering systems through the
specification of ad hoc protocols for relatively well defined (constrained) interactions between distributed systems.
– existing protocols are not adequately tuned for new applications and/or unexpected situations
– there appears to be little in the way of underlying guiding principles and theory for designing and operating such systems
– Notions of game theory and decentralized control go only part way toward revealing the basic problems associated with distributed engineering systems
3
SYS 793• In this seminar:
– we will explore the recent literature on the intersections between game theory, distributed planning in robotics and artificial intelligence, and distributed control.
– Faculty and students are expected to present a paper (to be chosen from the list given below).
• After each presentation we shall have a discussion session where the main contributions of each paper will be critically assessed.
4
Parameters
• Time:– Wednesdays, 5-6:15
• Place:– Olsson 005
• Credit:– One hour, pass/fail
• Registration:– Schedule number 95083 (SYS 793, Section 2)
• Webpage:https://toolkit.itc.virginia.edu/cgi-local/tk/UVa_SEAS_2004_Fall_SYS793-2
5
Student Responsibilities
• To earn a passing grade, each student will:– prepare a presentation on at least one
approved paper– lead subsequent discussion on salient features
of the paper– hand over slides for publication on the course
website
6
Partial List of Approved PapersR. Aumann. Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1:67–96, 1974.
H. Bui, S. Venkatesh, and D. Kieronska. A framework for coordination and learning among teams of agents. Lecture Notes in Computer Science, 1441:164–178, 1998.
J. Doran, S. Franklin, N. Jennings, and T. Norman. On cooperation in multi-agent systems. The Knowledge Engineering Review, 12(3):309–314, 1997.
G. Ellison. Learning, local interaction and coordination. Econometrica, 61:1047–1071, 1993.
D. Gauthier. Coordination. Dialogue, 14:195–221, 1975.
D. Gilbert. Rationality and salience. Philosophical Studies, 57:61–77, 1989.
V. Gervasi and G. Prencipe. Robotic cops: The intruder problem. In Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, 2003.
M. Kandori, G. Mailath, and R. Rob. Learning, mutation and long-run equilibria in games. Econometrica, 61:29–56, 1993.
G. Prencipe. Corda: Distributed coordination of a set of autonomous mobile robots. In Proceedings of the Fourth European Research Seminar on Advances in Distributed Systems, pages 185–190, 2001.
P. Vanderschraaf. Learning and coordination: inductive deliberation, equilibrium and convention. Routledge, 2001.
X. Wang and T. Sandholm. Reinforcement learning to play an optimal nash equilibirum in team markov games. In Advances in Neural Information Processing Systems 15 (NIPS-2002), 2002.
G. Weiss. Multiagent Systems: a Modern Approach to Distributed Artificial Intelligence. The MIT Press, Cambridge, 1999.
7
Agenda
• Coordinated Systems Research Group (CSRG)
• Introduction to “Coordinated Systems”
Coordinated Systems Research Group
9
CSRG Mandate• The CSRG focuses on issues of coordination in large
scale decentralized engineering systems.– Engineering systems are increasingly reliant on coordination,
more than on (centralized) optimization processes and/or control.
– Typically, coordination is achieved through ad hoc protocols for relatively well defined (constrained) interactions between distributed systems.
– However, as information systems become more integrated into society, we find that existing protocols aren’t adequately tuned for new applications and/or unexpected situations.
• CSRG goals:– Develop theory for achieving coordination with limited
communication and without the benefit or predefined protocols– Transform theory into practice by developing and evaluating
prototype applications
10
CSRG Activities
• Distributed decision-making without communication– fault tolerant and adaptive construction of overlay networks for
group communication and collaboration.– remote sensing.– robotic coordination.
• Auction mechanisms for “on-demand” IT services– market mechanisms for efficient allocation of distributed
computation and communication services
• Fictitious play in Internet traffic engineering– account for competing interests of end-to-end connections within
and across Internet autonomous systems
11
Peter Beling
• Associate Professor– Department of Systems and
Information Engineering
• Research:– Financial engineering– Optimization theory &
computational complexity
• Funded Projects:– Solution Concepts for Static
Coordination Problems• NASA LaRC Grant NNL-04-AA66G
12
Alfredo Garcia• Assistant Professor
– Department of Systems and Information Engineering
• Research:– Modeling and control of
communications networks– Stochastic Optimization and
Optimal Control• Funded Projects:
– Complex Networks Optimization
• NSF Grant DMI-0217371– Security of Supply & Strategic
Learning in Restructured Power Markets
• NSF Grant ECS-0224747
13
Stephen Patek• Assistant Professor
– Department of Systems and Information Engineering
• Research:– Modeling and control of
communications networks– Stochastic Optimization and
Optimal Control– Coordination Processes
• Funded Projects:– Dynamic Coordination Processes
for Distributed Planning with Limited Communication, NSF (DST-0414727)
14
Current Students
• Himanshu Gupta
• Kaushik Sinha
• Yijia Zhao
Introduction to Coordinated Systems
16
Example: Dial / Wait• Two-Players, Two Actions:
• What makes sense?– Two “coordinated minimum-cost solutions” :
(Dial, Wait) and (Wait, Dial).– Unfortunately, neither player knows which one to select.
Dial
Dial
Wait
Wait
17
Some Thoughts• Arbitrarily selecting an action is irrational!
– If Player 1 arbitrarily chooses to Dial, Player 2 could also arbitrarily choose to Dial, resulting in a worst case outcome.
– Worst case solutions are easy to achieve arbitrarily (without coordination).
• Randomly selecting an action makes more sense!– If Player 1 chooses to Dial with probability p and Player 2 chooses to
Dial with probability q, then the expected cost of the outcome is
– Only in introducing “mixed actions” (randomized decisions) is it meaningful to talk about expected cost.
)connectingPr(Not
)1)(1(
],|[E),(
qppq
qpCostqpV
18
Interesting Observations• Suppose Player 1 chooses p = .5, then
In other words, regardless of Player 2’s decision, Player 1 is able to “lock in” an expected cost of .5.
• Suppose Player 2 chooses q = .5, then
In other words, regardless of Player 1’s decision, Player 2 is able to lock in an expected cost of .5.
5.)]1([5.)1)(5.1(5.],5.|Cost[E qqqqq
5.)]1([5.)5.1)(1()5(.]5,.|Cost[E ppppp
19
A Strong Equilibrium
Each player has the ability to lock in an expected have of .5, namely by choosing p = .5 and q = .5, respectively.– In fact, given p = .5 and q = .5, neither player has
the ability to change (let alone) improve the expected cost of the solution.
– So, p = .5 and q = .5 constitutes a strong equilibrium solution for the Dial / Wait problem.
20
Connection to Game Theory
• The Dial / Wait problem is a finite, two-player, non-cooperative, game in strategic form (with identical interests)
• For the Dial / Wait game there are two pure strategy Nash equilibria:– (Dial, Wait)– (Wait, Dial)
and exactly one mixed (non-pure) strategy mixed Nash equilibrium:– p = .5, q = .5.
21
Remarks
• The pure strategy equilibria are completely uninteresting.– They can only be achieved if the players are allowed to coordinate their
actions.
• Minimax is correct here, but it is not a reasonable approach in general.
• The mixed strategy equilibrium makes a lot of sense.– It achieves a decent value of expect cost and is not the result of an
arbitrary decision.– In fact, all that’s required in this example is that one of the two players
play their equilibrium strategy.
• But, in general, there can be multiple mixed strategy equilibria, and it is not obvious which one each player would select.
N-Agent, Multistage Problems
23
Uncoordinated Decision-Making
We consider N-agent multistage decision situations, where– all N agents must select an available action at each
stage, without coordinating their actions in advance (simultaneous decision making)
– all N agents perceive the same cost (disutility) associated the joint selection of actions at each stage.
24
Key Idea
Since all agents share the same notion of cost, they would coordinate their actions (if they could) to pick out a minimum-cost joint solution.– This would an easy, even without coordination, if
there were a single minimum-cost joint selection of actions.
– Unfortunately, if more than one “coordinated minimum-cost solution” exists, then there may not be a clear course of action for all agents.
25
Dynamic Version of the Dial / Wait Problem
• Both players decide to Dial or Wait in stages.– If both decide to Dial or if both decide to Wait, then they remain
unconnected.
• Both players are interested in reaching in as few stages as possible.
not connected1
connected
(Dial, Dial)(Wait, Wait)
(Dial, Wait)(Wait, Dial)
26
Formalization
• Statespace, X– Set of conditions associated with the operation of an underlying system.
• Mixed Action sets, Ai(x)
– Actions available to player i when the system is in state x 2 X.
• Transition reward function, gx(a1, …, aN)
– Expected reward (perceived equally by all players) associated with the profile of mixed actions (a1, …., aN) at state x 2 X.
• State transition probability, pxy(a1, …., aN)
– Probability of transitioning from x 2 X to y 2 X under the profile of actions (a1, …, aN).
• Time horizon, T– Number of stages of decision-making before the process ends.
Network Connection Recovery
Example
28
A Bn
m
n-m
…
n connections
…
Set Up• Consider a network in which n connections are
served by a direct link between two nodes A and B.
• Suppose two alternative links exist with capacities m and (n-m), respectively.
• Note that all n connections can still be accommodated, but how should they re-route themselves?
29
Dynamic Recovery
• The initial state of this process is < m, n-m >.• Rules:
– If k < m connections select the top link, then those k connections are satisfied, but the other link is overwhelmed.
• The system transitions to < m-k, n-m >, with n-k connections left to be satisfied.
• We are left with a network routing problem similar to the original one but involving the same or fewer unsatisfied connections.
– If k = m connections select the top link, then all n connections are satisfied.
• The system transitions to – If k > m connections select the top link, then the top link is
overwhelmed, but the n-k other connections are satisfied.• The system transitions to < m, k-m >, with k connections left to be
satisfied.• We are left with a network routing problem similar to the original one
but involving the same or fewer unsatisfied connections.
• All players randomize their selection of links.
Remote Sensing
Example
31
Problem Overview
• N sensing platforms– Identical in capability
• M targets
• Autonomous planning
• Strict collision avoidance protocol– Impossible for two or more platforms investigate a
single target simultaneously
• Finite Time Horizon, T
32
Simple Example
• Parameters– 3 UAVs– 5 points of interest– 2 sensing opportunities
• Stage 1– Green UAV successfully
reaches its target– Blue and Red UAVS
compete and fail to make the necessary observations
33
Simple Example
• Stage 2– Green and Blue UAVs
compete a new target– Red UAV is successful
• Summary– We end up only
observing two targets!– (With coordination, we
could have arranged for all points of interest to be revealed.)
34
Research Question
What decision rule should we implement within each sensing platform so that we gather as much information as possible within the available time, without requiring the platforms to coordinate their actions?
SYS 793 Topics of Interest
36
Topics of Interest• Game Theory
– solution concepts for rational decision-making– alternatives to Nash and correlated equilibria for non-
cooperative games– equilibrium selection
• Learning algorithms for games– including fictitious play– reinforcement learning
• Multi-Agent Frameworks/Applications– Robotic Cops, Robo-anything– Team theory
• Philosophy of Coordination
top related