Post on 15-Jan-2016
ACCESS Group meeting Mikael Johansson [email protected]
Novel algorithms for peer-to-peer optimization
in networked systems
Björn Johansson and Mikael Johansson, Automatic Control Lab, KTH, Stockholm, Sweden
Joint work with M. Rabi, C. Caretti, T. Keviczky and K.-H. Johansson
Content
• Motivation
• Decomposition review
• A framework for peer-to-peer optimization
• Markov-randomized incremental subgradient method
• Combined consensus-subgradient method
• Experiences from implementation
• Conclusions
Motivation

Large-scale optimization problem…
Decomposed into several small subproblems
• Potentially large computational savings
• Foundation for distributed decision-making
– fi: performance of agent i, depends on the actions of others
– challenge: avoid a coordinator, obey communication constraints
[Figure: coordinator-based architecture]
Application: multi-agent coordination
Find jointly optimal controls and rendezvous point
”DMPC” – distributed model-predictive consensus
Application: distributed estimation
Node v measures yv, cooperates to find network-wide estimate
Solution is the average; the algorithm solves a ”consensus” problem
– Directly extends to Huber’s M-function (robust estimator)
Application: resource allocation
Throughput maximization under a global bandwidth constraint
A global constraint (rather than a global variable) complicates the problem.
Content
• Motivation
• Decomposition review
• A framework for peer-to-peer optimization
• Markov-randomized incremental subgradient method
• Combined consensus-subgradient method
• Experiences from implementation
• Conclusions
Decomposition review
Techniques for decomposing a large-scale problem into many small ones
[Figure: coordinator-based architecture]
Trivial case: separable problems
Each node v can find xv by itself; no coordinator needed.
– Reality is often more complex (and interesting!)
[Figure: coordinator-based architecture]
Complicating variables
Consider an unconstrained problem in the variables (x1, x2, y):

minimize f1(x1, y) + f2(x2, y)

Here, y is the complicating (or coupling) variable.
Observation: when y is fixed, the problem is separable in (x1, x2)
– how can this be exploited?
Primal decomposition
Fix the complicating variable y and define

φv(y) = min over xv of fv(xv, y)

To evaluate the functions φv we need to solve the associated subproblems.
The original problem is equivalent to the master problem

minimize φ1(y) + φ2(y)

in the variable y. Convex when the original problem is; possibly non-smooth.
This is called primal decomposition
– the master problem (coordinator) optimizes the primal variable.
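To make the scheme concrete, here is a minimal sketch (not from the talk) of primal decomposition on a toy problem with a scalar coupling variable t; the subproblems min over x of (x − t)² + (x − a)², the constants a = 0 and a = 4, and the step size are invented for illustration.

```python
# Primal decomposition on a toy problem: fix the coupling variable t,
# solve each subproblem  phi(t) = min_x (x - t)^2 + (x - a)^2 = (t - a)^2 / 2,
# and let the master run gradient steps on phi_1(t) + phi_2(t).
# With a = 0 and a = 4 the master optimum is t* = 2.

def subproblem(t, a):
    """Solve min_x (x - t)^2 + (x - a)^2; return value and d(phi)/dt."""
    x = (t + a) / 2.0                 # closed-form subproblem solution
    value = (x - t) ** 2 + (x - a) ** 2
    grad = 2.0 * (t - x)              # equals t - a
    return value, grad

def primal_decomposition(t=0.0, alpha=0.2, iters=100):
    for _ in range(iters):
        _, g1 = subproblem(t, 0.0)    # report from subproblem 1
        _, g2 = subproblem(t, 4.0)    # report from subproblem 2
        t -= alpha * (g1 + g2)        # master gradient step on phi_1 + phi_2
    return t

t_star = primal_decomposition()
```

Each subproblem reports only its value and derivative with respect to t, which is exactly the information the master (coordinator) needs.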
Dual decomposition
Introduce new variables y1, y2 and consider

minimize f1(x1, y1) + f2(x2, y2) subject to y1 = y2

Here, y1 and y2 are local versions of the complicating variable.
The constraint y1 = y2 enforces consistency.
Key observation: the Lagrangian

L(x, y, λ) = f1(x1, y1) + λᵀy1 + f2(x2, y2) − λᵀy2

is separable (can be minimized over the local variables separately)
Dual decomposition
Hence, the dual function has the form

q(λ) = q1(λ) + q2(λ)

where each part of the dual can be evaluated locally:

q1(λ) = min over (x1, y1) of f1(x1, y1) + λᵀy1
q2(λ) = min over (x2, y2) of f2(x2, y2) − λᵀy2

(evaluation requires solving the dual subproblems)
The dual problem

maximize q(λ)

is convex, but not necessarily differentiable.
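A minimal sketch (not from the talk) of dual decomposition on a toy instance: minimize (y1 − 1)² + (y2 − 5)² subject to y1 = y2, whose optimum is y1 = y2 = 3. The objective, step size, and closed-form subproblem solutions are invented for illustration.

```python
# Dual decomposition sketch for  min (y1-1)^2 + (y2-5)^2  s.t.  y1 = y2.
# Relaxing y1 = y2 with multiplier lam splits the Lagrangian into two
# subproblems each node can solve on its own:
#   node 1: min (y1-1)^2 + lam*y1   ->  y1 = 1 - lam/2
#   node 2: min (y2-5)^2 - lam*y2   ->  y2 = 5 + lam/2
# The master ascends the dual using the subgradient y1 - y2.

def dual_decomposition(lam=0.0, alpha=0.2, iters=200):
    for _ in range(iters):
        y1 = 1.0 - lam / 2.0          # local dual subproblem at node 1
        y2 = 5.0 + lam / 2.0          # local dual subproblem at node 2
        lam += alpha * (y1 - y2)      # dual subgradient (ascent) step
    return y1, y2, lam

y1, y2, lam = dual_decomposition()
```

At the dual optimum (lam = −4) the two local copies agree, recovering the primal solution.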
Subgradient methods
A subgradient of a convex function f at x is any g that satisfies

f(y) ≥ f(x) + gᵀ(y − x) for all y

• affine global underestimators
• coincides with the gradient when f is smooth

Projected subgradient method

x(k+1) = PX[x(k) − αk g(k)]

Converges if the subgradients are bounded and the step sizes αk are diminishing and non-summable (e.g. αk = 1/k)
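A minimal sketch of the projected subgradient method on a toy problem, min |x − 3| over the interval [0, 2]; the objective, interval, and step-size rule are invented for illustration.

```python
# Projected subgradient method for  min |x - 3|  over the interval [0, 2].
# A subgradient of |x - 3| is sign(x - 3); the constrained optimum is x* = 2.

def project(x, lo=0.0, hi=2.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return max(lo, min(hi, x))

def projected_subgradient(x0=0.0, iters=500):
    x, best = x0, x0
    for k in range(1, iters + 1):
        g = 1.0 if x > 3.0 else -1.0   # a subgradient of |x - 3|
        x = project(x - g / k)         # diminishing step size 1/k
        if abs(x - 3.0) < abs(best - 3.0):
            best = x                   # track the best iterate so far
    return best

x_best = projected_subgradient()
```

Tracking the best iterate matters because subgradient steps are not descent steps: individual iterates may move away from the optimum.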
Incremental subgradient methods
Apply to problems of the form

minimize over x ∈ X: f1(x) + … + fV(x)

(e.g. our general form)

Algorithm (g(v,k) a subgradient of fv at x(k)):

x(k+1) = PX[x(k) − αk g(vk, k)]

Update by cyclic componentwise (negative) subgradient steps
– can use a fixed (e.g. 1, …, V) or random update order
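A minimal sketch of the cyclic incremental subgradient method on the toy problem f(x) = Σv |x − av|, whose minimizer is the median of the av; the data and step-size rule are invented for illustration.

```python
# Cyclic incremental subgradient method for  f(x) = sum_v |x - a_v|:
# one subgradient step per component, cycling through the nodes in a
# fixed order.  A minimizer of a sum of absolute values is a median.

def incremental_subgradient(a, x0=0.0, cycles=2000):
    x, k = x0, 0
    for _ in range(cycles):
        for av in a:                       # fixed cyclic update order
            k += 1
            g = 1.0 if x > av else -1.0    # subgradient of |x - av|
            x -= g / k                     # diminishing step size 1/k
    return x

x_hat = incremental_subgradient([1.0, 2.0, 10.0])   # median is 2.0
```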
Content
• Motivation
• Decomposition review
• A framework for peer-to-peer optimization
• Markov-randomized incremental subgradient method
• Combined consensus-subgradient method
• Experiences from implementation
• Conclusions
Our framework

A convex (possibly non-smooth) optimization problem

A connected communication graph
• local variables xv at each node v
• global variables y
• per-node loss function fv(xv, y)

Peer-to-peer:
• Nodes can only communicate with their neighbors
Quiz and challenge
Quiz: Which of the techniques we described are peer-to-peer?
– Primal decomposition?
– Dual decomposition?
– Incremental subgradient methods?
Challenge: develop simple and efficient p2p optimization techniques!
Content
• Motivation
• Decomposition review
• A framework for peer-to-peer optimization
• Markov-randomized incremental subgradient method
• Combined consensus-subgradient method
• Experiences from implementation
• Conclusions
Peer-to-peer incremental subgradients?
Incremental subgradient methods are not peer-to-peer
– the estimate of the optimizer is forwarded around a ring, or to an arbitrary node
Is it possible to develop a method that only forwards to neighbors?
Unbiased random walk on graph
Need to construct “unbiased” random walk– Visit every node with equal probability
(has stationary uniform probability)– Transition matrix can be computed via Metropolis-Hastings
(dv is the degree of node v, i.e. number of links)
– Can be computed using local info only!
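A sketch of how each node could assemble its row of the Metropolis-Hastings transition matrix from degree information alone; the `metropolis_weights` helper and the star graph are invented for illustration.

```python
# Metropolis-Hastings weights for an unbiased random walk on a graph:
# P[v][w] = min(1/d_v, 1/d_w) for each neighbor w, and the remaining
# probability mass becomes a self-loop.  Node v only needs its own
# degree and the degrees of its neighbors.

def metropolis_weights(adj):
    """adj: dict mapping node -> list of neighbors (undirected graph)."""
    P = {v: {} for v in adj}
    for v, nbrs in adj.items():
        dv = len(nbrs)
        for w in nbrs:
            P[v][w] = min(1.0 / dv, 1.0 / len(adj[w]))
        P[v][v] = 1.0 - sum(P[v][w] for w in nbrs)   # self-loop mass
    return P

# Star graph: center 0, leaves 1..3
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
P = metropolis_weights(adj)
```

The resulting chain is symmetric (P[v][w] = P[w][v]), which is why its stationary distribution is uniform.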
Markov-randomized algorithm
Repeat:
• Update estimate

x(k+1) = x(k) − αk g(vk, k)

(vk is the state of the Markov chain, g(vk, k) a subgradient of fvk at x(k))
• Pass estimate to a random neighbor using the Markov chain P = [Pv,w] computed via Metropolis-Hastings

Conceptually simple idea. What can we say about its properties?
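A minimal simulation sketch (not the talk's WSN implementation) of the Markov-randomized incremental subgradient method, combining the Metropolis walk with per-node gradient steps; the path graph, quadratic losses fv(x) = (x − av)²/2, and step-size rule are invented for illustration.

```python
import random

# Markov-randomized incremental subgradient sketch: a single estimate is
# passed along an unbiased (Metropolis) random walk, and each visited
# node v applies one gradient step on its local loss
# f_v(x) = (x - a_v)^2 / 2.  The network-wide optimum is mean(a).

def markov_incremental(adj, a, x0=0.0, steps=20000, seed=1):
    rng = random.Random(seed)

    def next_node(v):
        """One Metropolis-Hastings transition using local degrees only."""
        dv = len(adj[v])
        r, acc = rng.random(), 0.0
        for w in adj[v]:
            acc += min(1.0 / dv, 1.0 / len(adj[w]))
            if r < acc:
                return w
        return v                       # leftover mass: self-loop

    v, x = 0, x0
    for k in range(1, steps + 1):
        x -= (x - a[v]) / k            # gradient step on f_v, step 1/k
        v = next_node(v)               # hand the estimate to a neighbor
    return x

# Path graph 0 - 1 - 2 with per-node targets; optimum is mean(a) = 3.0
adj = {0: [1], 1: [0, 2], 2: [1]}
x_hat = markov_incremental(adj, a=[0.0, 3.0, 6.0])
```

Because the walk visits every node with equal long-run frequency, each local loss contributes equally, and the estimate settles near the global optimum.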
Main result
Proof highlights:
• Sample the sequence when the chain is in state v
• Establish: all nodes are visited with equal probability during a return time
• Use conditional expectations
• Invoke the supermartingale convergence theorem
Example: robust estimation
Content
• Motivation
• Decomposition review
• A framework for peer-to-peer optimization
• Markov-randomized incremental subgradient method
• Combined consensus-subgradient method
• Experiences from implementation
• Conclusions
Consensus-subgradient method
Key trick for distributing dual decomposition
Dual decomposition: relax consistency requirements
Alternative idea: “neglect and project”
– Each node has a local view of the global decision variables
– Updates in the direction of the (negative) subgradient
– Coordinates with neighbors to achieve consistency
Will apply consensus iterations
Basic algorithm
Repeat
1. Predict the next iterate using a subgradient step

(gv a subgradient of fv at xv(k))

2. Execute I consensus iterations to approach consistency
3. Project (locally) onto the constraint set
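A minimal sketch of this predict / consensus / project loop on a toy instance with quadratic losses fv(x) = (x − av)²/2 and constraint set [0, 10]; the graph, data, step sizes, and number of consensus iterations I are invented for illustration.

```python
# Consensus-subgradient sketch: every node keeps its own copy of the
# global variable and repeats three steps: (1) local gradient prediction
# on f_v(x) = (x - a_v)^2 / 2, (2) I consensus iterations with
# Metropolis weights, (3) local projection onto [0, 10].

def consensus_subgradient(adj, a, I=5, iters=300):
    x = {v: 0.0 for v in adj}
    for k in range(1, iters + 1):
        # 1. predict next iterate with a local (sub)gradient step
        y = {v: x[v] - (x[v] - a[v]) / k for v in adj}
        # 2. I consensus iterations to approach consistency
        for _ in range(I):
            y = {v: y[v] + sum(min(1.0 / len(adj[v]), 1.0 / len(adj[w]))
                               * (y[w] - y[v]) for w in adj[v])
                 for v in adj}
        # 3. project locally onto [0, 10]
        x = {v: min(10.0, max(0.0, y[v])) for v in adj}
    return x

# Path graph 0 - 1 - 2; optimum of sum_v f_v is mean(a) = 3.0
adj = {0: [1], 1: [0, 2], 2: [1]}
x_hat = consensus_subgradient(adj, a={0: 0.0, 1: 3.0, 2: 6.0})
```

More consensus iterations per step (larger I) tighten consistency between the local copies at the cost of extra communication.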
Main result (unconstrained case)
Proof: based on results from approximate subgradient methods
Similar, somewhat more complex, results hold for the constrained case.
Example
Simple 5-node network (left); non-smooth functions fv (right)
Example
Iterates for one (left) and 11 (right) consensus iterations per subgradient step
To think about…
What is the right aggregation primitive in the network?
– Sampling via an unbiased random walk?
– Consensus/gossiping?
– Spanning trees?

Has implications for
– Implementation complexity/accuracy
– Privacy (are internal models and objectives private or shared?)
– Information dissemination (who knows what in the end?)
Content
• Motivation
• Decomposition review
• A framework for peer-to-peer optimization
• Markov-randomized incremental subgradient method
• Combined consensus-subgradient method
• Experiences from implementation
• Conclusions
Implementation experiences
Wireless sensor network testbed at KTH
The ultimate test: can we make these algorithms run on our WSN nodes?
Wireless communication
Sensors communicate using 802.15.4-compliant radios

Basic primitives:
– Unicast: a node addresses a single neighbor at a time
– Broadcast: communication with (possibly) all neighbors

Both exist in reliable and unreliable versions
Problem and solution candidates
We considered quadratic loss functions in the nodes
– consensus iterations are one way to find the optimum

Implemented three alternatives:
– P2P incremental subgradient, using reliable unicast
– Dual decomposition, using unreliable broadcast
– Gossiping algorithm by Boyd et al., using reliable broadcast
Algorithm I: dual decomposition

Nodes maintain a local estimate of the optimizer
1. Broadcast the current iterate to neighbors
2. Update Lagrange multipliers for some links (based on disagreement with neighbors)
3. Update the local estimate

Unreliable broadcast, since the algorithm can tolerate some packet losses
[Rabbat et al., IEEE SPAWC 2005]
Algorithm II: consensus iteration

The classical consensus iteration
1. Broadcast the current iterate to neighbors
2. Update the local estimate

Reliable broadcast for consistency
[Xiao et al., IPSN 2005]
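A minimal sketch of the classical consensus iteration with Metropolis weights; the three-node path graph and initial values are invented for illustration.

```python
# Classical consensus iteration: each node broadcasts its value and
# replaces it with a Metropolis-weighted average over its neighborhood.
# All values converge to the network-wide average.

def consensus(adj, x0, iters=100):
    x = dict(x0)
    for _ in range(iters):
        x = {v: x[v] + sum(min(1.0 / len(adj[v]), 1.0 / len(adj[w]))
                           * (x[w] - x[v]) for w in adj[v])
             for v in adj}
    return x

# Path graph 0 - 1 - 2; the average of the initial values is 3.0
adj = {0: [1], 1: [0, 2], 2: [1]}
x_avg = consensus(adj, {0: 0.0, 1: 3.0, 2: 6.0})
```

The symmetric weights preserve the sum at every step, which is why the common limit is exactly the average.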
Algorithm III: p2p incremental
Our peer-to-peer incremental subgradient method
1. Update the estimate using a subgradient with respect to the local loss function
2. Pass the estimate to a random neighbor (forwarding decision based on Metropolis-Hastings)

Reliable unicast (important not to lose the token)
ns-2 simulations

fv quadratic (consensus problem); ns-2 evaluation of the three schemes:
dual decomposition, Markov-incremental subgradient, and Xiao-Boyd consensus
Real implementation
Experiences
• Works surprisingly well
• Basic primitives not so basic
– Reliable broadcast
– Neighbor discovery
• Challenging the model
– Link asymmetry!
– Packet loss
– Time/energy efficiency
Need to go back and revise theory (and implementation!)
Conclusions
Distributed optimization in networked systems
– Important and useful
– Many challenges remain!

Novel peer-to-peer optimization algorithms
– Markov-randomized incremental subgradient method
– Consensus-subgradient method
Practical implementation in WSN testbed
Implementation and application challenges drive next iteration!