systematic analysis of distributed optimization algorithms over … · 2020. 12. 15. · systematic...
TRANSCRIPT
-
Systematic Analysis of DistributedOptimization Algorithms overJointly-Connected Networks
Bryan Van ScoyMiami University
Laurent LessardNortheastern University
IEEE Conference on Decision and Control December 16, 2020
-
minimizex1,...,xn
n∑i=1
fi(xi)
subject to x1 = x2 = . . . = xn
f1, x1f2, x2
f3, x3
f4, x4f5, x5
W =
13
13 0 0
13
0 12 012 0
0 1623
16 0
13 0
13 0
13
13 0 0
13
13
Want each agent to compute the global optimizer x? by communicatingwith local neighbors and performing local computations.
1
-
DIGing
xi(k + 1) =n∑
j=1Wij(k)xj(k)− α yi(k)
yi(k + 1) =n∑
j=1Wij(k) yj(k) +∇fi
(xi(k + 1)
)−∇fi
(xi(k)
)
• analyzed in (Nedić, Olshevsky, and Shi, 2017) and (Qu, Li, 2018)• linear convergence over time-varying networks
‖xi(k)− x?‖ = O(ρk)
iterations to converge ∝ −1/ log ρ
2
-
DIGing
xi(k + 1) =n∑
j=1Wij(k)xj(k)− α yi(k)
yi(k + 1) =n∑
j=1Wij(k) yj(k) +∇fi
(xi(k + 1)
)−∇fi
(xi(k)
)
Issues:
• the bound grows with the number of agents• the bound does not apply to large stepsizes• the analysis is specific to DIGing
3
-
Systematicanalysistool
LMI
Worst-caseperformance
bound
ρAlgorithm
A Bu BvCy Dyu DyvCz Dzu DzvFx Fu
Local functions
κ
Communication network
σ, B
main contribution = sharp and systematic analysis
4
-
Systematicanalysistool
LMI
Worst-caseperformance
bound
ρAlgorithm
A Bu BvCy Dyu DyvCz Dzu DzvFx Fu
Local functions
κ
Communication network
σ, B
main contribution = sharp and systematic analysis
5
-
Function assumptionsEach local objective function fi is L-smooth and m-strongly convex withrespect to the global optimizer.
x
fi(x)
κ = 1
x
fi(x)
κ = 2
x
fi(x)
κ = 4
x
fi(x)
κ = 8
The condition ratio κ = L/m characterizeshow much the curvature varies.
For all x,(∇fi(x)−∇fi(x?)−m (x− x?)
)T(∇fi(x)−∇fi(x?)− L (x− x?)) ≤ 06
-
Systematicanalysistool
LMI
Worst-caseperformance
bound
ρAlgorithm
A Bu BvCy Dyu DyvCz Dzu DzvFx Fu
Local functions
κ
Communication network
σ, B
7
-
Spectral gap
The spectral gap characterizes the connectivity of a graph.
0.0 0.667 0.707 0.809 1.0
spectral gap =∥∥ 1
n 11T −W
∥∥2
8
-
Network assumptions
sparsity Wij(k) = 0 if agent i does not receive informationfrom agent j at time k
balanced W (k)1 = W (k)T1 = 1
spectrum the spectral gap ofW (k) is less than or equal to one
joint spectrum for some fixed B, the spectral gap of[W (k)W (k + 1) · · ·W (k +B − 1)
]1/Bis less than or equal to σ < 1
The spectral gap σ characterizes the connectivity ofthe time-varying network averaged over B iterations.
9
-
Systematicanalysistool
LMI
Worst-caseperformance
bound
ρAlgorithm
A Bu BvCy Dyu DyvCz Dzu DzvFx Fu
Local functions
κ
Communication network
σ, B
10
-
Algorithm
xi(k + 1)yi(k)zi(k)
=A Bu BvCy Dyu DyvCz Dzu Dzv
xi(k)ui(k)vi(k)
state updateui(k) = ∇fi
(yi(k)
)gradient computation
vi(k) =n∑
j=1Wij(k) zj(k) communication
0 =n∑
j=1
(Fx xj(k) + Fu uj(k)
)invariant
11
-
2 state variables 3 state variables
EXTRA
1 − 12 α −α 11 0 0 0 00 0 0 1 01 0 0 0 01 − 12 0 0 01 −1 α 0
NIDS
1 − 12 α2 −α2 11 0 0 0 00 0 0 1 01 0 0 0 01 − 12 α2 −α2 01 −1 α 0
DIGing
0 −α 0 0 1 00 0 −1 1 0 10 0 0 1 0 00 −α 0 0 1 01 0 0 0 0 00 1 0 0 0 00 1 −1 0
AugDGM
0 0 0 0 1 −α0 0 −1 1 0 10 0 0 1 0 00 0 0 0 1 −α1 0 0 0 0 00 1 0 0 0 00 1 −1 0
SVL
1− γ β −α −γ−1 1 0 −11− δ 0 0 δ1 0 0 00 1 0
Exact Diffusion(ExDIFF)
1 −1 −α 112 0 −α 121 0 − 12 01 0 0 01 −1 0
Unified DIGing(uDIG)
0 −α −α 1 0L+m
2 0 −1 0 11 0 0 0 01 0 0 0 0
−L+m2 1 1 0 00 1 0
Unified EXTRA(uEXTRA)
0 −α −α 1 00 0 −1 L 11 0 0 0 01 0 0 0 00 1 1 −L 00 1 0
1co
mmunicatedvariable
2co
mmunicatedvariables
12
-
Systematicanalysistool
LMI
Worst-caseperformance
bound
ρAlgorithm
A Bu BvCy Dyu DyvCz Dzu DzvFx Fu
Local functions
κ
Communication network
σ, B
13
-
LMIFind P � 0, Q � 0, R � 0, S � 0, and λ ≥ 0 such that
0 � ΨT(ξT+P ξ+ − ρ
2 (ξTP ξ) +B−1∑`=0
λ(`)[y(`)u(`)
]T[−2 κ+ 1κ+ 1 −2κ
][y(`)u(`)
])Ψ
and
0 � ξT+Q ξ+ − ρ2 (ξTQ ξ) +
B−1∑`=0
λ(`)[y(`)u(`)
]T[−2 κ+ 1κ+ 1 −2κ
][y(`)u(`)
]+[w(0)w(B)
]T([σ2B 0
0 −1
]⊗R
)[w(0)w(B)
]
+B−1∑`=0
z(`)w(`)v(`)w(`+ 1)
T([
1 00 −1
]⊗ S(`)
) z(`)w(`)v(`)w(`+ 1)
Solved in Julia using Convex.jl and Mosek.
14
-
Systematicanalysistool
LMI
Worst-caseperformance
bound
ρAlgorithm
A Bu BvCy Dyu DyvCz Dzu DzvFx Fu
Local functions
κ
Communication network
σ, B
15
-
DIGing
100 101 102
(1 + σ)/(1− σ)
101
102
103
104
iter
ati
on
sto
con
verg
eκ = 10, B = 1, α optimizes DIGing bound
n = 50
n = 10
n = 2
all n
lower bound
• dashed lines are from (Nedić, Olshevsky, and Shi, 2017)• solid line is from our LMI• lower bound corresponds to only optimization and only consensus
16
-
DIGing
100 101 102
(1 + σ)/(1− σ)
101
102
103
104
iter
ati
on
sto
con
verg
eκ = 10, B = 1, α optimizes LMI
all n
lower bound
• the bound in (Nedić, Olshevsky, and Shi, 2017) is vacuous
17
-
DIGing
100 101 102
(1 + σ)/(1− σ)
101
102
103
104
iter
ati
on
sto
con
verg
eκ = 10, α optimizes LMI
B = 4
B = 3
B = 2
B = 1
lower bound
18
-
Algorithm comparison
100 101 102
(1 + σ)/(1− σ)
101
102
103
104
iter
ati
on
sto
con
verg
eκ = 10, B = 1, α optimizes LMI
EXTRA
NIDS
DIGing
AugDGM
ExDIFF
uEXTRA
uDIG
SVL
lower bound
SVL is designed to optimize the worst-case performance when B = 1.
19Sundararajan, Van Scoy, and Lessard (2020)
-
Algorithm comparison
100 101 102
(1 + σ)/(1− σ)
101
102
103
104
iter
ati
on
sto
con
verg
eκ = 10, B = 2, α optimizes LMI
EXTRA
NIDS
DIGing
AugDGM
ExDIFF
uEXTRA
uDIG
SVL
lower bound
20
-
Algorithm comparison
100 101 102
(1 + σ)/(1− σ)
101
102
103
104
iter
ati
on
sto
con
verg
eκ = 10, B = 3, α optimizes LMI
EXTRA
NIDS
DIGing
AugDGM
ExDIFF
uEXTRA
uDIG
SVL
lower bound
21
-
Algorithm comparison
100 101 102
(1 + σ)/(1− σ)
101
102
103
104
iter
ati
on
sto
con
verg
eκ = 10, B = 4, α optimizes LMI
EXTRA
NIDS
DIGing
AugDGM
ExDIFF
uEXTRA
uDIG
SVL
lower bound
22
-
Summary
main contribution = sharp and systematic analysis
open questions:
• is SVL asymptotically optimal in the limit as σ → 1?• what is the optimal algorithm for B > 1?
https://vanscoy.github.io
23
https://vanscoy.github.io