Separating Doubly Nonnegative and Completely Positive Matrices

Kurt M. Anstreicher
Dept. of Management Sciences, University of Iowa
(joint work with Hongbo Dong)

INFORMS National Meeting, Austin, November 2010
CP and DNN matrices

Let Sn denote the set of n × n real symmetric matrices, S+n the cone of n × n real symmetric positive semidefinite matrices, and Nn the cone of n × n symmetric nonnegative matrices.

• The cone of n × n doubly nonnegative (DNN) matrices is then Dn = S+n ∩ Nn.

• The cone of n × n completely positive (CP) matrices is Cn = {X | X = AAᵀ for some n × k nonnegative matrix A}.

• The dual of Cn is the cone of n × n copositive matrices, C∗n = {X ∈ Sn | yᵀXy ≥ 0 ∀ y ∈ ℝn+}.

It is clear that

Cn ⊆ Dn,  D∗n = S+n + Nn ⊆ C∗n,

and these inclusions are in fact strict for n > 4.
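The memberships above are easy to test numerically. Below is a minimal sketch (the name `is_dnn` and the tolerance are ours): a matrix is in Dn iff it is symmetric PSD with nonnegative entries, and any X = AAᵀ with A ≥ 0 is CP, hence also DNN by the inclusion Cn ⊆ Dn.

```python
import numpy as np

def is_dnn(X, tol=1e-9):
    """Check membership in Dn = S+n ∩ Nn (symmetric PSD with nonnegative entries)."""
    X = np.asarray(X, dtype=float)
    if not np.allclose(X, X.T, atol=tol):
        return False
    psd = np.linalg.eigvalsh(X).min() >= -tol   # X ∈ S+n
    nonneg = X.min() >= -tol                    # X ∈ Nn
    return psd and nonneg

# Any CP matrix X = A Aᵀ with A ≥ 0 must be DNN (Cn ⊆ Dn).
rng = np.random.default_rng(0)
A = rng.random((5, 3))          # nonnegative 5 × 3 factor
X = A @ A.T                     # X ∈ C5
assert is_dnn(X)
```

Note that the converse check fails for n > 4: a DNN matrix need not be CP, which is exactly the separation problem addressed in this talk.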
Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using a matrix V ∈ C∗n having V • X < 0.

Why Bother?

• [Bur09] shows that a broad class of NP-hard problems can be posed as linear optimization problems over Cn.

• Dn is a tractable relaxation of Cn. We expect the solution of the relaxed problem to be some X ∈ Dn \ Cn.

• Note that the least n where this problem occurs is n = 5.
For X ∈ Sn let G(X) denote the undirected graph on vertices {1, . . . , n} with edges {{i, j} | i ≠ j, Xij ≠ 0}.

Definition 1. Let G be an undirected graph on n vertices. Then G is called a CP graph if any matrix X ∈ Dn with G(X) = G also has X ∈ Cn.
The main result on CP graphs is the following:
Proposition 1. [KB93] An undirected graph on n vertices is a CP graph if and only if it contains no odd cycle of length 5 or greater.
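For small graphs, the condition in Proposition 1 can be checked by brute force. The sketch below (function names are ours; it enumerates cycles explicitly, so it is only practical for small n) tests whether a graph contains an odd cycle of length 5 or greater.

```python
from itertools import combinations, permutations

def has_long_odd_cycle(n, edges):
    """Brute-force search for an odd cycle of length >= 5 (small n only)."""
    E = {frozenset(e) for e in edges}
    for k in range(5, n + 1, 2):                   # candidate odd lengths 5, 7, ...
        for verts in combinations(range(n), k):
            for order in permutations(verts[1:]):  # fix verts[0] to skip rotations
                cycle = (verts[0],) + order
                if all(frozenset((cycle[i], cycle[(i + 1) % k])) in E
                       for i in range(k)):
                    return True
    return False

def is_cp_graph(n, edges):
    """Proposition 1 [KB93]: G is a CP graph iff it has no odd cycle of length >= 5."""
    return not has_long_odd_cycle(n, edges)

C5 = [(i, (i + 1) % 5) for i in range(5)]        # 5-cycle: not a CP graph
K4 = list(combinations(range(4), 2))             # complete graph on 4 vertices: CP
assert not is_cp_graph(5, C5)
assert is_cp_graph(4, K4)
```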
In [BAD09] it is shown that:

• Extreme rays of D5 are either rank-one matrices in C5, or rank-three "extremely bad" matrices where G(X) is a 5-cycle (every vertex in G(X) has degree two).
• Any such extremely bad matrix can be separated from C5 by a transformation of the Horn matrix

H :=
[  1  −1   1   1  −1 ]
[ −1   1  −1   1   1 ]
[  1  −1   1  −1   1 ]
[  1   1  −1   1  −1 ]
[ −1   1   1  −1   1 ]   ∈ C∗5 \ D∗5.
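As a numerical sanity check of the matrix as entered, the sketch below samples nonnegative vectors to exercise copositivity of H (sampling cannot prove copositivity; it only checks the entries), and confirms H is not PSD, so H ∉ S+5. (That H ∉ D∗5 = S+5 + N5 is a stronger fact stated above.)

```python
import numpy as np

# The Horn matrix.
H = np.array([[ 1, -1,  1,  1, -1],
              [-1,  1, -1,  1,  1],
              [ 1, -1,  1, -1,  1],
              [ 1,  1, -1,  1, -1],
              [-1,  1,  1, -1,  1]], dtype=float)

# H is copositive: yᵀHy >= 0 for every y >= 0.  Random sampling is only a
# sanity check, not a proof.
rng = np.random.default_rng(1)
for _ in range(10000):
    y = rng.random(5)
    assert y @ H @ y >= -1e-12

# H has a negative eigenvalue, so H is not PSD.
assert np.linalg.eigvalsh(H).min() < 0
```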
In [DA10] it is shown that:

• A separation procedure based on the transformed Horn matrix applies to X ∈ D5 \ C5 where X has rank three and G(X) has at least one vertex of degree 2.

• A more general separation procedure applies to any X ∈ D5 \ C5 that is not componentwise strictly positive.

An even more general separation procedure that applies to any X ∈ D5 \ C5 is described in [BD10]. In this talk we describe the procedure from [DA10] for X ∈ D5 \ C5, X ≯ 0, and its generalization to larger matrices having block structure.
A separation procedure for the 5 × 5 case

Assume that X ∈ D5, X ≯ 0. After a symmetric permutation and diagonal scaling, X may be assumed to have the form

X = [ X11  α1  α2 ]
    [ α1ᵀ   1   0 ]     (1)
    [ α2ᵀ   0   1 ],

where X11 ∈ D3.

Theorem 1. [BX04, Theorem 2.1] Let X ∈ D5 have the form (1). Then X ∈ C5 if and only if there are matrices A11 and A22 such that X11 = A11 + A22, and

[ Aii  αi ]
[ αiᵀ   1 ] ∈ D4,   i = 1, 2.

In [BX04], Theorem 1 is utilized only as a proof mechanism, but we now show that it has algorithmic consequences as well.
Theorem 2. Assume that X ∈ D5 has the form (1). Then X ∈ D5 \ C5 if and only if there is a matrix

V = [ V11  β1  β2 ]
    [ β1ᵀ  γ1   0 ]
    [ β2ᵀ   0  γ2 ]

such that

[ V11  βi ]
[ βiᵀ  γi ] ∈ D∗4,   i = 1, 2,

and V • X < 0.

• Suppose that X ∈ D5 \ C5 and V are as in Theorem 2. If X̄ ∈ C5 is another matrix of the form (1), then Theorem 1 implies that V • X̄ ≥ 0.

• However, we cannot conclude that V ∈ C∗5, because V • X̄ ≥ 0 holds only for X̄ of the form (1); in particular, x̄45 = 0.

• Fortunately, by [HJR05, Theorem 1], V can easily be "completed" to obtain a copositive matrix that still separates X from C5.
Theorem 3. Suppose that X ∈ D5 \ C5 has the form (1), and V satisfies the conditions of Theorem 2. Define

V(s) = [ V11  β1  β2 ]
       [ β1ᵀ  γ1   s ]
       [ β2ᵀ   s  γ2 ].

Then V(s) • X < 0 for any s, and V(s) ∈ C∗5 for s ≥ √(γ1γ2).
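The completion in Theorem 3 is purely structural and easy to apply. The sketch below (the helper `complete_V` and all the toy numbers are ours, not a certified cut) assembles V(s) from the blocks and checks the fact underlying "V(s) • X < 0 for any s": since any X of the form (1) has x45 = x54 = 0, the value V(s) • X does not depend on s.

```python
import numpy as np

def complete_V(V11, b1, b2, g1, g2, s):
    """Assemble the 5x5 matrix V(s) from Theorem 3."""
    V = np.zeros((5, 5))
    V[:3, :3] = V11
    V[:3, 3], V[3, :3] = b1, b1
    V[:3, 4], V[4, :3] = b2, b2
    V[3, 3], V[4, 4] = g1, g2
    V[3, 4] = V[4, 3] = s          # the completed (4,5) entry
    return V

# Toy data, illustrative only.
V11 = np.diag([1.0, 2.0, 3.0])
b1, b2 = np.array([-0.5, 0.1, 0.0]), np.array([0.2, -0.3, 0.1])
g1, g2 = 1.0, 4.0

# Any X of the form (1) has x45 = x54 = 0, so V(s) . X is independent of s.
X = np.eye(5)
X[:3, 3] = X[3, :3] = [0.1, 0.2, 0.0]
X[:3, 4] = X[4, :3] = [0.0, 0.1, 0.3]

vals = [np.sum(complete_V(V11, b1, b2, g1, g2, s) * X)
        for s in (0.0, np.sqrt(g1 * g2), 10.0)]
assert np.allclose(vals, vals[0])
```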
Separation for larger matrices with block structure

The procedure for the 5 × 5 case where X ≯ 0 can be generalized to larger matrices with block structure. Assume X has the form

X = [ X11   X12  X13  ...  X1k ]
    [ X12ᵀ  X22   0   ...   0  ]
    [ X13ᵀ   0   X33  ...   0  ]     (2)
    [  ...   ...   ...  ...   0  ]
    [ X1kᵀ   0   ...   0   Xkk ],

where k ≥ 3, each Xii is an ni × ni matrix, and n1 + · · · + nk = n.
Lemma 1. Suppose that X ∈ Dn has the form (2), k ≥ 3, and let

Xi = [ X11   X1i ]
     [ X1iᵀ  Xii ],   i = 2, . . . , k.

Then X ∈ Cn if and only if there are matrices Aii, i = 2, . . . , k, such that A22 + · · · + Akk = X11, and

[ Aii   X1i ]
[ X1iᵀ  Xii ] ∈ Cn1+ni,   i = 2, . . . , k.

Moreover, if G(Xi) is a CP graph for each i = 2, . . . , k, then the above statement remains true with Cn1+ni replaced by Dn1+ni.
Theorem 4. Suppose that X ∈ Dn \ Cn has the form (2), where G(Xi) is a CP graph, i = 2, . . . , k. Then there is a matrix V, also of the form (2), such that

[ V11   V1i ]
[ V1iᵀ  Vii ] ∈ D∗n1+ni,   i = 2, . . . , k,

and V • X < 0. Moreover, if γi = Diag(Vii)^.5 (the entrywise square root of the diagonal of Vii), then the matrix

V̄ = [ V11   . . .  V1k ]
     [ ...   . . .  ... ]
     [ V1kᵀ  . . .  Vkk ],

where V̄ij = γiγjᵀ for 2 ≤ i ≠ j ≤ k, has V̄ ∈ C∗n and V̄ • X = V • X < 0.
Note that:

• The matrix X may have numerical entries that are small but not exactly zero. We can then apply Lemma 1 to a perturbed matrix X̄ in which entries of X below a specified tolerance are set to zero. If a cut V separating X̄ from Cn is found, then V • X ≈ V • X̄ < 0, and V is very likely to also separate X from Cn.

• Theorem 4 may provide a cut separating a given X ∈ Dn \ Cn even when the sufficient conditions for generating such a cut are not satisfied. In particular, a cut may be found even when the condition that G(Xi) is a CP graph for each i fails.
A second case where block structure can be used to generate cuts for a matrix X ∈ Dn \ Cn is when X has the form

X = [  I    X12   X13  ...  X1k     ]
    [ X12ᵀ   I    X23  ...  X2k     ]
    [ X13ᵀ  X23ᵀ   I   ...  ...     ]     (3)
    [  ...   ...    ...  ...  X(k−1)k ]
    [ X1kᵀ  X2kᵀ  ...  X(k−1)kᵀ  I  ],

where k ≥ 2, each Xij is an ni × nj matrix, and n1 + · · · + nk = n.

The structure in (3) corresponds to a partitioning of the vertices {1, 2, . . . , n} into k stable sets in G(X), of sizes n1, . . . , nk (note that ni = 1 is allowed).
Applications

Example 1: Consider the box-constrained QP problem

(QPB)   max  xᵀQx + cᵀx
        s.t. 0 ≤ x ≤ e.

To describe a CP formulation of QPB, define matrices

Y = [ 1  xᵀ ]        Y⁺ = [ 1  xᵀ  sᵀ ]
    [ x  X  ],             [ x  X   Z  ]
                           [ s  Zᵀ  S  ].

By the result of [Bur09], QPB can then be written in the form

(QPBCP)   max  Q • X + cᵀx
          s.t. x + s = e,  Diag(X + 2Z + S) = e,
               Y⁺ ∈ C2n+1.
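As a sanity check on this formulation, any feasible x for QPB lifts to a feasible point of QPBCP: with s = e − x, the rank-one matrix Y⁺ = aaᵀ for a = (1, x, s) is completely positive (a is nonnegative), the diagonal constraint holds since xi² + 2xisi + si² = (xi + si)² = 1, and the objective values agree. A small numpy sketch with assumed toy data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2
c = rng.standard_normal(n)
x = rng.random(n)              # any feasible point: 0 <= x <= e
s = 1.0 - x                    # slack, so x + s = e

a = np.concatenate(([1.0], x, s))   # nonnegative, hence Y+ = a aᵀ ∈ C_{2n+1}
Yp = np.outer(a, a)
X, Z, S = Yp[1:n+1, 1:n+1], Yp[1:n+1, n+1:], Yp[n+1:, n+1:]

assert np.allclose(x + s, 1.0)                      # x + s = e
assert np.allclose(np.diag(X + 2*Z + S), 1.0)       # Diag(X + 2Z + S) = e
assert np.isclose(np.sum(Q * X) + c @ x,            # Q . X + cᵀx
                  x @ Q @ x + c @ x)                # equals xᵀQx + cᵀx
```

This only verifies that QPBCP relaxes QPB; the nontrivial direction in [Bur09] is that the optimal values coincide.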
Replacing C2n+1 with D2n+1 gives a tractable DNN relaxation.
• It can be shown that the DNN relaxation is equivalent to the "SDP+RLT" relaxation, and for n = 2 the relaxed problem is equivalent to QPB [AB10].

• Constraints from the Boolean Quadric Polytope (BQP) are valid for the off-diagonal components of X [BL09]. For n = 3, the BQP is completely determined by triangle inequalities and RLT constraints. However, there can still be a gap when using the SDP+RLT+TRI relaxation.

• An example from [BL09] with n = 3 has optimal value 1.0 for QPB and value 1.093 for the SDP+RLT+TRI relaxation. The solution matrix Y⁺ has a 5 × 5 principal submatrix that is not strictly positive and is not CP. We can obtain a cut from Theorem 3, re-solve the problem, and repeat.
[Figure 1: Gap to optimal value for the Burer-Letchford QPB problem (n = 3). Log-scale gap to the optimal value (10⁻⁹ to 10⁰) versus number of CP cuts added (0 to 25).]
Example 2: Let A be the adjacency matrix of a graph G on n vertices, and let α be the maximum size of a stable set in G. It is known [dKP02] that

α⁻¹ = min{ (I + A) • X : eeᵀ • X = 1, X ∈ Cn }.   (4)

Relaxing Cn to Dn results in the Lovász-Schrijver bound

(ϑ′)⁻¹ = min{ (I + A) • X : eeᵀ • X = 1, X ∈ Dn }.   (5)
Let G12 be the complement of the graph corresponding to the vertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈ 3.24.

• Using the cone K¹12 to approximate the dual of (4) provides no improvement [BdK02].

• For the solution matrix X ∈ D12 from (5), we cannot find a cut based on the first block structure (2). However, we can find a cut based on (3). Adding this cut and re-solving, the gap to 1/α = 1/3 is approximately 2 × 10⁻⁸.
References
[AB10] Kurt M. Anstreicher and Samuel Burer, Computable representations for convex hulls of low-dimensional quadratic forms, Math. Prog. B 124 (2010), 33–43.

[BAD09] Samuel Burer, Kurt M. Anstreicher, and Mirjam Dür, The difference between 5 × 5 doubly nonnegative and completely positive matrices, Linear Algebra Appl. 431 (2009), 1539–1552.

[BD10] Samuel Burer and Hongbo Dong, Separation and relaxation for cones of quadratic forms, Dept. of Management Sciences, University of Iowa (2010).

[BdK02] Immanuel M. Bomze and Etienne de Klerk, Solving standard quadratic optimization problems via linear, semidefinite and copositive programming, J. Global Optim. 24 (2002), 163–185.

[BL09] Samuel Burer and Adam N. Letchford, On nonconvex quadratic programming with box constraints, SIAM J. Optim. 20 (2009), 1073–1089.

[Bur09] Samuel Burer, On the copositive representation of binary and continuous nonconvex quadratic programs, Math. Prog. 120 (2009), 479–495.

[BX04] Abraham Berman and Changqing Xu, 5 × 5 completely positive matrices, Linear Algebra Appl. 393 (2004), 55–71.

[DA10] Hongbo Dong and Kurt Anstreicher, Separating doubly nonnegative and completely positive matrices, Dept. of Management Sciences, University of Iowa (2010).

[dKP02] Etienne de Klerk and Dmitrii V. Pasechnik, Approximation of the stability number of a graph via copositive programming, SIAM J. Optim. 12 (2002), 875–892.

[HJR05] Leslie Hogben, Charles R. Johnson, and Robert Reams, The copositive completion problem, Linear Algebra Appl. 408 (2005), 207–211.

[KB93] Natalia Kogan and Abraham Berman, Characterization of completely positive graphs, Discrete Math. 114 (1993), 297–304.