method and approximation algorithms i - d steurer · for approximation algorithms: need...
TRANSCRIPT
Cargese Workshop, 2014
SUM-OF-SQUARES method and approximation algorithms I
David SteurerCornell
meta-task
given: functions ð1, ⊠, ðð: ±1ð â â
find: solution ð¥ â ±1 ð to ð1 = 0,⊠, ðð = 0
encoded as low-degree polynomial in â ð¥
example: ð(ð¥) = ð,ðâ ð ð€ðð â ð¥ð â ð¥ð2
examples: combinatorial optimization problem on graph ðº
MAX CUT: ð¿ðº = 1 â í over ±1 ð
MAX BISECTION: ð¿ðº = 1 â í, ð ð¥ð = 0 over ±1 ð
where 1 â í is guess for optimum value
Laplacian ð¿ðº =1
ðž ðº ððâðž ðº
1
4ð¥ð â ð¥ð
2
goal: develop SDP-based algorithms with provable guarantees in terms of complexity and approximation
(âon the edge intractabilityâ need strongest possible relaxations)
price of convexity: individual solutions distributions over solutions
price of tractability: can only enforce âefficiently checkable knowledgeâ about solutions
individual solutions
distributions over solutions
âpseudo-distributions over solutionsâ(consistent with efficiently checkable knowledge)
given: functions ð1, ⊠, ðð: ±1ð â â
find: solution ð¥ â ±1 ð to ð1 = 0,⊠, ðð = 0
meta-task
goal: develop SDP-based algorithms with provable guarantees in terms of complexity and approximation
distribution ð· over ±1 ð
function ð·: ±1 ð â â
non-negativity: ð· ð¥ ⥠0 for all ð¥ â ±1 ð
normalization: ð¥â ±1 ð· ð¥ = 1
distribution ð· satisfies ð1 = 0,⊠, ðð = 0 for some ðð: ±1ð â â
ðŒð·ð12 +â¯+ ðð
2 = 0 (equivalently: âð· âð. ðð â 0 = 0)
examplesuniform distribution: ð· = 2âð
fixed 2-bit parity: ð· ð¥ = (1 + ð¥1ð¥2)/2ð
examplesfixed 2-bit parity distribution satisfies ð¥1ð¥2 = 1uniform distribution does not satisfy ð = 0 for any ð â 0
convex: ð·,ð·â² satisfy conditions ð· + ð·â² /2 satisfies conditions
# function values is exponential need careful representation
# independent inequalities is exponential not efficiently checkable
distribution ð· over ±1 ð
function ð·: ±1 ð â â
non-negativity: ð· ð¥ ⥠0 for all ð¥ â ±1 ð
normalization: ð¥â ±1 ð· ð¥ = 1
distribution ð· satisfies ð1 = 0,⊠, ðð = 0 for some ðð: ±1ð â â
ðŒð·ð12 +â¯+ ðð
2 = 0 (equivalently: âð· âð. ðð â 0 = 0)
deg.-ð pseudo-distribution ð·
ð¥â ±1 ðð· ð¥ ð ð¥ 2 ⥠0 for
every deg.-ð/2 polynomial ð
convenient notation: ðŒð·ð â ð¥ð· ð¥ ð ð¥âpseudo-expectation of ð under ð·â
ðŒð·
deg.-2ð pseudo-distributions are actual distributions
(point-indicators ð ð¥ have deg. ð ð· ð¥ = ðŒð·ð ð¥2 ⥠0)
pseudo-
deg.-ð pseudo-distr. ð·: ±1 ð â â
non-negativity: ðŒð·ð2 ⥠0 for every deg.-ð/2 poly. ð
normalization: ðŒð·1 = 1
pseudo-distr. ð· satisfies ð1 = 0,⊠, ðð = 0 for some ðð: ±1ð â â
ðŒð·ð12 +â¯+ ðð
2 = 0 (equivalently: ðŒð·ðð â ð = 0 whenever deg ð †ð â deg ðð)
notation: ðŒð·ð â ð¥ð· ð¥ ð ð¥ , âpseudo-expectation of ð under ð·â
deg.-ð pseudo-distr. ð·: ±1 ð â â
non-negativity: ðŒð·ð2 ⥠0 for every deg.-ð/2 poly. ð
normalization: ðŒð·1 = 1
pseudo-distr. ð· satisfies ð1 = 0,⊠, ðð = 0 for some ðð: ±1ð â â
ðŒð·ð12 +â¯+ ðð
2 = 0 (equivalently: ðŒð·ðð â ð = 0 whenever deg ð †ð â deg ðð)
notation: ðŒð·ð â ð¥ð· ð¥ ð ð¥ , âpseudo-expectation of ð under ð·â
claim: can compute such ð· in time ðð(ð) if it exists (otherwise, certify that no solution to original problem exists)
(can assume ð· is deg.-ð polynomial separation problem minð
ðŒð·ð2 is ðð-
dim. eigenvalue prob. ðð(ð)-time via grad. descent / ellipsoid method)
[Shor, Parrilo, Lasserre]
surprising property: ðŒð·ð ⥠0 for many* low-degree polynomials ðsuch that ð ⥠0 follows from ð1 = 0,⊠, ðð = 0 by âexplicit proofâ
soon: examples of such properties and how to exploit them
deg.-ð pseudo-distr. ð·: ±1 ð â â
non-negativity: ðŒð·ð2 ⥠0 for every deg.-ð/2 poly. ð
normalization: ðŒð·1 = 1
pseudo-distr. ð· satisfies ð1 = 0,⊠, ðð = 0 for some ðð: ±1ð â â
ðŒð·ð12 +â¯+ ðð
2 = 0 (equivalently: ðŒð·ðð â ð = 0 whenever deg ð †ð â deg ðð)
notation: ðŒð·ð â ð¥ð· ð¥ ð ð¥ , âpseudo-expectation of ð under ð·â
surprising property: ðŒð·ð ⥠0 for many* low-degree polynomials ðsuch that ð ⥠0 follows from ð1 = 0,⊠, ðð = 0 by âexplicit proofâ
deg.-ð pseudo-distr. ð·: ±1 ð â â
non-negativity: ðŒð·ð2 ⥠0 for every deg.-ð/2 poly. ð
normalization: ðŒð·1 = 1
pseudo-distr. ð· satisfies ð1 = 0,⊠, ðð = 0 for some ðð: ±1ð â â
ðŒð·ð12 +â¯+ ðð
2 = 0 (equivalently: ðŒð·ðð â ð = 0 whenever deg ð †ð â deg ðð)
notation: ðŒð·ð â ð¥ð· ð¥ ð ð¥ , âpseudo-expectation of ð under ð·â
soon: examples of such properties and how to exploit them
emerging algorithm-design paradigm:analyze algorithm pretending that underlying actual distribution exists; verify only afterwards that low-deg. pseudo-distr.âs satisfy required properties
pseudo-distr. over optimal solutions
approximate solution (to original problem)
efficient algorithm
deg.-ð part of actual distr. over optimal solutions
ðð(ð)-time algorithms cannot* distinguish between deg.-ð pseudo-distributions and deg.-ð part of actual distr.âs
dual view (sum-of-squares proof system)
eitherâ deg.-ð pseudo-distribution ð· over ±1 ð satisfying ð1 = 0,⊠, ðð = 0
or â ð1, ⊠, ðð and â1, ⊠, âð such that ð ðð â ðð + ð âð
2 = â1 over ±1 ð
and deg ðð + degðð †ð and deg âð †ð/2
derivation of unsatisfiable constraint â1 ⥠0from ð1 = 0,⊠, ðð = 0 over ±1 ð
â1
ð·ðð1ð2ðð
ðŸð = ð = ð ðð â ðð + ð âð2
ðŸð
if â1 â ðŸð then â separating hyperplane ð·with ðŒð· â 1 = â1 and ðŒð·ð ⥠0 for all ð â ðŸð
pseudo-distribution satisfies all local properties of ±ð ð
claimsuppose ð ⥠0 is ð/2-junta over ±1 ð (depends on †ð/2 coordinates)then, ðŒð·ð ⥠0
proof: ð has degree †ð/2 ðŒð·ð = ðŒð· ð2⥠0
corollaryfor any set ð of †ð coordinates, marginal ð·â² = ð¥ð ð· is actual distribution
ð·â² ð¥ð =
ð¥ ð âð
ð· ð¥ð, ð¥ ð âð = ðŒð·ð ð¥ð ⥠0
ð-junta
(also captured by LP methods, e.g., SheraliâAdams hierarchies ⊠)
example: triangle inequalities over ±1 ð
ðŒð· ð¥ð â ð¥ð2+ ð¥ð â ð¥ð
2â ð¥ð â ð¥ð
2 ⥠0
conditioning pseudo-distributions
claim
âð â ð , ð â ±1 . ð·â² = ð¥ ⣠ð¥ð = ðð·
is deg.- ð â 2 pseudo-distr.
proof
ð·â² ð¥ =1
âð· ð¥ð=ðð· ð¥ â 1 ð¥ð=ð
ðŒð·â²ð2 â ðŒð·1 ð¥ð=ðð2 = ðŒð· 1 ð¥ð=ð
ð2⥠0
deg ð †(ð â 2)/2 degð ð¥ð=ðð †ð/2
(also captured by LP methods, e.g., SheraliâAdams hierarchies ⊠)
pseudo-covariances are covariances of distributions over âð
claimthere exists a (Gaussian) distr. ð over âð such that
ðŒð·ð¥ = ðŒ ð and ðŒð·ð¥ð¥ð = ðŒ ððð
let ð = ðŒð·ð¥ and ð = ðŒð· ð¥ â ð ð¥ â ð ð
choose ð to be Gaussian with mean ð and covariance ð
matrix ð p.s.d. because ð£ððð£ = ðŒð· ð£ðð¥ 2 ⥠0 for all ð£ â âð
consequence: ðŒð·ð = ðŒ ð ð
for every ð of deg. 2
square of linear form
proof
claimfor every univariate ð ⥠0 over â and every ð-variate polynomial ðwith deg ð â deg ð †ð,
ðŒð·ð ð ð¥ ⥠0
enough to show: ð is sum of squares
choose: minimizer ðŒ of ð
proof by induction on deg ð
squares sum of squares by ind. hyp.
ðŒ
ð ðŒ ⥠0
then: p= ð ðŒ + ð¥ â ðŒ 2 â ðâ² for some polynomial ðâ² with deg ðâ² < deg ð
â
pseudo-distr.âs satisfy (compositions of) low-deg. univariate properties
useful class of non-local higher-deg. inequalities
ð
MAX CUT
given: deg.-ð pseudo-distr. ð· over ±1 ð, satisfies ð¿ðº = 1 â í
ð¿ðº =1
ðž ðº ððâðž ðº
1
4ð¥ð â ð¥ð
2
goal: find ðŠ â ±1 ð with ð¿ðº ðŠ ⥠1 â ð í
algorithm
sample from Gaussian distr. ð over âð with ðŒ ððð = ðŒð· ð¥ð¥ð
output ðŠ = sgn ð
analysis
claim: âð· ð¥ð â ð¥ð = 1 â ð â â ðŠð â ðŠð ⥠1 â ð ð
proof: ðð , ðð satisfies âðŒ ðððð = â ðŒð·ð¥ðð¥ð = 1 â ð ð and ðŒðð2 = ðŒðð
2 = 1
(tedious calculation) â sgn ðð â sgn ðð ⥠1 â ð ð
[Goeman-Williamson]
low global correlation in (pseudo-)distributions
claimâð. â deg.- ð â 2ð pseudo-distribution ð·â², obtained by conditioning ð·,
Avgð,ðâ ð ðŒð·â² ð¥ð , ð¥ð †1/ð
[Barak-Raghavendra-S.,Raghavendra-Tan]
proofpotential Avgðâ ð ð» ð¥ð ; greedily condition on variables to maximize
potential decrease until global correlation is low
mutual information: ðŒ ð¥, ðŠ = ð» ð¥ â ð» ð¥ ðŠ
potential decrease ⥠Avgðâ ð ð» ð¥ð â Avgðâ ð Avgðâ ð ð» ð¥ð ⣠ð¥ð= Avgð,ðâ ð ðŒð·â² ð¥ð , ð¥ð
how often do we need to condition?
only need to condition †ð times
MAX BISECTION
given: deg.-ð pseudo-distr. ð· over ±1 ð, satisfies ð¿ðº = 1 â í, ð ð¥ð = 0
goal: find ðŠ â ±1 ð with ð¿ðº ðŠ ⥠1 â ð í and ð ðŠð = 0
ð = 1/íð 1
algorithm
let ð·â² be conditioning of ð· with global correlation †íð 1
sample Gaussian ð with same deg.-2 moments as ð·â²output ðŠ with ðŠð = sgn(ðð â ð¡ð) (choose ð¡ð â â so that ðŒ ðŠð = ðŒð·ð¥ð)
analysis
almost as before:âð·â² ð¥ð â ð¥ð ⥠1 â ð â â ðŠð â ðŠð ⥠1 â ð ð
(ð¡ð = 0 is worst case same analysis as MAX CUT)
new: ðŒ ð¥ð , ð¥ð †íð 1 â ðŒðŠððŠð = ðŒ ð¥ðð¥ð ± íð(1)
ðŒ ð ðŠð †ðŒ ð ðŠð2 1/2 = ðŒ ð ð¥ð
2 1/2+ íð 1 â ð = íð 1 â ð
get bisection ðŠâ² from ðŠ by correcting íð(1) fraction of vertices
[Raghavendra-Tan]
ðŒ ð ð¥ð2 = 0
Cargese Workshop, 2014
SUM-OF-SQUARES method and approximation algorithms II
David SteurerCornell
sparse vector
given: linear subspace ð â âð (represented by some basis), parameter ð â ð
promise: âð£0 â ð such that ð£0 is ð-sparse (and ð£0 â 0,±1 ð)goal: find ð-sparse vector ð£ â ð
efficient approximation algorithm for ð = Ω ð would be major step toward refuting Khotâs Unique Games Conjecture and improved guarantees for MAX CUT, VERTEX COVER, âŠ
planted / average-case version (benchmark for unsupervised learning tasks)
subspace ð spanned by ð â 1 random vectors and some ð-sparse vector ð£0
previous best algorithms only work for very sparse vectors ð
ð†1/ ð
[Spielman-Wang-Wright, Demanet-Hand]
here: deg.-4 pseudo-distributions work for ð
ð= Ω ð up to ð †ð ð
[Barak-Kelner-S.]
limitations of ââ/â1 (previous best algorithm; exact via linear programming)
limitations of std. SDP relaxation for â2/â1 (best proxy for sparsity)
analytical proxy for sparsity
if vector ð£ is ð-sparse then ð£ â
ð£ 1â¥
1
ð, ð£ 2
2
ð£ 12 â¥
1
ð, and
ð£ 44
ð£ 24 â¥
1
ð
(tight if ð£ â 0,±1 ð)
ð£ = sum of ð random ±1 vectors with same first coordinate
â âð£ â ⥠ð , â âð£ 1 †ð + ð ð ratio âð
ð
ââ/â1 algorithm fails for ð
ðâ¥
1
ð
âideal objectâ: distribution ð· over â2 unit sphere of subspace ðâ1-constraint: ðŒð· ð£ 1
2 †ð
tractable relaxation: ð,ð ðŒð·ð£ðð£ð †ð
not a low-deg. polynomial in ð£ unclear how to represent(also NP-hard in worst-case)
[d'Aspremont-El Ghaoui-Jordan-Lanckriet]
but: for uniform distr. ð· over â2 sphere of ð-dim. rand. subspace
ð,ð ðŒð·ð£ðð£ð âð
ð same limitation as ââ/â1
deg.-ð pseudo-distr. ð·: ð£ â ð; ð£ 2 = 1 â â over unit â2-sphere of ð
degree-ð SOS relaxation for â4/â2
pseudo-distribution satisfies ð£ 44 = 1/ð
notation: ðŒð·ð â ð£âð;ð£ =1
ð· â ð (only consider polynomials easy to integrate)
normalization: ðŒð·1 = 1
non-negativity: ðŒð·â(ð£)2 ⥠0 for every â of deg. †ð/2
orthogonality: ðŒð· ð£ 44 â
1
ðâ ð(ð£) = 0 for every ð of deg. †ð â 4
set of deg.-ð pseudo-distributions
= convex set with ðð ð -time separation oracle
separation problemgiven: function ð· (represented as deg.-ð polynomial)
check: quadratic form ð ⊠ðŒð·ð2 is p.s.d. or output
violated constraint ðŒð·ð2 < 0
how to find pseudo-distributions?
deg.-ð pseudo-distr. ð·: ð£ â ð; ð£ 2 = 1 â â over unit â2-sphere of ð
degree-ð SOS relaxation for â4/â2
pseudo-distribution satisfies ð£ 44 = 1/ð
notation: ðŒð·ð â ð£âð;ð£ =1
ð· â ð (only consider polynomials easy to integrate)
normalization: ðŒð·1 = 1
non-negativity: ðŒð·â(ð£)2 ⥠0 for every â of deg. †ð/2
orthogonality: ðŒð· ð£ 44 â
1
ðâ ð(ð£) = 0 for every ð of deg. †ð â 4
rule of thumb: set of deg.-ð pseudo-moments ðŒð·ð ⣠deg ð †ð difficult*
to distinguish / separate from deg.-ð moments of actual distr. of solutions
(* unless you invest ðΩ ð time to distinguish)
also: values ðŒð·ð ⣠deg ð > ð do not carry additional information
no need to look at them
how to use pseudo-distributions?
deg.-ð pseudo-distr. ð·: ð£ â ð; ð£ 2 = 1 â â over unit â2-sphere of ð
degree-ð SOS relaxation for â4/â2
pseudo-distribution satisfies ð£ 44 = 1/ð
notation: ðŒð·ð â ð£âð;ð£ =1
ð· â ð (only consider polynomials easy to integrate)
normalization: ðŒð·1 = 1
non-negativity: ðŒð·â(ð£)2 ⥠0 for every â of deg. †ð/2
orthogonality: ðŒð· ð£ 44 â
1
ðâ ð(ð£) = 0 for every ð of deg. †ð â 4
dual view (SOS certificates)
ð£ 44 â
1
ðâ ð + ð âð
2 = â1 over ð£ â ð; ð£ 2 = 1
for some ð of deg. †ð â 4 and {âð} of deg. †ð/2
â no deg.-ð pseudo-distr. exists ( no solution exists)
for approximation algorithms: need pseudo-distr. to extract approx. solution(hard to exploit non-existence of SOS certificate directly)
deg.-ð pseudo-distr. ð·: ð£ â ð; ð£ 2 = 1 â â over unit â2-sphere of ð
degree-ð SOS relaxation for â4/â2
pseudo-distribution satisfies ð£ 44 = 1/ð
notation: ðŒð·ð â ð£âð;ð£ =1
ð· â ð (only consider polynomials easy to integrate)
normalization: ðŒð·1 = 1
non-negativity: ðŒð·â(ð£)2 ⥠0 for every â of deg. †ð/2
orthogonality: ðŒð· ð£ 44 â
1
ðâ ð(ð£) = 0 for every ð of deg. †ð â 4
CauchyâSchwarz inequality
Hölderâs inequality
â4-triangle inequality
ðŒð· ð¢, 𣠆ðŒð· ð¢ 2 1/2 ðŒð· ð£ 2 1/2
let ð· = ð¢, ð£ be a deg.-4 pseudo-distribution over âð à âð
ðŒð· ð ð¢ð3 â ð£ð †ðŒð· ð¢ 4
4 3/4 ðŒð· ð£ 44 1/4
ðŒð· ð¢ + ð£ 44 †ðŒð· ð¢ 4
4 1/4+ ðŒð· ð£ 4
4 1/4
following inequalities hold as expected (same as for distributions)
general properties of pseudo-distributions
claimlet ðâ² â âð be a random ð-dim. subspace with ð ⪠ðlet ðâ² be the orthogonal projector into ðâ²
then w.h.p, ðâ²ð£ 44 =
ð 1
ðð£ 2
4 â ð âð ð£ 2 over ð£ â âð for âðâs of deg. 4
[Barak-Brandao-Harrow-Kelner-S.-Zhou]
proof sketch
(SOS certificate for classical inequality ðâ²ð£ 44 â€
ð 1
ðð£ 2
4)
basis change: let ð¥ = ðµðð£ where ðµâs columns are orthonormal basis of ð
(so that ðâ² = ðµðµð) ðâ²ð£ 44 =
1
ð2 ð ðð , ð¥
4 with ð1, ⊠, ðð close to i.i.d.
standard Gaussian vectors (so that ðŒð ð, ð¥ 2 = ð¥ 22 and ðŒð ð, ð¥ 4 = 3 â
ð¥ 24)
enough to show:1
ð ð=1ð ðð , ð¥
4 = ð 1 â ðŒð ð, ð¥ 4 â ð âðâ² ð¥ 2
reduce to deg. 2: 1
ð ð=1ð ðð
â2, ðŠ 2 †ð 1 â ðŒð ðâ2, ðŠ 2 (ðŠ = ð¥â2)
use concentration inequalities for quadratic forms (aka matrices)
given: some basis of subspace ð = span ðⲠ⪠ð£0 â âð,where ðâ² â âð random ð-dim. subspace,
and ð£0 â âð with ð£0 ⥠ðâ², ð£0 44 =
1
ð, and ð£0 2
4 = 1 (e.g., ð-sparse)
approximation algorithm for planted sparse vector
compute deg.-4 pseudo-distr. ð· = {ð£} over unit ball of ð satisfying ð£ 44 =
1
ð
goal: find unit vector ð€ with ð€, ð£02 ⥠1 â ð ð/ð 1/4
algorithm
sample Gaussian distr. ð€ with ðŒ ð€ð€ð = ðŒð·ð£ð£ð and renormalize
analysis
claim: ðŒð· ð£, ð£02 ⥠1 â ð ð/ð 1/4 ( Gaussian ð€ almost 1-dim.)
analysis
claim: ðŒð· ð£, ð£02 ⥠1 â ð ð/ð 1/4 ( Gaussian ð€ almost 1-dim.)
1
ð1/4= ðŒð· ð£ 4
4 1/4(ð· satisfies â âð£ 4
4 = 1 ð )
= ðŒð· ð£, ð£0 ð£0 + ðâ²ð£ 44 1/4
(same function)
†ðŒð· ð£, ð£0 ð£0 44 1 4
+ ðŒð· ðâ²ð£ 44 1 4
(â4-triangle inequ.)
â€1
ð 1 4 â ðŒð· ð£, ð£04 1 4
+ð 1
ð1/4(SOS cert. for ðâ²)
ðŒð· ð£, ð£04 ⥠1 â ð ð/ð 1/4
ðŒð· ð£, ð£02 ⥠1 â ð ð/ð 1/4 (because ð£, ð£0
4 = 1 â ðâ²ð£ 22 ð£, ð£0
2)
given: some basis of subspace ð = span ðⲠ⪠ð£0 â âð,where ðâ² â âð random ð-dim. subspace,
and ð£0 â âð with ð£0 ⥠ðâ², ð£0 44 =
1
ð, and ð£0 2
4 = 1 (e.g., ð-sparse)
approximation algorithm for planted sparse vector
goal: find unit vector ð€ with ð€, ð£02 ⥠1 â ð ð/ð 1/4
CauchyâSchwarz inequality
Hölderâs inequality
â4-triangle inequality
ðŒð· ð¢, 𣠆ðŒð· ð¢ 2 1/2 ðŒð· ð£ 2 1/2
let ð· = ð¢, ð£ be a deg.-4 pseudo-distribution over âð à âð
ðŒð· ð ð¢ð3 â ð£ð †ðŒð· ð¢ 4
4 3/4 ðŒð· ð£ 44 1/4
ðŒð· ð¢ + ð£ 44 †ðŒð· ð¢ 4
4 1/4+ ðŒð· ð£ 4
4 1/4
following inequalities hold as expected (same as for distributions)
general properties of pseudo-distributions
products of pseudo-distributions
claimsuppose ð·,ð·â²: Ω â â is deg.-ð pseudo-distr. over Ωthen, ð·âð·â²: Ω à Ω â â is deg.-ð pseudo-distr. over Ω à Ω
proof
tensor products of positive semidefinite matrices are positive semidefinite
CauchyâSchwarz inequality
ðŒð· ð¢, 𣠆ðŒð· ð¢ 22 1/2 ðŒð· ð£ 2
2 1/2
let ð· = ð¢, ð£ be a deg.-2 pseudo-distribution over âð à âð
ðŒð· ð¢, ð£2
= ðŒð·âð· ð¢, ð£ ð¢â², ð£â² (ð· âð·â² is product pseudo-distr.)
= ðŒð·âð· ðð ð¢ðð£ðð¢ðâ²ð£ð
â²
â€1
2 ðŒð·âð· ðð ð¢ð
2 ð£ðâ² 2
+ ðð ð¢ðâ² 2
ð£ð2 (2ðð = ð2 + ð2 â ð â ð 2)
=1
2 ðŒð·âð· ð¢ 2
2 ð£â² 22 + ð¢â² 2
2 ð£ 22
= ðŒð· ð¢ 22 â ðŒð· ð£ 2
2 (ð· âð·â² is product pseudo-distr.)
proof
let ð· = ð¢, ð£ be a deg.-4 pseudo-distribution over âð à âð
proof
Hölderâs inequality
ðŒð· ð ð¢ð3 â ð£ð †ðŒð· ð¢ 4
4 3/4 ðŒð· ð£ 44 1/4
ðŒð· ð ð¢ð3 â ð£ð
†ðŒð· ð ð¢ð4 1 2
â ðŒð· ð ð¢ð2 â ð£ð
2 1/2(Cauchy-Schwarz)
†ðŒð· ð ð¢ð4 1 2
â ðŒð· ð ð¢ð4 â ðŒð· ð ð£ð
4 1/4(Cauchy-Schwarz)
we also used:{ð¢, ð£} deg-4 pseudo-distr. ð¢ â ð¢, ð¢ â ð£ deg.-2 pseudo-distr.(every deg.-2 poly. in ð¢ â ð¢, ð¢ â ð£ is deg.-4 poly. in ð¢, ð£ )
let ð· = ð¢, ð£ be a deg.-4 pseudo-distribution over âð à âð
â4-triangle inequality
ðŒð· ð¢ + ð£ 44 1/4
†ðŒð· ð¢ 44 1/4
+ ðŒð· ð£ 44 1/4
proof
expand ð¢ + ð£ 44 in terms of ð ð¢ð
4, ð ð¢ð3ð£ð , ð ð¢ð
2ð£ð2, ð ð¢ðð£ð
3, ð ð£ð4
bound pseudo-expect. of âmixed termsâ using Cauchy-Schwarz / Hölder
check that total is equal to right-hand side
tensor decomposition
given: tensor ð â ð ððâ4
(in spectral norm) for nice ð1, ⊠, ðð â âð
goal: find set of vectors ðµ â ±ð1, ⊠, ±ðð
for simplicity: orthonormal and ð = ð
approach
show âuniquenessâ: ð ððâ4
â ð ððâ4
â ±ð1, ⊠, ±ðð â ±ð1, ⊠, ±ðð
show that uniqueness proof translates to SOS certificate
any pseudo-distribution over decomposition is âconcentratedâ on unique decomposition ±ð1, ⊠, ±ðð
recover decomposition by reweighing pseudo-distribution by log ðdegree polynomial (approximation to ð¿ function)