
Page 1: Sparse Approximations

Sparse Approximations

Nick Harvey University of British Columbia

Page 2: Sparse Approximations

Approximating Dense Objects by Sparse Objects

Floor joists

Wood Joists Engineered Joists

Page 3: Sparse Approximations

Approximating Dense Objects by Sparse Objects

Bridges

Masonry Arch Truss Arch

Page 4: Sparse Approximations

Approximating Dense Objects by Sparse Objects

Bones

Human Femur Robin Bone

Page 5: Sparse Approximations

Mathematically

• Can an object with many pieces be approximately represented by fewer pieces?

• Independent random sampling usually does well

• Theme of this talk: When can we beat random sampling?

[Figure: a dense graph and its dense Laplacian matrix, alongside a sparse graph and its sparse Laplacian matrix.]

Page 6: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 7: Sparse Approximations

Discrepancy

• Given vectors v_1, …, v_n ∈ R^d with ‖v_i‖_p bounded. Want y ∈ {-1,1}^n with ‖Σ_i y_i v_i‖_q small.
• Eg 1: If ‖v_i‖_∞ ≤ 1 then, for uniformly random signs y, E ‖Σ_i y_i v_i‖_∞ ≤ …
• Eg 2: If ‖v_i‖_∞ ≤ 1 then ∃y s.t. ‖Σ_i y_i v_i‖_∞ ≤ …

Non-algorithmic:
• Spencer '85: partial coloring + entropy method
• Gluskin '89: Sidak's lemma
• Giannopoulos '97: partial coloring + Sidak

Algorithmic:
• Bansal '10: Brownian motion + semidefinite program
• Bansal-Spencer '11: Brownian motion + potential function
• Lovett-Meka '12: Brownian motion
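As a concrete illustration of Eg 1 above, the short sketch below (Python/NumPy, not part of the talk; the sizes n, d and the box-constrained random vectors are arbitrary choices) draws uniformly random ±1 signs and measures the resulting ℓ∞ discrepancy ‖Σ_i y_i v_i‖_∞, which can be compared against √n and √(n log d).

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 400, 400                            # number of vectors and dimension
    V = rng.uniform(-1.0, 1.0, size=(n, d))    # rows v_i with ||v_i||_inf <= 1

    trials = 200
    disc = np.empty(trials)
    for t in range(trials):
        y = rng.choice([-1.0, 1.0], size=n)    # uniformly random signs
        disc[t] = np.abs(y @ V).max()          # ||sum_i y_i v_i||_inf

    print("mean l_inf discrepancy of random signs:", disc.mean())
    print("sqrt(n) =", np.sqrt(n), " sqrt(n log d) =", np.sqrt(n * np.log(d)))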

Page 8: Sparse Approximations

Discrepancy

• Given vectors v_1, …, v_n ∈ R^d with ‖v_i‖_p bounded. Want y ∈ {-1,1}^n with ‖Σ_i y_i v_i‖_q small.
• Eg 1: If ‖v_i‖_∞ ≤ 1 then, for uniformly random signs y, E ‖Σ_i y_i v_i‖_∞ ≤ …
• Eg 2: If ‖v_i‖_∞ ≤ 1 then ∃y s.t. ‖Σ_i y_i v_i‖_∞ ≤ …
• Eg 3: If ‖v_i‖_∞ ≤ β, ‖v_i‖_1 ≤ δ, and ‖Σ_i v_i‖_∞ ≤ 1, then ∃y with ‖Σ_i y_i v_i‖_∞ ≤ …
  Harvey '13: using the Lovász Local Lemma.
  Question: Can the log(δ/β²) factor be improved?

Page 9: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 10: Sparse Approximations

Partitioning sums of rank-1 matrices

• Let v_1, …, v_n ∈ R^d satisfy Σ_i v_i v_i^T = I and ‖v_i‖_2 ≤ δ. Want y ∈ {-1,1}^n with ‖Σ_i y_i v_i v_i^T‖_2 small.
• Random sampling: E ‖Σ_i y_i v_i v_i^T‖_2 ≤ …
  Rudelson '96: proofs using majorizing measures, then the non-commutative Khintchine inequality.
• Marcus-Spielman-Srivastava '13: ∃y ∈ {-1,1}^n with ‖Σ_i y_i v_i v_i^T‖_2 ≤ …
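A quick numerical illustration of the random-sampling baseline (a sketch, not from the talk; the sizes and the Gaussian construction are arbitrary choices): build rows v_i with Σ_i v_i v_i^T = I, draw one uniformly random sign vector y, and report max_i ‖v_i‖² together with ‖Σ_i y_i v_i v_i^T‖_2.

    import numpy as np

    rng = np.random.default_rng(1)
    n, d = 2000, 50                      # arbitrary sizes with n >> d

    # Build an isotropic decomposition: rows v_i with sum_i v_i v_i^T = I.
    A = rng.normal(size=(n, d))
    M = A.T @ A
    L = np.linalg.cholesky(np.linalg.inv(M))
    V = A @ L                            # now V.T @ V = I (up to roundoff)

    delta = (V ** 2).sum(axis=1).max()   # max_i ||v_i||_2^2
    y = rng.choice([-1.0, 1.0], size=n)
    signed = (V * y[:, None]).T @ V      # sum_i y_i v_i v_i^T
    norm = np.linalg.norm(signed, 2)     # spectral norm of the signed sum
    print(f"max_i ||v_i||^2 = {delta:.4f},  ||sum_i y_i v_i v_i^T||_2 = {norm:.4f}")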

Page 11: Sparse Approximations

Partitioning sums of matrices

• Given d×d symmetric matrices M_1, …, M_n with Σ_i M_i = I and ‖M_i‖_2 ≤ δ. Want y ∈ {-1,1}^n with ‖Σ_i y_i M_i‖_2 small.
• Random sampling: E ‖Σ_i y_i M_i‖_2 ≤ …
  Also follows from the non-commutative Khintchine inequality.
  Ahlswede-Winter '02: using the matrix moment generating function.
  Tropp '12: using the matrix cumulant generating function.

Page 12: Sparse Approximations

Partitioning sums of matrices

• Given d×d symmetric matrices M_1, …, M_n with Σ_i M_i = I and ‖M_i‖_2 ≤ δ. Want y ∈ {-1,1}^n with ‖Σ_i y_i M_i‖_2 small.
• Random sampling: E ‖Σ_i y_i M_i‖_2 ≤ …
• Question: ∃y ∈ {-1,1}^n with ‖Σ_i y_i M_i‖_2 ≤ …? False!
• Conjecture: Suppose Σ_i M_i = I and ‖M_i‖_{Sch-1} ≤ δ. ∃y ∈ {-1,1}^n with ‖Σ_i y_i M_i‖_2 ≤ …?
  – MSS '13: the rank-one case is true.
  – Harvey '13: the diagonal case is true (ignoring a log(·) factor).

Page 13: Sparse Approximations

Partitioning sums of matrices

• Given d×d symmetric matrices M_1, …, M_n with Σ_i M_i = I and ‖M_i‖_2 ≤ δ. Want y ∈ {-1,1}^n with ‖Σ_i y_i M_i‖_2 small.
• Random sampling: E ‖Σ_i y_i M_i‖_2 ≤ …
• Question: Suppose only that ‖M_i‖_2 ≤ 1. ∃y ∈ {-1,1}^n with ‖Σ_i y_i M_i‖_2 ≤ …?
  – Spencer/Gluskin: the diagonal case is true.

Page 14: Sparse Approximations

Column-subset selection

• Given vectors v_1, …, v_n ∈ R^d with ‖v_i‖_2 = 1. Let st.rank = n / ‖Σ_i v_i v_i^T‖_2. Let k = … .
  ∃y ∈ {0,1}^n s.t. Σ_i y_i = k and (1-ε)² ≤ λ_k(Σ_i y_i v_i v_i^T).
  Spielman-Srivastava '09: potential function argument.
• Youssef '12: Let k = … . ∃y ∈ {0,1}^n s.t. Σ_i y_i = k, (1-ε)² ≤ λ_k(Σ_i y_i v_i v_i^T) and λ_1(Σ_i y_i v_i v_i^T) ≤ (1+ε)².

Page 15: Sparse Approximations

Column-subset selection up to the stable rank

• Given vectors v_1, …, v_n ∈ R^d with ‖v_i‖_2 = 1. Let st.rank = n / ‖Σ_i v_i v_i^T‖_2. Let k = … .
  For y ∈ {0,1}^n s.t. Σ_i y_i = k, can we control λ_k(Σ_i y_i v_i v_i^T) and λ_1(Σ_i y_i v_i v_i^T)?
  – λ_k can be very small, say O(1/d).
  – Rudelson's theorem: can get λ_1 ≤ O(log d) and λ_k > 0.
  – Harvey-Olver '13: λ_1 ≤ O(log d / log log d) and λ_k > 0.
  – MSS '13: If Σ_i v_i v_i^T = I, can get λ_1 ≤ O(1) and λ_k > 0.
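For contrast with the careful selections above, the following sketch (the clustered random data and the sizes are illustrative assumptions, not from the talk) picks k columns uniformly at random from a set containing many near-duplicate directions and reports λ_1 and λ_k of Σ_{i∈S} v_i v_i^T. A uniformly random subset almost always grabs two near-parallel vectors, so λ_k collapses, which is the failure mode the first bullet above describes.

    import numpy as np

    rng = np.random.default_rng(2)
    d, copies = 50, 10
    n = d * copies
    # n unit vectors arranged in d clusters of nearly identical directions.
    base = np.linalg.qr(rng.normal(size=(d, d)))[0]
    V = np.repeat(base, copies, axis=0) + 1e-3 * rng.normal(size=(n, d))
    V /= np.linalg.norm(V, axis=1, keepdims=True)

    A = V.T @ V
    stable_rank = n / np.linalg.norm(A, 2)
    k = int(stable_rank / 2)                      # some k below the stable rank

    S = rng.choice(n, size=k, replace=False)      # uniformly random column subset
    eigs = np.linalg.eigvalsh(V[S].T @ V[S])      # eigenvalues, ascending order
    print(f"stable rank ~ {stable_rank:.1f}, k = {k}")
    print(f"lambda_1 = {eigs[-1]:.3f},  lambda_k = {eigs[-k]:.6f}")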

Page 16: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 17: Sparse Approximations

Graph Laplacian

Graph with weights u on vertices {a, b, c, d}: u(ab) = 2, u(ac) = 5, u(bc) = 1, u(cd) = 10.

Laplacian matrix: L_u = D - A =

          a    b    c    d
    a     7   -2   -5    0
    b    -2    3   -1    0
    c    -5   -1   16  -10
    d     0    0  -10   10

(The diagonal entry 16 is the weighted degree of node c; the entry -5 in position (a, c) is the negative of u(ac).)

Effective resistance from s to t: the voltage difference when each edge e is a (1/u_e)-ohm resistor and a 1-amp current source is placed between s and t; equivalently (e_s - e_t)^T L_u^+ (e_s - e_t), where L_u^+ is the pseudoinverse of L_u.
Effective conductance: c_st = 1 / (effective resistance from s to t).
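A minimal sketch of these definitions in Python/NumPy (not part of the talk), building the 4-vertex example Laplacian above and computing an effective resistance through the pseudoinverse:

    import numpy as np

    # Vertices a, b, c, d = 0, 1, 2, 3; weighted edges of the example graph.
    edges = {(0, 1): 2.0, (0, 2): 5.0, (1, 2): 1.0, (2, 3): 10.0}
    n = 4

    L = np.zeros((n, n))
    for (s, t), w in edges.items():
        L[s, s] += w; L[t, t] += w        # weighted degrees on the diagonal
        L[s, t] -= w; L[t, s] -= w        # minus the edge weight off the diagonal

    Lpinv = np.linalg.pinv(L)             # pseudoinverse L_u^+

    def effective_resistance(s, t):
        x = np.zeros(n); x[s], x[t] = 1.0, -1.0
        return x @ Lpinv @ x              # (e_s - e_t)^T L^+ (e_s - e_t)

    print(L)
    print("R_eff(a,d) =", effective_resistance(0, 3))
    print("C_eff(a,d) =", 1.0 / effective_resistance(0, 3))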

Page 18: Sparse Approximations

Spectral approximation of graphs

α-spectral sparsifier: L_u ≼ L_w ≼ α·L_u

[Figure: a dense graph with edge weights u and its Laplacian L_u, alongside a sparse reweighted graph with edge weights w and its Laplacian L_w.]
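As a small sketch of how one might check the relation L_u ≼ L_w ≼ α·L_u numerically (not from the talk; the triangle/path example is an arbitrary choice): fix the shared all-ones null space and compute the generalized eigenvalues of the pair (L_w, L_u).

    import numpy as np

    def laplacian(n, edges):
        """Graph Laplacian from a dict {(s, t): weight}."""
        L = np.zeros((n, n))
        for (s, t), w in edges.items():
            L[s, s] += w; L[t, t] += w
            L[s, t] -= w; L[t, s] -= w
        return L

    def approximation_factor(Lu, Lw):
        """Smallest alpha with Lu <= Lw <= alpha*Lu in the PSD order (both graphs
        assumed connected, so they share the all-ones null space).
        Returns inf if Lu <= Lw already fails."""
        n = Lu.shape[0]
        J = np.full((n, n), 1.0 / n)               # removes the shared null space
        A, C = Lw + J, Lu + J
        w, U = np.linalg.eigh(C)
        Cs = U @ np.diag(w ** -0.5) @ U.T          # C^{-1/2}
        evals = np.linalg.eigvalsh(Cs @ A @ Cs)    # generalized eigenvalues of (Lw, Lu)
        return np.inf if evals.min() < 1 - 1e-9 else evals.max()

    Lu = laplacian(3, {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0})   # unit triangle
    Lw = laplacian(3, {(0, 1): 3.0, (1, 2): 3.0})                # reweighted path
    print(approximation_factor(Lu, Lw))                          # 3.0, so alpha = 3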

Page 19: Sparse Approximations

Ramanujan Graphs

• Suppose L_u is the complete graph on n vertices (u_e = 1 ∀e).
• Lubotzky-Phillips-Sarnak '86: For infinitely many d and n, ∃w ∈ {0,1}^E such that Σ_e w_e = dn/2 (in fact L_w is d-regular) and …
• MSS '13: Holds for all d ≥ 3, and all n = c·2^k.
• Friedman '04: If L_w is a random d-regular graph, then ∀ε > 0, … with high probability.

Page 20: Sparse Approximations

Arbitrary graphs

• Spielman-Srivastava '08: For any graph L_u with n = |V|, ∃w ∈ R^E such that |support(w)| = O(n log(n)/ε²) and …
  Proof: follows from Rudelson's theorem.
• MSS '13: For any graph L_u with n = |V|, ∃w ∈ R^E such that w_e ∈ Θ(ε²) · N · (effective conductance of e), |support(w)| = O(n/ε²) and …
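The Spielman-Srivastava construction samples edges with probability proportional to weight times effective resistance and reweights the kept copies. Below is a simplified sketch of that sampling step (not the authors' code; the toy complete graph and the number of samples q are arbitrary illustrative choices).

    import numpy as np

    rng = np.random.default_rng(3)

    def laplacian(n, edges):
        L = np.zeros((n, n))
        for (s, t), w in edges.items():
            L[s, s] += w; L[t, t] += w
            L[s, t] -= w; L[t, s] -= w
        return L

    # Toy example: the complete graph on n vertices with unit weights.
    n = 12
    edges = {(i, j): 1.0 for i in range(n) for j in range(i + 1, n)}
    Lu = laplacian(n, edges)
    Lpinv = np.linalg.pinv(Lu)

    # Sampling probability of edge e proportional to w_e * (effective resistance of e).
    keys = list(edges)
    w_vec = np.array([edges[e] for e in keys])
    reff = np.array([Lpinv[s, s] + Lpinv[t, t] - 2 * Lpinv[s, t] for s, t in keys])
    scores = w_vec * reff                  # these sum to n - 1 for a connected graph
    p = scores / scores.sum()

    q = 6 * n * max(int(np.log(n)), 1)     # number of samples (illustrative choice)
    counts = rng.multinomial(q, p)
    sparse = {e: edges[e] * counts[i] / (q * p[i])
              for i, e in enumerate(keys) if counts[i] > 0}
    Lw = laplacian(n, sparse)
    print(f"kept {len(sparse)} of {len(keys)} edges")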

Page 21: Sparse Approximations

Spectrally-thin trees

• Question: Let G be an unweighted graph with n vertices. Let C = min_e (effective conductance of edge e). Want a subtree T of G with L_T ≼ (α/C) · L_G.
• Equivalent to …
• Goddyn's Conjecture '85: There is a subtree T with …
  – Relates to conjectures of Tutte ('54) on nowhere-zero flows, and to approximations of the traveling salesman problem.

Page 22: Sparse Approximations

Spectrally-thin trees

• Question: Let G be an unweighted graph with n vertices. Let C = min_e (effective conductance of edge e). Want a subtree T of G with L_T ≼ (α/C) · L_G.
• Rudelson's theorem: easily gives α = O(log n).
• Harvey-Olver '13: α = O(log n / log log n). Moreover, there is an efficient algorithm to find such a tree.
• MSS '13: α = O(1), but not algorithmic.

Page 23: Sparse Approximations

Talk Outline

• Vignette #1: Discrepancy theory

• Vignette #2: Singular values and eigenvalues

• Vignette #3: Graphs

• Theorem on “Spectrally Thin Trees”

Page 24: Sparse Approximations

Spectrally Thin Trees

Theorem: Given an (unweighted) graph G with effective conductances ≥ C, we can find an unweighted tree T with L_T ≼ (α/C) · L_G, for α = O(log n / log log n).

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.

Page 25: Sparse Approximations

Matrix Concentration

Theorem [Tropp '12]: Let Y_1, …, Y_m be independent PSD matrices of size n×n. Let Y = Σ_i Y_i and Z = E[Y]. Suppose Y_i ≼ R·Z a.s. Then …

Page 26: Sparse Approximations

Independent sampling

Define sampling probabilities x_e = 1/c_e. It is known that Σ_e x_e = n-1.
Claim: Independent sampling gives T ⊆ E with E[|T|] = n-1 and …

Theorem [Tropp '12]: Let M_1, …, M_m be n×n PSD matrices. Let D(x) be a product distribution on {0,1}^m with marginals x. Let Z = Σ_i x_i M_i. Suppose M_i ≼ Z. Then …

Apply this with M_e = c_e · L_e, where L_e is the Laplacian of the single edge e. Then Z = L_G, and M_e ≼ Z holds (this is where the properties of the conductances are used). Setting α = 6 log n / log log n, we get … with high probability. But T is not a tree!
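Step 1 on a toy graph, as a sketch (the complete graph and the constants are arbitrary choices, not the paper's code): include each edge independently with probability x_e = 1/c_e, then report the sample size and its spectral thinness relative to L_G.

    import numpy as np

    rng = np.random.default_rng(4)

    def laplacian(n, edges):
        L = np.zeros((n, n))
        for (s, t), w in edges.items():
            L[s, s] += w; L[t, t] += w
            L[s, t] -= w; L[t, s] -= w
        return L

    # Toy graph: complete graph on n vertices with unit weights.
    n = 10
    edges = {(i, j): 1.0 for i in range(n) for j in range(i + 1, n)}
    LG = laplacian(n, edges)
    Lpinv = np.linalg.pinv(LG)

    # x_e = 1/c_e = effective resistance of e; these sum to n - 1.
    x = {(s, t): Lpinv[s, s] + Lpinv[t, t] - 2 * Lpinv[s, t] for (s, t) in edges}
    print("sum of x_e =", sum(x.values()))

    T = {e: 1.0 for e in edges if rng.random() < x[e]}   # independent sampling
    LT = laplacian(n, T)

    # Spectral thinness of the sample: largest eigenvalue of LG^{+/2} LT LG^{+/2}.
    w, U = np.linalg.eigh(LG)
    inv_sqrt = np.where(w > 1e-9, 1.0 / np.sqrt(np.maximum(w, 1e-12)), 0.0)
    P = U @ np.diag(inv_sqrt) @ U.T
    thinness = np.linalg.eigvalsh(P @ LT @ P).max()
    print(f"|T| = {len(T)} (a spanning tree has {n - 1} edges); thinness = {thinness:.3f}")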

Page 27: Sparse Approximations

Spectrally Thin Trees

Theorem: Given an (unweighted) graph G with effective conductances ≥ C, we can find an unweighted tree T with L_T ≼ (α/C) · L_G, for α = O(log n / log log n).

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.
   ► Run pipage rounding to get a tree T with Pr[e ∈ T] = x_e = 1/c_e.

Page 28: Sparse Approximations

Pipage rounding [Ageev-Sviridenko '04, Srinivasan '01, Calinescu et al. '07, Chekuri et al. '09]

Let P be any matroid polytope, e.g., the convex hull of characteristic vectors of spanning trees.

Given a fractional point x:
• Find coordinates a and b s.t. the line z ↦ x + z(e_a - e_b) stays in the current face.
• Find the two points where this line leaves P.
• Randomly choose one of those two points so that the expectation is x.
• Repeat until x = χ_T is integral.

x is a martingale: the expectation of the final χ_T is the original fractional x (see the sketch after the figure below).

[Figure: the matroid polytope with integral vertices χ_T1, …, χ_T6 and the fractional starting point x.]
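A minimal sketch of one pipage/swap iteration, for the simplest case of the uniform matroid (a cardinality constraint) rather than the spanning-tree polytope used in the talk; the routine and the example vector below are illustrative assumptions, not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(5)

    def pipage_round_cardinality(x):
        """Round x in [0,1]^m with integer coordinate sum to a 0/1 vector,
        keeping E[output] = x. Sketch for the uniform matroid only."""
        x = np.array(x, dtype=float)
        while True:
            frac = np.where((x > 1e-9) & (x < 1 - 1e-9))[0]
            if len(frac) < 2:
                return np.rint(x).astype(int)
            a, b = frac[:2]
            # Moving along e_a - e_b keeps the coordinate sum fixed.
            up = min(1 - x[a], x[b])      # z > 0 until x_a hits 1 or x_b hits 0
            down = min(x[a], 1 - x[b])    # z < 0 until x_a hits 0 or x_b hits 1
            if rng.random() < down / (up + down):   # chosen so E[new x] = x
                x[a] += up; x[b] -= up
            else:
                x[a] -= down; x[b] += down

    x0 = np.array([0.5, 0.7, 0.3, 0.9, 0.6])                 # sums to 3
    samples = np.array([pipage_round_cardinality(x0) for _ in range(2000)])
    print(samples.mean(axis=0))   # martingale property: averages are close to x0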

Page 29: Sparse Approximations

Pipage rounding and concavity

Say f : R^m → R is concave under swaps if z ↦ f(x + z(e_a - e_b)) is concave for all x ∈ P and all a, b ∈ [m].
Let X_0 be the initial point and χ_T the final point visited by pipage rounding.
Claim: If f is concave under swaps, then E[f(χ_T)] ≤ f(X_0). [Jensen]

Let E ⊆ {0,1}^m be an event, and let g : [0,1]^m → R be a pessimistic estimator for E, i.e., …
Claim: Suppose g is concave under swaps. Then Pr[χ_T ∈ E] ≤ g(X_0).

Page 30: Sparse Approximations

Chernoff Bound

Chernoff Bound: Fix any w, x ∈ [0,1]^m and let μ = w^T x. Define g_{t,μ} by … Then …
Claim: g_{t,μ} is concave under swaps. [Elementary calculus]
Let X_0 be the initial point and χ_T the final point visited by pipage rounding, and let μ = w^T X_0. Then …
So the bound achieved by independent sampling is also achieved by pipage rounding.
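The exact definition of g_{t,μ} is elided on the slide. Assuming it is the usual moment-generating-function estimator for a deviation event {w·χ ≥ θ} under a product distribution (an assumed form with a hypothetical threshold θ, not read off the slide), the sketch below evaluates it and numerically spot-checks concavity along a swap direction e_a - e_b.

    import numpy as np

    def g(x, w, t, theta):
        # Assumed MGF-style pessimistic estimator for Pr[w . chi >= theta] when
        # chi has independent {0,1} coordinates with marginals x.
        return np.exp(-t * theta) * np.prod(1 - x + x * np.exp(t * w))

    rng = np.random.default_rng(6)
    m = 8
    w = rng.uniform(0, 1, m)
    x = rng.uniform(0.2, 0.8, m)
    t, theta = 1.0, w @ x + 1.0

    # Second differences of z -> g(x + z(e_a - e_b)) should be <= 0 (concavity).
    a, b = 0, 1
    zs = np.linspace(-0.1, 0.1, 21)
    vals = []
    for z in zs:
        xp = x.copy(); xp[a] += z; xp[b] -= z
        vals.append(g(xp, w, t, theta))
    vals = np.array(vals)
    second = vals[2:] - 2 * vals[1:-1] + vals[:-2]
    print("max second difference (should be <= 0):", second.max())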

Page 31: Sparse Approximations

Matrix Pessimistic Estimators

Main Theorem: g_{t,μ} is concave under swaps.

Theorem [Tropp '12]: Let M_1, …, M_m be n×n PSD matrices. Let D(x) be a product distribution on {0,1}^m with marginals x. Let Z = Σ_i x_i M_i. Suppose M_i ≼ Z. Let g_{t,μ} be … (the pessimistic estimator). Then … and … .

So the bound achieved by independent sampling is also achieved by pipage rounding.

Page 32: Sparse Approximations

Spectrally Thin Trees

Theorem: Given an (unweighted) graph G with effective conductances ≥ C, we can find an unweighted tree T with L_T ≼ (α/C) · L_G, for α = O(log n / log log n).

Proof overview:
1. Show independent sampling gives spectral thinness, but not a tree.
   ► Sample every edge e independently with probability x_e = 1/c_e.
2. Show dependent sampling gives a tree, and spectral thinness still works.
   ► Run pipage rounding to get a tree T with Pr[e ∈ T] = x_e = 1/c_e.

Page 33: Sparse Approximations

Matrix Analysis

Matrix concentration inequalities are usually proven via sophisticated inequalities in matrix analysis:
• Rudelson: the non-commutative Khintchine inequality.
• Ahlswede-Winter: the Golden-Thompson inequality: if A, B are symmetric, then tr(e^{A+B}) ≤ tr(e^A e^B).
• Tropp: Lieb's concavity inequality [1973]: if A, B are Hermitian and C is PD, then z ↦ tr exp(A + log(C + zB)) is concave.

Key technical result: a new variant of Lieb's theorem: if A is Hermitian, B_1, B_2 are PSD, and C_1, C_2 are PD, then z ↦ tr exp(A + log(C_1 + zB_1) + log(C_2 - zB_2)) is concave.
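A small numerical illustration of the Golden-Thompson inequality quoted above (a sketch; the random symmetric matrices are arbitrary):

    import numpy as np

    rng = np.random.default_rng(7)

    def expm_sym(M):
        """Matrix exponential of a symmetric matrix via its eigendecomposition."""
        w, U = np.linalg.eigh(M)
        return (U * np.exp(w)) @ U.T

    n = 5
    A = rng.normal(size=(n, n)); A = (A + A.T) / 2   # random symmetric matrices
    B = rng.normal(size=(n, n)); B = (B + B.T) / 2

    lhs = np.trace(expm_sym(A + B))            # tr(e^{A+B})
    rhs = np.trace(expm_sym(A) @ expm_sym(B))  # tr(e^A e^B)
    print(f"tr(e^(A+B)) = {lhs:.3f}  <=  tr(e^A e^B) = {rhs:.3f}")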

Page 34: Sparse Approximations

Questions

• Can the Spencer/Gluskin theorem be extended to matrices?
• Can MSS '13 be made algorithmic?
• Can MSS '13 be extended to large-rank matrices?
• O(1)-spectrally thin trees exist. Can one be found algorithmically?
• Are O(1)-spectrally thin trees helpful for Goddyn's conjecture?