One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo

Post on 18-Jan-2016


Page 1: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo

One algorithm to rule them all

One join query at a time

Atri Rudra

University at Buffalo

Page 2

A brief history of this talk

L2/L2 foreach sparse recovery/compressed sensing

http://www-stat.stanford.edu/~candes/stats330/index.shtml

Page 3

The key technical problem

Given the three shadows, what is the largest size of the original set of points?

Page 4

The key technical problem

Highly trivial: $4^3 = 64$

Still trivial: $4^2 = 16$

Correct answer: $4^{1.5} = 8$
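The three numbers can be checked on a tiny instance. The sketch below assumes, as the figures on the slide suggest, that each shadow has 4 points; the point set (a 2 x 2 x 2 cube) and the helper name `shadows` are illustrative choices, not taken from the talk.

```python
from itertools import product
from math import sqrt

def shadows(points):
    """Project a set of 3D points (a, b, c) onto the AB, BC and CA planes."""
    r = {(a, b) for a, b, c in points}
    s = {(b, c) for a, b, c in points}
    t = {(c, a) for a, b, c in points}
    return r, s, t

# Hypothetical example: a 2 x 2 x 2 cube, so each shadow is a 2 x 2 square.
cube = set(product(range(2), repeat=3))
r, s, t = shadows(cube)

# Bound of the "correct answer" line: sqrt(|R| * |S| * |T|).
lw_bound = sqrt(len(r) * len(s) * len(t))
print(len(cube), len(r), len(s), len(t), lw_bound)
```

Here the cube has exactly as many points as the bound allows, so for this instance the bound is tight.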

Page 5

The key technical problem

[Figure: a set of points in the space with axes A, B, C, and its three shadows R, S, T]

$|R| = k$, $|S| = k$, $|T| = k$

Largest size of the original set: $k^{3/2}$ (Loomis Whitney)

Algorithmic Loomis-Whitney?

Page 6

An equivalent view

[Figure: the shadows R, S, T on the axes A, B, C, redrawn as a triangle with vertices A, B, C and edges R, S, T]

Output all (a,b,c) s.t. (a,b) in R, (b,c) in S and (c,a) in T

Page 7

Overview of the talk

[Figure: triangle with vertices A, B, C and edges R, S, T]

Page 8

The take-away message

Join algo

http://welovetumblr.blogspot.com/2012/07/thor-is.html

Page 9

Overview of the talk

[Figure: triangle with vertices A, B, C and edges R, S, T]

Page 10

(Database) Joins

Codd

Attributes/Nodes: $[n]$

Relations/Hyperedges: $e_1, \dots, e_m \subseteq [n]$

[Figure: hypergraph on the nodes 1, 2, 3, 4, 5]

Tables/Projections: $R_1, \dots, R_m$

Output all $a = (a_1, \dots, a_n)$ s.t. $a$ projected down to $e_i$ is in $R_i$ for every $i$ in $[m]$
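The definition can be turned directly into a brute-force procedure. This is only a sketch of the definition, not the algorithm of the talk: it tries every tuple over a finite domain, and the names `naive_join`, `domain`, `edges` and `tables` are assumptions made for illustration (attributes are numbered from 0).

```python
from itertools import product

def naive_join(n, domain, edges, tables):
    """Return every tuple a in domain^n whose projection onto each
    hyperedge edges[i] is a row of tables[i]."""
    out = []
    for a in product(domain, repeat=n):
        if all(tuple(a[j] for j in e) in tbl for e, tbl in zip(edges, tables)):
            out.append(a)
    return out

# Triangle query with attributes 0 = A, 1 = B, 2 = C.
edges = [(0, 1), (1, 2), (2, 0)]   # R on (A,B), S on (B,C), T on (C,A)
R = {(1, 2), (1, 3)}
S = {(2, 4), (3, 4)}
T = {(4, 1)}
result = naive_join(3, [1, 2, 3, 4], edges, [R, S, T])
print(result)
```

The run time of this sketch is the size of the whole domain raised to n, which is exactly what the output-size bounds in the talk are meant to improve on.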

Page 11

The triangle join query

[Figure: the relations R, S, T on the attributes A, B, C, drawn as a triangle with vertices A, B, C and edges R, S, T]

Output all (a,b,c) s.t. (a,b) in R, (b,c) in S and (c,a) in T

Page 12

Bounding the output size

Atserias Grohe Marx

[Figure: triangle with vertices A, B, C and edges R, S, T; the edges carry weights x, y, z, all equal to ½ in the Loomis-Whitney case]

Highly trivial bound: $|R| \cdot |S| \cdot |T|$

Still trivial bound: $|R| \cdot |S|$

Loomis-Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

AGM bound: $|R|^x |S|^y |T|^z$, where

A: $x + z \ge 1$

B: $x + y \ge 1$

C: $y + z \ge 1$

$x, y, z \ge 0$
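To see how the choice of weights changes the bound, the following sketch evaluates the AGM expression for the triangle query. The function name `agm_triangle` and the table sizes are made up for illustration; the weights are x on R, y on S and z on T, and the function only checks the three cover constraints listed above.

```python
def agm_triangle(r, s, t, x, y, z, eps=1e-9):
    """Evaluate |R|^x |S|^y |T|^z after checking that (x, y, z) satisfies
    the constraints for the attributes A, B and C."""
    assert min(x, y, z) >= 0
    assert x + z >= 1 - eps   # A is covered by R and T
    assert x + y >= 1 - eps   # B is covered by R and S
    assert y + z >= 1 - eps   # C is covered by S and T
    return r**x * s**y * t**z

# Equal sizes: the Loomis-Whitney weights beat the "still trivial" weights.
print(agm_triangle(16, 16, 16, 0.5, 0.5, 0.5))
print(agm_triangle(16, 16, 16, 1, 1, 0))

# Skewed sizes: ignoring the big table T gives the smaller bound.
print(agm_triangle(4, 4, 64, 0.5, 0.5, 0.5))
print(agm_triangle(4, 4, 64, 1, 1, 0))
```

The second pair of calls shows why the bound is stated for every valid (x, y, z): the best weights depend on the sizes of the tables.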

Page 13

Loomis Whitney

?

Page 14

Algorithmic Loomis-Whitney

Loomis-Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

[Figure: triangle with vertices A, B, C and edges R, S, T, each with weight ½; a vertex c in C is highlighted, with $d_S(c)$ neighbors via S and $d_T(c)$ neighbors via T]

Goal: Count number of triangles

There are $|R|$ choices for edges in R

There are $d_S(c) \, d_T(c)$ choices for pairs of neighbors of c

http://agilitrix.com/2011/03/red-pill-blue-pill/

Page 15

Algorithmic Loomis-Whitney

Loomis-Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

Goal: Count number of triangles

There are $|R|$ choices for edges in R

There are $d_S(c) \, d_T(c)$ choices for pairs of neighbors of c

Make this choice for every c in C

Run time of algo $= \sum_c \min(|R|, \, d_S(c) \, d_T(c))$

[Figure: triangle with vertices A, B, C and edges R, S, T, with a vertex c in C highlighted]
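The per-c choice can be sketched as follows. This is one possible reading of the slide, not the authors' code: the function name `triangles` and the index names are assumptions, and hash lookups are assumed to take constant expected time, so that the work for each c is proportional to the smaller of the two options.

```python
from collections import defaultdict

def triangles(R, S, T):
    """List all (a, b, c) with (a, b) in R, (b, c) in S and (c, a) in T.

    For each c, take the cheaper of two options: scan all of R, or
    enumerate the pairs of neighbors of c in S and T.
    """
    s_by_c = defaultdict(set)   # c -> {b : (b, c) in S}
    t_by_c = defaultdict(set)   # c -> {a : (c, a) in T}
    for b, c in S:
        s_by_c[c].add(b)
    for c, a in T:
        t_by_c[c].add(a)

    out = []
    for c in s_by_c.keys() & t_by_c.keys():
        bs, as_ = s_by_c[c], t_by_c[c]
        if len(R) <= len(bs) * len(as_):
            # |R| choices: scan R and keep the edges that close a triangle at c.
            out.extend((a, b, c) for a, b in R if b in bs and a in as_)
        else:
            # d_S(c) * d_T(c) choices: try every pair of neighbors of c.
            out.extend((a, b, c) for a in as_ for b in bs if (a, b) in R)
    return sorted(out)

R = {(1, 2), (1, 3)}
S = {(2, 4), (3, 4)}
T = {(4, 1)}
print(triangles(R, S, T))
```

R is expected to be a set so that the membership test `(a, b) in R` is a hash lookup.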

Page 16

Analyzing the algorithm

Loomis Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

$\sum_c \min(|R|, \, d_S(c) \, d_T(c))$

$\le \sum_c (|R| \, d_S(c) \, d_T(c))^{1/2}$   (using $\min(E,F) \le (EF)^{1/2}$)

$= |R|^{1/2} \sum_c d_S(c)^{1/2} \, d_T(c)^{1/2}$

$\le |R|^{1/2} (\sum_c d_S(c))^{1/2} (\sum_c d_T(c))^{1/2}$   (Cauchy-Schwarz)

$= |R|^{1/2} |S|^{1/2} |T|^{1/2}$

[Figure: triangle with vertices A, B, C and edges R, S, T, with a vertex c in C highlighted]
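The chain of inequalities can be sanity-checked numerically. The sketch below draws three random relations (the sizes, the domain size `n` and the seed are arbitrary choices), computes the cost $\sum_c \min(|R|, d_S(c) d_T(c))$ and compares it with the Loomis-Whitney bound.

```python
import random
from collections import Counter
from math import sqrt

random.seed(0)
n = 30  # hypothetical domain size for A, B and C

def random_relation(size):
    """A random set of at most `size` pairs over the domain {0, ..., n-1}."""
    return {(random.randrange(n), random.randrange(n)) for _ in range(size)}

R, S, T = random_relation(200), random_relation(200), random_relation(200)

d_S = Counter(c for _, c in S)   # S holds pairs (b, c)
d_T = Counter(c for c, _ in T)   # T holds pairs (c, a)

cost = sum(min(len(R), d_S[c] * d_T[c]) for c in range(n))
bound = sqrt(len(R) * len(S) * len(T))
print(cost, bound)
```

On any input the first printed number should not exceed the second, since that is what the derivation proves.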

Page 17

Atserias Grohe Marx

?

Page 18

Same algorithm!

AGM bound: $|R|^x |S|^y |T|^z$

$\sum_c \min(|R|, \, d_S(c) \, d_T(c))$

$\le \sum_c |R|^x (d_S(c) \, d_T(c))^{1-x}$   (using $\min(E,F) \le E^x F^{1-x}$)

$\le |R|^x \sum_c d_S(c)^y \, d_T(c)^z$

$\le |R|^x (\sum_c d_S(c))^y (\sum_c d_T(c))^z$   (Hölder)

$= |R|^x |S|^y |T|^z$

where A: $x + z \ge 1$, B: $x + y \ge 1$, C: $y + z \ge 1$

[Figure: triangle with vertices A, B, C and edges R, S, T, with a vertex c in C highlighted]
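The two middle steps of the AGM derivation are stated without justification on the slide. One way to fill them in, assuming $0 \le x \le 1$ and restricting the sums to those c with $d_S(c), d_T(c) \ge 1$ (the other terms contribute nothing to the left-hand side), is sketched below; the symbols $p$, $y'$, $z'$ are introduced only for this argument.

```latex
% Second step: the constraints x + y >= 1 and x + z >= 1 give
% 1 - x <= y and 1 - x <= z, and the degrees are at least 1, so
\[
  \bigl(d_S(c)\, d_T(c)\bigr)^{1-x} \;\le\; d_S(c)^{y}\, d_T(c)^{z}.
\]
% Third step: put p = y + z >= 1, y' = y/p, z' = z/p, so y' + z' = 1.
% First use sum_c u_c^p <= (sum_c u_c)^p for p >= 1, then Hoelder:
\[
  \sum_c d_S(c)^{y} d_T(c)^{z}
  = \sum_c \bigl(d_S(c)^{y'} d_T(c)^{z'}\bigr)^{p}
  \le \Bigl(\sum_c d_S(c)^{y'} d_T(c)^{z'}\Bigr)^{p}
  \le \Bigl(\sum_c d_S(c)\Bigr)^{y} \Bigl(\sum_c d_T(c)\Bigr)^{z}.
\]
```

With $x = y = z = 1/2$ this reduces to the Cauchy-Schwarz step of the Loomis-Whitney analysis.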

Page 19

General Join Result

Attributes/Nodes: $[n]$

Relations/Hyperedges: $e_1, \dots, e_m \subseteq [n]$

[Figure: hypergraph on the nodes 1, 2, 3, 4, 5 with edge weights $x_1, x_2, x_3, x_4$]

Tables/Projections: $R_1, \dots, R_m$

Let $x_1, \dots, x_m$ be a fractional cover

AGM bound: $|R_1|^{x_1} \cdots |R_m|^{x_m}$

Our result: O(AGM + Input size)

Provably worst-case optimal join algorithm
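For a small query, a good fractional cover can be found by brute force. The sketch below is an illustration only: it searches a coarse grid of weights instead of solving the linear program exactly, and the names `best_cover_on_grid` and `steps` are hypothetical. Nodes are numbered from 0, and every node must receive total weight at least 1 from the edges containing it.

```python
from itertools import product
from math import prod

def best_cover_on_grid(n, edges, sizes, steps=2, eps=1e-9):
    """Search the weights {0, 1/steps, ..., 1}^m for the fractional cover
    that minimizes prod_i sizes[i] ** weights[i].

    Returns (bound, weights), or None if no grid point is a cover.
    """
    grid = [i / steps for i in range(steps + 1)]
    best = None
    for ws in product(grid, repeat=len(edges)):
        covered = all(
            sum(w for e, w in zip(edges, ws) if v in e) >= 1 - eps
            for v in range(n)
        )
        if not covered:
            continue
        bound = prod(s ** w for s, w in zip(sizes, ws))
        if best is None or bound < best[0]:
            best = (bound, ws)
    return best

triangle = [(0, 1), (1, 2), (2, 0)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(best_cover_on_grid(3, triangle, [16, 16, 16]))
print(best_cover_on_grid(4, four_cycle, [16, 16, 16, 16]))
```

The grid search is exponential in the number of relations, which is acceptable only because the query is assumed to be of constant size.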

Page 20

List recovery

[Figure: input lists $S_1, S_2, S_3, \dots, S_n$, each $S_i$ a subset of $[q]$, lined up with the codeword positions $c_1, c_2, c_3, \dots, c_n$]

Code C subset of $[q]^n$

Applications in expanders

Page 21

An alternate view of joins

[Figure: triangle with vertices A, B, C and edges R, S, T, next to a list recovery instance whose input lists are R, S, T]

Msg in $[q]^3$

Codeword in $[q^2]^3$

Constant dimension

Constant block length

Large alphabet size

Large input list size

Page 22

Overview of the talk

[Figure: triangle with vertices A, B, C and edges R, S, T]

Page 23

Sparse Recovery/Compressed Sensing

[Figure: an Unknown vector is measured with a matrix that is To be designed, giving the Observed vector; Decode turns the observation into the Output. Example with k=2; the entries of the unknown vector are labeled Heavy Hitter and Tail]

Page 24

Quantifying the approximation

[Figure: $\ell_2$ norm of (Output minus Unknown) $\le C \cdot \ell_2$ norm of the Tail]

Page 25

(Most of) rest of the talk

Page 26

Designing the matrix

[Figure: an Unknown vector is measured with a matrix that is To be designed, giving the Observed vector; Decode turns the observation into the Output. Example with k=2]

Page 27

Designing the matrix k=2

[Figure: k-expander, with the two sides labeled N and m]

< ¼ (neighborhood)

Measurement = + noise

Heavy tail noise < ¼ (neighborhood)

> ½ of the neighbors of have the “correct” value

Page 28

Count-Sketch style algo k=2

Estimate = median of O(log N) values

Output the top O(k) estimates

O(N log N) decoding

Indyk Ružić
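The estimate-by-median step can be illustrated with a small sketch. Note the substitution: the slides build the matrix from a k-expander, whereas the code below uses the classic randomly hashed Count-Sketch with random signs, which is an assumption made only to keep the example short. All names and parameters (`make_sketch`, `decode`, `rows`, `buckets`, the test vector) are illustrative.

```python
import random
from statistics import median

def make_sketch(x, rows, buckets, seed=0):
    """Measure x with `rows` independent (bucket, sign) hash pairs."""
    rng = random.Random(seed)
    n = len(x)
    h = [[rng.randrange(buckets) for _ in range(n)] for _ in range(rows)]
    s = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(rows)]
    table = [[0.0] * buckets for _ in range(rows)]
    for r in range(rows):
        for i, xi in enumerate(x):
            table[r][h[r][i]] += s[r][i] * xi
    return table, h, s

def decode(table, h, s, n, k):
    """Estimate every coordinate as a median over the rows, then return the
    k indices with the largest estimates. This touches all n coordinates."""
    rows = range(len(table))
    est = [median(s[r][i] * table[r][h[r][i]] for r in rows) for i in range(n)]
    top = sorted(range(n), key=lambda i: abs(est[i]), reverse=True)[:k]
    return sorted(top), est

# Unknown vector: two heavy hitters (k=2) on top of a small tail.
rng = random.Random(1)
N = 2000
x = [rng.uniform(-1, 1) for _ in range(N)]
x[5], x[40] = 100.0, -80.0

table, h, s = make_sketch(x, rows=15, buckets=64)
top, est = decode(table, h, s, N, k=2)
print(top)
```

The loop in `decode` visits every index, which matches the O(N log N) decoding time on the slide and motivates the search for a faster decoder.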

Page 29

We need a faster algorithm…

Page 30

Towards a sub-linear time algo

[Figure: candidate set S]

Estimate = median value

Output the top O(k) estimates in S

O(|S| log N) decoding

All we need to do is to compute a small S quickly

Page 31

Porat-Strauss Idea: Recursion!

[Figure: $[N]$, viewed as $\{0,1\}^{\log N}$, is split into $[\sqrt{N}]$ and $[\sqrt{N}]$]

Solve each in ~ $\sqrt{N}$ time

Page 32

The problem we now need to solve

Elements of S

Geometrically…

[Figure: k by k grid, marked with a question mark]

Output size ~ $k^2$

Overall running time ~ $\sqrt{N} + k^2$

Not sub-linear for $k > \sqrt{N}$

Use a table-look up to decrease the run time

Page 33

Finally…

Page 34

Slightly different recursion

[Figure: $[N]$ (log N bits) is split into $[N^{2/3}]$, $[N^{2/3}]$, $[N^{2/3}]$]

Geometric problem to solve

Overall runtime: $k^{3/2} + N^{2/3}$

Page 35

Our Results

L2/L2 sparse recovery with failure prob p

Optimal k log(N/k) measurements*

$k^{1+\varepsilon}$ poly-log N decoding+space

$p \sim (N/k)^{-k/\text{poly-log}\, k}$

Also prove tight lower bound of k log(N/k) + log(1/p)

Page 36

One algorithm to rule them all

One join query at a time

Atri Rudra

University at Buffalo

Page 37

Only two problems so far…

[Figure: triangle with vertices A, B, C and edges R, S, T]

Page 38

Albert Meyer (via Dick Lipton)

"Prove it for n=3 and then let 3 go to infinity"

Page 39

The 3rd problem…

Big (hyper)graph G

http://pigeonsandplanes.com/2010/12/thoughts-on-net-neutrality.html

Small (hyper)graph H

[Figure: hypergraph H on the nodes 1, 2, 3, 4, 5]

Compute all copies of H in G

Our join algorithm gives a worst-case optimal algorithm for any constant-sized H

Joins model many more problems, e.g. CSPs

Page 40

The take-away message

Join algo

http://welovetumblr.blogspot.com/2012/07/thor-is.html