Ryan O’Donnell (CMU, IAS), joint work with Yi Wu (CMU, IBM), Yuan Zhou (CMU)

Upload: winfred-tate
Posted on 17-Dec-2015

TRANSCRIPT

Page 1: Ryan O’Donnell (CMU, IAS) joint work with Yi Wu (CMU, IBM), Yuan Zhou (CMU)

Ryan O’Donnell (CMU, IAS)

joint work with

Yi Wu (CMU, IBM), Yuan Zhou (CMU)

Page 2:

Locality Sensitive Hashing [Indyk–Motwani ’98]

h : objects → sketches

H : family of hash functions h s.t.

“similar” objects collide w/ high prob.

“dissimilar” objects collide w/ low prob.

Page 3:

Abbreviated history

Page 4:

Broder ’97, AltaVista

[Figure: two documents A and B as bit-vectors indexed by “word 1?”, “word 2?”, …, “word d?”:
A = 0 1 1 1 0 0 1 0 0
B = 1 1 1 0 0 0 1 0 1]

Jaccard similarity: J(A, B) = |A ∩ B| / |A ∪ B|

Invented simple H s.t. Pr over h [h(A) = h(B)] = J(A, B).
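Broder’s min-wise hashing can be simulated directly: hash a set to its minimum element under a random ranking of the universe, and the collision probability matches the Jaccard similarity. A minimal sketch (the universe and the sets A, B below are my own illustration, not from the talk):

```python
import random

def jaccard(a, b):
    return len(a & b) / len(a | b)

def minhash(s, rank):
    """Broder-style min-wise hash: the element of s that comes
    first under a random ranking of the universe."""
    return min(s, key=rank.__getitem__)

# Illustrative universe of 20 "words" and two document sets.
universe = list(range(20))
A = {0, 1, 2, 3, 4, 5}
B = {3, 4, 5, 6, 7, 8}          # |A ∩ B| = 3, |A ∪ B| = 9, so J = 1/3

rng = random.Random(0)
trials = 20000
collisions = 0
for _ in range(trials):
    order = universe[:]
    rng.shuffle(order)          # a uniformly random ranking
    rank = {w: i for i, w in enumerate(order)}
    if minhash(A, rank) == minhash(B, rank):
        collisions += 1

print(jaccard(A, B), collisions / trials)   # both ≈ 1/3
```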

Page 5:

Indyk–Motwani ’98 (cf. Gionis–I–M ’98)

Defined LSH.

Invented very simple H good for

{0, 1}d under Hamming distance.

Showed good LSH implies good

nearest-neighbor-search data structs.

Page 6:

Charikar ’02, STOC

Proposed alternate H (“simhash”) for

Jaccard similarity.

Page 7:

Many papers about LSH

Page 8:

Practice:
• Free code base [AI’04]
• Sequence comparison in bioinformatics
• Association-rule finding in data mining
• Collaborative filtering
• Clustering nouns by meaning in NLP
• Pose estimation in vision
• …
[Tenesawa–Tanaka ’07]

Theory:
[Broder ’97], [Indyk–Motwani ’98], [Gionis–Indyk–Motwani ’98], [Charikar ’02], [Datar–Immorlica–Indyk–Mirrokni ’04], [Motwani–Naor–Panigrahy ’06], [Andoni–Indyk ’06], [Neylon ’10], [Andoni–Indyk ’08, CACM]

Page 9:

Given: (X, dist), r > 0, c > 1
(distance space; “radius”; “approx. factor”)

Goal: family H of functions X → S (S can be any finite set) s.t. ∀ x, y ∈ X:

dist(x, y) ≤ r ⇒ Pr over h∈H [h(x) = h(y)] ≥ p,

dist(x, y) ≥ cr ⇒ Pr over h∈H [h(x) = h(y)] ≤ q.

Writing p = q^ρ, want ρ small: p ≥ q^{.5}, ≥ q^{.25}, ≥ q^{.1}, …, ≥ q^{ρ}.
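The exponent ρ is determined by p = q^ρ, i.e. ρ = ln(1/p) / ln(1/q); smaller ρ means a better family. A tiny sketch (the p, q values are my own illustration):

```python
import math

def rho(p, q):
    """LSH quality exponent: the rho with p = q**rho,
    i.e. ln(1/p) / ln(1/q)."""
    return math.log(1 / p) / math.log(1 / q)

p, q = 0.9, 0.5                       # illustrative collision probabilities
r = rho(p, q)
print(r)                              # ≈ 0.152
assert abs(q ** r - p) < 1e-12        # p = q^rho by construction
```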

Page 10:

Theorem [IM’98, GIM’98]

Given LSH family for (X, dist),

can solve “(r, cr)-near-neighbor search”

for n points with data structure of

size: O(n^{1+ρ})

query time: Õ(n^{ρ}) hash fcn evals.

Page 11:

Example

X = {0,1}^d, dist = Hamming, r = ϵd, c = 5.

Given x, y with dist(x, y) ≤ ϵd or ≥ 5ϵd
(e.g. x = 011100100, y = 111000101),

H = { h_1, h_2, …, h_d }, h_i(x) = x_i   [IM’98]

(“output a random coord.”)
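This family is easy to simulate: the collision probability of a random coordinate hash is exactly 1 − dist(x, y)/d. A minimal sketch (the dimension and the number of flipped coordinates are my own illustration):

```python
import random

d = 64
rng = random.Random(1)

def random_coord_hash():
    """IM'98 bit-sampling: h_i(x) = x_i for a uniformly random i."""
    i = rng.randrange(d)
    return lambda x: x[i]

x = tuple(rng.randrange(2) for _ in range(d))
y = list(x)
for i in range(4):                 # flip 4 coords, so dist(x, y) = 4
    y[i] ^= 1
y = tuple(y)

trials = 50000
hits = 0
for _ in range(trials):
    h = random_coord_hash()
    hits += h(x) == h(y)

# Pr over h [h(x) = h(y)] = 1 - dist(x, y)/d = 1 - 4/64 = 0.9375
print(hits / trials)
```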

Page 12:

Analysis

Pr over i [h_i(x) = h_i(y)] = 1 − dist(x, y)/d.

dist ≥ 5ϵd: collision prob. ≤ 1 − 5ϵ = q.

dist ≤ ϵd: collision prob. ≥ 1 − ϵ = q^ρ.

(1 − 5ϵ)^{1/5} ≈ 1 − ϵ; in fact (1 − 5ϵ)^{1/5} ≤ 1 − ϵ. ∴ ρ ≤ 1/5.

In general, achieves ρ ≤ 1/c, ∀ c (∀ r).
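The bound ρ ≤ 1/c amounts to the inequality (1 − cϵ)^{1/c} ≤ 1 − ϵ, i.e. ρ = ln(1/(1−ϵ)) / ln(1/(1−cϵ)) ≤ 1/c. A quick numeric check (the ϵ, c values are my own illustration):

```python
import math

def rho(p, q):
    return math.log(1 / p) / math.log(1 / q)

# Bit-sampling: near pairs collide w.p. >= 1 - eps,
# far pairs w.p. <= 1 - c*eps.
for c in (2, 5, 10):
    for eps in (0.001, 0.01, 0.02):
        # equivalent to (1 - c*eps)^(1/c) <= 1 - eps
        assert rho(1 - eps, 1 - c * eps) <= 1 / c

print(rho(1 - 0.02, 1 - 5 * 0.02))   # ≈ 0.19, just under 1/c = 0.2
```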

Page 13:

Optimal upper bound

({0,1}^d, Ham), r > 0, c > 1.

S ≝ {0,1}^d ∪ {✔}, H ≝ { h_ab : dist(a, b) ≤ r },

h_ab(x) = ✔ if x = a or x = b,
          x otherwise.

Only pairs at distance ≤ r can ever collide, so q = 0 while p is positive (though exponentially small); hence ρ = ln(1/p)/ln(1/q) can be driven below 0.5 > 0.1 > 0.01 > 0.0001 > ….

Page 14:

Page 15:

Wait, what?

[IM’98, GIM’98] Theorem:

Given LSH family for (X, dist),

can solve “(r,cr)-near-neighbor search”

for n points with data structure of

size: Õ(n^{1+ρ})

query time: Õ(n^{ρ}) hash fcn evals

Page 16:

Page 17:

More results

For Jaccard similarity: ρ ≤ 1/c   [Bro’97]

For R^d with ℓ_p-distance: when p = 1 [IM’98], 0 < p < 1 [DIIM’04], p = 2 [AI’06].

For {0,1}^d with Hamming distance: ρ ≥ 0.462/c − o_d(1) (assuming q ≥ 2^{−o(d)})   [MNP’06]

— extends immediately to ℓ_p-distance.

Page 18:

Our Theorem

For {0,1}^d with Hamming distance (∃ r s.t.):

ρ ≥ 1/c − o_d(1) (assuming q ≥ 2^{−o(d)})

— extends immediately to ℓ_p-distance.

Proof also yields ρ ≥ 1/c for Jaccard.

Page 19:

Proof:

Page 20:

Proof:

Noise-stability is log-convex.

Page 21:

Proof:

A definition, and two lemmas.

Page 22:

Fix an arbitrary function h : {0,1}^d → S.

Pick x ∈ {0,1}^d at random (e.g. x = 011100100), giving h(x) = s.

Run a continuous-time (lazy) random walk from x for time τ,

reaching y (e.g. y = 001100110), giving h(y) = s′.

def: K_h(τ) ≝ Pr[h(x) = h(y)].

Page 23:

Lemma 1: For x →_τ y, dist(x, y) ≈ (τ/2)·d when τ ≪ 1.

Lemma 2: K_h(τ) is a log-convex function of τ (for any h).

From which the proof of ρ ≥ 1/c follows easily.

[Figure: K_h(τ) as a function of τ, starting at 1 at τ = 0 and decreasing.]

Page 24:

Continuous-Time Random Walk

An Exp(1) alarm clock repeatedly

— waits Exponential(1) seconds,

— dings.

(Reminder: T ~ Expon(1) means Pr[T > u] = e−u.)

In C.T.R.W. on {0,1}d, each coord. gets

its own independent alarm clock.

When ith clock dings, coord. i is rerandomized.

Page 25:

[Figure: x = 0111001001 evolves to y = 0101001011 over time τ; coordinates whose clocks ding are rerandomized.]

Pr[coord. i never updated] = Pr[Exp(1) > τ] = e^{−τ}

∴ Pr[x_i ≠ y_i] = (1 − e^{−τ})/2

⇒ Lemma 1: dist(x, y) ≈ ((1 − e^{−τ})/2)·d ≈ (τ/2)·d for τ ≪ 1.
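The per-coordinate computation behind Lemma 1 is easy to sanity-check by simulation: a coordinate is rerandomized iff its Exp(1) clock dings before time τ, so it ends up disagreeing with probability (1 − e^{−τ})/2. A sketch (τ and the trial count are my own illustration):

```python
import math
import random

rng = random.Random(2)
tau, trials = 0.3, 200000

def walk_bit(b, tau):
    """One coordinate of the CTRW for time tau: the bit is
    rerandomized iff its Exp(1) alarm clock dings before tau."""
    if rng.expovariate(1.0) <= tau:
        return rng.randrange(2)
    return b

flips = sum(walk_bit(0, tau) != 0 for _ in range(trials))
predicted = (1 - math.exp(-tau)) / 2     # Pr[x_i != y_i]
print(flips / trials, predicted)         # both ≈ 0.1296
```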

Page 26:

Lemma 2: K_h(τ) is a log-convex function of τ.

Remark: True for any reversible C.T.M.C.

Recall: for f : {0,1}^d → ℝ,  E[f(x)f(y)] = Σ_{T⊆[d]} f̂(T)² e^{−τ|T|}.

Given hash function h : {0,1}^d → S, for each s ∈ S, introduce

h_s : {0,1}^d → {0,1}, h_s(x) = 1_{h(x)=s}.
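Lemma 2 can also be checked numerically: compute K_h(τ) exactly for a small d (each coordinate of y disagrees with x independently w.p. (1 − e^{−τ})/2) and verify midpoint log-convexity, K_h((τ₁+τ₂)/2)² ≤ K_h(τ₁)·K_h(τ₂). The hash h below is my own toy example:

```python
import math
from itertools import product

def K(h, d, tau):
    """K_h(tau) = Pr[h(x) = h(y)] for x uniform on {0,1}^d and y the
    CTRW after time tau (each coord flips w.p. (1 - e^-tau)/2)."""
    pflip = (1 - math.exp(-tau)) / 2
    total = 0.0
    for x in product((0, 1), repeat=d):
        for y in product((0, 1), repeat=d):
            k = sum(a != b for a, b in zip(x, y))
            if h(x) == h(y):
                total += pflip**k * (1 - pflip)**(d - k) / 2**d
    return total

d = 3
h = lambda x: (x[0] ^ x[1], x[2])   # arbitrary hash into a 4-element range

# Midpoint log-convexity at a few illustrative time pairs.
for t1, t2 in [(0.1, 0.5), (0.2, 2.0), (0.5, 1.5)]:
    m = (t1 + t2) / 2
    assert K(h, d, m) ** 2 <= K(h, d, t1) * K(h, d, t2) + 1e-12

print(K(h, d, 0.0))                 # → 1.0 (x and y coincide at tau = 0)
```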

Page 27:

Proof of Lemma 2:

K_h(τ) = Σ_{s∈S} E[h_s(x) h_s(y)] = Σ_{s∈S} Σ_{T⊆[d]} ĥ_s(T)² e^{−τ|T|}

— a non-neg. lin. comb. of the log-convex functions e^{−τ|T|}, hence log-convex.

Page 28:

Lemma 1: For x →_τ y, dist(x, y) ≈ (τ/2)·d when τ ≪ 1.

Lemma 2: K_h(τ) is a log-convex function of τ.

Theorem: LSH for {0,1}^d requires ρ ≥ 1/c − o_d(1).

Page 29:

Proof: Say H is an LSH family for {0,1}^d with params r, (c − o(1))·r, p, q, where r ≈ (ϵ/2)d.

def: K_H(τ) ≝ avg over h∈H of K_h(τ). (Non-neg. lin. comb. of log-convex fcns, ∴ K_H(τ) is also log-convex.)

For x →_ϵ y: w.v.h.p., dist(x, y) ≈ (ϵ/2)d ≈ r, ∴ K_H(ϵ) ≳ q^ρ.

For x →_cϵ y: w.v.h.p., dist(x, y) ≈ (cϵ/2)d ≥ (c − o(1))·r, ∴ K_H(cϵ) ≲ q

(in truth, q + 2^{−Θ(d)}; we assume q not tiny).

Page 30:

∴ K_H(ϵ) ≳ q^ρ, K_H(cϵ) ≲ q, and K_H(0) = 1, i.e. ln K_H(0) = 0.

K_H(τ) is log-convex, so ln K_H(τ) is convex; comparing ln K_H(ϵ) against the chord from (0, 0) to (cϵ, ln K_H(cϵ)):

ρ ln q ≤ ln K_H(ϵ) ≤ (1/c)·ln K_H(cϵ) ≤ (1/c)·ln q.

Dividing by ln q < 0: ρ ≥ 1/c.

[Figure: plot of ln K_H(τ) vs. τ: convex, 0 at τ = 0, ≈ ρ ln q at τ = ϵ, ≈ ln q at τ = cϵ.]

Page 31:

Super-tedious, super-straightforward:

• Make Lemma 1 precise. (Chernoff)

• Make the approximations (e.g. (1 − e^{−τ})/2 ≈ τ/2) precise. (Taylor)

• Choose ϵ = ϵ(c, q, d) very carefully.

Theorem: ρ ≥ 1/c − o_d(1).

Meaningful iff q ≥ 2^{−o(d)}; i.e., q not tiny.

Page 32: Ryan O’Donnell (CMU, IAS) joint work with Yi Wu (CMU, IBM), Yuan Zhou (CMU)