kernel methods on £nite groups - columbia universityrisi/papers/kondornips2001.pdf · kernel...

29
Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery School of Computer Science Carnegie Mellon University 1

Upload: others

Post on 25-Mar-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Kernel methods on £nite groupsRisi Imre Kondor

Center for Automated Learning and DiscoverySchool of Computer ScienceCarnegie Mellon University

1

Page 2: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

The learning task

Data: (xi, yi) pairs xi∈Ω

Task: predict y from x

K : Ω× Ω 7→ R expresses our prior beliefs about similarity

eg. K ∼ e−(x1−x2)2/2σ2

2

Page 3: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

The kernel trick

φ : Ω 7→ H

K(x1, x2) = 〈φ(x1), φ(x2)〉

“ K induces an isometric embedding of Ω in the Hilbert space H”

3

Page 4: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Positive de£niteness

K symmetric and

Ω

Ω

f(x1) f(x2)K(x1, x2) dx1 dx2 ≥ 0 ∀f ∈ L2(Ω) (Mercer)

x1

x2

cx1cx2

K(x1, x2) ≥ 0 ∀c1, c2, . . . ∈ R

4

Page 5: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Can we do this on discrete Ω?

Can we use the kernel trick to unfold the internal structureof the object Ω?

5

Page 6: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Finite Groups

Finite set G with operation G×G 7→ G

• x1x2 ∈ G for any x1, x2 ∈ G (closure)

• x1(x2x3) = (x1x2)x3 (associativity)

• xe = ex = e for any x ∈ G (identity)

• x−1x = xx−1 = e (inverses)

6

Page 7: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Symmetric groups Sn

x1 = (12)(3)(4)(5) x1(ABCDE) = BACDE

x2 = (1)(324)(5) x2(ABCDE) = ACDBE

x3 = x1x2 x3(ABCDE) = x1(x2(ABCDE))

rankings, orderings, allocation, etc.natural sense of distance

7

Page 8: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Stationary kernels

K(x1, x2) = f(x2 x−11 )

∼K(x1, x2) = f(x2−x1)eg. K ∼ e−(x1−x2)

2/2σ2

f positive de£nite function on G

8

Page 9: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Bochner’s Theorem

f is positive de£nite and symmetrical on Rm iff f(ω) > 0

f(ω) =1√2π

m

∫e−iω ·xf(x) dx

is there an analogue for £nite groups?

9

Page 10: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Representation theory

ρ : G→ Cd×d

ρ(x1x2) = ρ(x1)ρ(x2)

Equivalence:

ρ1(x) = t−1ρ2(x) t ∀ x ∈ G

Reducibility:

t−1ρ(x) t =

ρ1(x) 0

0 ρ2(x)

∀ x ∈ G

10

Page 11: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Irreducible representations of S5

ρtrivial(x) ≡ (1) → ρ(5)

ρsign(x) ≡ (sgn(x)) → ρ(1,1,1,1,1)

ρdef.(x) ∈ C5×5 ex(i) = ρdef.(x) ei

e1, e2, e3, e4, e5 basis

11

Page 12: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Irreducible representations of S5

t−1 ρdef.(x) t =

ρtr.(x)

ρ(4,1)(x)

ρreg. = ρ(5) ⊕ 4ρ(4,1) ⊕ 5ρ(3,2) ⊕ 6ρ(3,1,1)⊕

5ρ(2,2,1) ⊕ 4ρ(2,1,1,1) ⊕ ρ(1,1,1,1,1)

12

Page 13: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Fourier transforms on Groups

f(ρ) =∑

x∈G

ρ(x) f(x) ρ∈R

F : CG 7→⊕

ρ∈R

Cdρ×dρ

inversion:

f(x) =1

|G |∑

ρ∈R

dρ trace[f(ρ)ρ(x−1)]

13

Page 14: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Main Theorem

The function f is positive de£nite on G if and only ifthe matrices f(ρ) are all positive de£nite.

14

Page 15: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Conjugacy classes

conjugacy classes: x1 ∼=Conj. x2 iff x1 = t−1 x2 t for some t

eg. on Sn (·)(·)(·)(·) . . .(· ·)(·)(·)(·) . . .(· · ·)(·)(·) . . .(· ·)(· ·)(·)(·) . . ....

15

Page 16: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Corollary to main theorem

The function f is positive de£nite on G and constanton conjugacy classes if and only if

f(x) =∑

ρ∈R

cρ χρ cρ > 0 .

characters:

χρ(x) = trace[ρ(x)]

16

Page 17: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

— break —

17

Page 18: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

S4

18

Page 19: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Diffusion kernel

K ∼ e βH = limn→∞

(1 +

βH

n

)n

Hij =

1 (i, j) ∈ E−di i = j

0 otherwise

(Laplacian)

19

Page 20: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Diffusion on graphs

Zi

bZi

bZi

bZi

bZi

bZi

bZi

E[Zi(0)] = 0Var[Zi(0)] = σ2

Zi(0) indep.

redistribution:Z(t+ 1) = (1 + bH)Z(t) t = 1, 2, . . .

20

Page 21: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Diffusion on graphs (cont.)

evolution operator T (t) = (1 + bH)t

Z(t) = T (t)Z(0)

covariance

Covij(t) = Zi(t) Zj(t) =(∑

i′

Tii′ Zi′(0))(∑

j′

Tjj′ Zj′(0))

Cov(t) = σ2 T (t)T T (t) = σ2 T (t)2

21

Page 22: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

The continuous limit

T (t) = limn→∞

(1 +

bH

n

)nt= ebtH

∂tT (t) = bH T (t)

Cov(t) = σ2 e 2btH

on square grid in Rn, limH = ∇2,

∂tCov = 2b∇2Cov → Cov(x1, x2) ∼ e−(x1−x2)

2/(2βt)

22

Page 23: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Diffusion kernels on groups

K ∼ e βH = limn→∞

(1 +

βH

n

)n= lim

n→∞

(1 +

βH

2n

)2n

Hx1x2= g(x1x

−12 ) g(x) = g(x−1)

• positive de£nite• stationary• continuous bandwidth parameter β (in£nitely divisible)

23

Page 24: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Diffusion kernels in general

K ∼ eβH = limn→∞

(1 +

βH

n

)n

any graph, any positive de£nite matrix

“The Lie group of positive de£nite kernels over a set Ω is generatedby the set of real symmetric bilinear forms over Ω.”

24

Page 25: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Experiment

American Psychological Association (1980) (Diaconis 1988)

5 candidates (5! = 120)5738 complete votes

Ranking votes

12345 30

12354 28

12435 27

12453 29

12534 35

12543 34

13245 102

13254 95

13425 35

13452 37

.

.

.

.

.

.

25

Page 26: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Experiment

26

Page 27: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Summary

Liberate Support Vector Machines, GaussianProcesses, etc. from R

m!

• Real data often lives on discrete objects with intrinsicstructure

• Many statistical algorithms involve searching, integrating, etc.over a large number of related objects / models

27

Page 28: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

Thanks

John Lafferty

School of Computer Science,Carnegie Mellon University

28

Page 29: Kernel methods on £nite groups - Columbia Universityrisi/papers/KondorNIPS2001.pdf · Kernel methods on £nite groups Risi Imre Kondor Center for Automated Learning and Discovery

• J.-P. Serre. Linear Representations of Finite Groups.Springer-Verlag, 1977.

• B. E. Sagan. The Symmetric Group. Springer Verlag, 2001.

• A. Terras. Fourier Analysis on Finte Groups and Applications.Cambridge, 1999.

• P. Diaconis. Group Representations in Probability andStatistics. Inst. of Math. Stat, 1988.

• C. Watkins. Dynamic Alignment Kernels. Tech. Report. RoyalHolloway, 1999.

• D. Haussler. Convolution Kernels on Discrete Structures.(unpublished) 1999.

29