determinantal point processesย ยท โ€ข special case: -dpp with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal...

40
Jose Gallego-Posada April 2021 Determinantal Point Processes Brahms 7-8 sli.do -- #MilaDPP

Upload: others

Post on 07-Aug-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Jose Gallego-Posada April 2021

Determinantal Point Processes

Brahms 7-8

sli.do -- #MilaDPP

Page 2: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Today's agenda

โ€ขWhy DPPs?

โ€ขDefinition and properties

โ€ขSampling

โ€ขApplications2

Page 3: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Bible for DPP in ML:

Foundations and Trends in Machine Learning

Determinantal Point Processes for Machine Learning

Alex Kulesza and Ben Taskar (2012) [link]

Presentation based on slides by :

โ€ข Simon Barthelmรฉ, Nicolas Tremblay, EUSIPCO19 [link]

โ€ข Alex Kulesza, Ben Taskar and Jennifer Gillenwater โ€“ CVPR13 [link]

3

Page 4: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

4

Guillaume Gautier, Rรฉmi Bardenet, Guillermo Polito, Michal Valko

https://github.com/jgalle29/dpp_slides

Page 6: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Variance reduction โ€“ Mean estimation

6

IID Samples[BT19, dpp_demo]

Page 7: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Variance reduction โ€“ Mean estimation

6

DPP Samples[BT19, dpp_demo]

Page 8: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

6

Variance reduction โ€“ Mean estimation

[BT19, dpp_demo]

Page 9: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Determinantal

Base set ๐’ด= {1,โ€ฆ , ๐‘›} from which we sample a random subset ๐’€.

๐’€ is distributed according to a point process ๐’ซ over 2๐’ด.

๐’ซ ๐’€ = ๐‘Œ depends on the determinant of a

matrix selected based on the elements of ๐‘Œ.

Point Process7

Page 10: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Poisson Process

8

โ€ข Simplest point processโ€ฆ too simple!

โ€ข Element memberships are parameterized by independent Bernoulli rvs.

โ€ข Special case of a DPP with marginal kernel ๐”Ž = ๐ท๐’‘.

๐’ซ ๐’€ = ๐‘Œ = เท‘

๐‘–โˆˆ๐‘Œ

๐‘๐‘– เท‘

๐‘–โˆ‰๐‘Œ

(1 โˆ’ ๐‘๐‘–)

Page 11: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Desiderata:

i. Density is tractable; including normalization constant

ii. Inclusion probabilities (marginals) are tractable

iii. Sampling is tractable

iv. Model is easy to understand

Representing repulsion

Contrary to most Gibbs processes (normalized, exponentiated potentials),

DPPs tick all the boxes 9

Page 12: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

GM

s vs D

PPs

Loopy, negative interactions are hard

(Inference becomes intractable; worst case)

Global, negative interactions are easy

10

[KTG13]

Page 13: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

๐”-ensembles

โ€ข Model repulsion based on similarity between elements of ๐’ด.

โ€ข Similarity between elements ๐‘– and ๐‘— is stored in ๐”๐‘–๐‘—.

โ€ข We assume ๐” to be positive definite.

โ€ข ๐” is known as the likelihood kernel.

We say that ๐’€ is distributed according to a DPP if:

๐’ซ ๐’€ = ๐‘Œ โˆ det ๐”๐‘Œ11

Page 14: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

12

Where did the repulsion go?

๐’ซ ๐’€ = ๐‘Œ โˆ det ๐”๐‘Œ = det 2 ๐”…๐‘Œ

๐”…1

๐”…2

๐”๐‘Œ = [๐”…๐‘‡๐”…]๐‘Œ

๐”{1,2,4} =

Embedding of ๐’ด

Page 15: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

13

Where did the repulsion go?

๐’ซ ๐‘–, ๐‘— โˆ ๐’ซ ๐‘– ๐’ซ ๐‘— โˆ’๐”๐‘–๐‘—

det(๐” + ๐•€)

2

Vol ๐”…๐‘– = det ๐”…

[KTG13]

๐”…1

๐”…2

Page 16: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

14

Where did the repulsion go?

๐”๐‘Œ

Probability under a DPP grows with the spanned volume[BT19, dpp_demo]

Page 17: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

15

Normalization

๐ดโŠ‚๐‘ŒโŠ‚๐’ด

det ๐”๐‘Œ = det ๐” + ๐•€ าง๐ด

Analytic normalization

constant!

Page 18: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

16

Exploit linear-algebraic properties to make

inference/sampling easy(or feasible in high-dims)

Page 19: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

17

Marginal kernels

โ€ข Consider a DPP with L-ensemble ๐”.

โ€ข The inclusion (marginal) probability that ๐’€ contains a set ๐‘† is given by:

with ๐”Ž = ๐” ๐” + ๐•€ โˆ’1.

โ€ข ๐”Ž is known as the marginal kernel of the DPP.

โ€ข ๐’ซ ๐‘– โˆˆ ๐’€ = ๐”Ž๐‘–๐‘–.

โ€ข ๐”ผ ๐’€ = ๐”ผ ฯƒ๐‘– ๐Ÿ™๐‘–โˆˆ๐’€ = ฯƒ๐‘–๐’ซ ๐‘– โˆˆ ๐’€ = tr ๐”Ž.

๐’ซ ๐‘† โŠ‚ ๐’€ =1

๐”–

๐‘†โŠ‚๐‘Œ

det ๐”๐‘Œ = det ๐”Ž๐‘†

Page 20: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

18

Conditioning

๐’ซ ๐ต โŠ‚ ๐’€ |๐ด โŠ‚ ๐’€ =๐’ซ ๐ด โˆช ๐ต โŠ‚ ๐’€

๐’ซ ๐ด โŠ‚ ๐’€=det ๐”Ž๐ดโˆช๐ตdet ๐”Ž๐ด

= det ๐”Ž๐ต โˆ’ ๐”Ž๐ต๐ด๐”Ž๐ดโˆ’1๐”Ž๐ด๐ต

๐”Ž๐ดโˆช๐ต =๐”Ž๐ต

๐”Ž๐ด

๐”Ž๐ต๐ด

๐”Ž๐ด๐ต

det ๐”Ž๐ดโˆช๐ต = det ๐”Ž๐ด det ๐”Ž๐ต โˆ’ ๐”Ž๐ต๐ด๐”Ž๐ดโˆ’1๐”Ž๐ด๐ต

Schur complement

DPPs are closed under conditioning!

Page 21: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

19

Complexity?

โ€ข Evaluation of ๐” - ๐’ช ๐‘›2

โ€ข Normalization constant - ๐’ช ๐‘›3 [determinant]

โ€ข Marginal probabilities - ๐’ช ๐‘›3 [matrix inversion]

โ€ข Conditional probabilities - ๐’ช ๐‘›3 [Schur complement]

Page 22: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

Questions?

Brahms 7-8

Page 23: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

20

Extensions

Conditional

๐‘˜-

StructuredDPPs

Non-symmetric

Page 24: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

21

๐‘˜-DPPs

โ€ข In practical applications, often preferred to limit cardinality of the set

โ€ข Search results

โ€ข Minibatch selection

โ€ข Summarization

โ€ข Normalization constant ฯƒ ๐’€ =๐‘˜ det ๐”๐‘Œ = ๐‘’๐‘˜ ๐œ†1, โ€ฆ , ๐œ†๐‘ [๐‘˜-th elementary sym. polynomial]

โ€ข Special case: 1-DPP

โ€ข Need not have a corresponding marginal kernel

๐’ซ ๐’€ = ๐‘Œ โˆ det ๐”๐‘Œ ๐Ÿ ๐’€ =๐‘˜

Page 25: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

22

Elementary ๐œ‹-DPPs

โ€ข Special case: ๐‘˜-DPP with ๐‘˜ = rank ๐” and ๐” = ๐‘‰๐›ฌ๐‘‰๐‘‡, has marginal kernel ๐”Ž = ๐‘‰๐‘‰๐‘‡

โ€ข A DPP is called elementary if the spectrum of its marginal kernel is 0, 1 .

โ€ข We denote this process as ๐’ซ๐‘‰.

โ€ข If ๐’€ โˆผ ๐’ซ๐‘‰, then ๐’€ = ๐‘‰ with probability one. ( ๐’€ is a sum of Bernoulli rvs.)

โ€ข ๐”Ž is a projection matrix โ€“ also called projection DPPs

๐”Ž๐‘‰ = ฯƒ๐“ฟโˆˆ๐‘‰๐“ฟ๐“ฟ๐‘‡

Page 26: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

23

Hierarchy of DPPs

Strongly Rayleigh

๐‘˜-DPPs

DPPs

๐”-ensembles

๐œ‹-DPPs

Page 27: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

24

Cauchy-Binet Lemma

๐น๐‘› =ฯ†๐‘› โˆ’ ๐œ“๐‘›

ฯ† โˆ’ ๐œ“

JPM Binet

โ€ข Consider matrices ๐ด of size ๐‘Ÿ ร— ๐‘  and ๐ต of size ๐‘  ร— ๐‘Ÿ

โ€ข For each ๐‘Ÿ-subset ๐‘Œ โŠ‚ [1,โ€ฆ , ๐‘Ÿ], construct square matrices ๐ด:๐‘Œ and ๐ต๐‘Œ:

det ๐ด๐ต = ฯƒ ๐’€ =๐‘Ÿ det ๐ด:๐‘Œ det ๐ต๐‘Œ:

[Proof]

Page 28: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

25

DPPs as mixture models

๐’ซ ๐’€ = ๐‘Œ โˆ det ๐”๐‘Œ = det ๐‘‰๐›ฌ๐‘‰ ๐‘Œ

= det ๐‘‰๐‘Œ: ๐›ฌ ๐›ฌ ๐‘‰:๐‘Œ

=

๐‘ = ๐‘Œ

det ๐‘‰๐‘Œ๐‘ ๐›ฌ๐‘๐‘ det ๐›ฌ๐‘๐‘๐‘‰๐‘๐‘Œ

=

๐‘ = ๐‘Œ

det ๐‘‰๐‘Œ๐‘๐‘‰๐‘Œ๐‘๐‘‡ det ๐›ฌ๐‘๐‘

Elementary

DPP

Diagonal

๐”-ensemble

Page 29: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

26

Sampling

๐’ซ โˆ

๐ฝโŠ‚๐’ด

๐’ซ๐‘‰๐ฝ เท‘

๐‘›โˆˆ๐ฝ

๐œ†๐‘› =

๐ฝโŠ‚๐’ด

๐’ซ๐‘‰๐ฝ det ๐‘ฝ๐ฝ

โ€ข Consider a DPP with L-ensemble ๐” = ฯƒ๐‘› ๐œ†๐‘›๐“ฟ๐‘›๐“ฟ๐‘›๐‘‡ .

โ€ข For each subset ๐ฝ โŠ‚ ๐’ด, let ๐‘‰๐ฝ denote the set ๐“ฟ๐‘› ๐‘›โˆˆ๐ฝ and the elementary DPP ๐’ซ๐‘‰๐ฝ .

Factorize the original DPP as a

mixture of elementary DPPs

Page 30: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

27

Sampling via spectral decomposition

by sequential exploiting closure

of DPPs under conditioningPr ๐ฝ โˆเท‘

๐‘›โˆˆ๐ฝ

๐œ†๐‘›

STAGE ONE STAGE TWO

Draw a sample from ๐’ซ๐ฝChoose elementary DPP ๐’ซ๐ฝ

based on mixture weight

[KT12 โ€“ p.145]

Page 31: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

28

Sampling in action

[KT12]

Page 32: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

29

Advanced sampling

โ€ข Spectral method for sampling has cost ๐’ช ๐‘›2 + ๐‘›3 + ๐‘›๐‘˜2

โ€ข Dual sampling: instead of using ๐” = ๐”…๐‘‡๐”… with ๐”… ๐‘‘ ร— ๐‘› use โ„ญ = ๐”…๐”…๐‘‡ [KT12ยง3.3]

โ€ข Random projections

โ€ข Nystrรถm approximations: Low rank approximation [Li, Jegelka, Sra 16a]

โ€ข MCMC sampling [LJS16b]

โ€ข Add, remove, swap

โ€ข Prove fast mixing for chains in terms of total variation

โ€ข Distortion-free intermediate sampling [Derezinski 18; CDV20]

โ€ข Suitably construct an intermediate subset ๐œŽ and then subsample from it

Page 33: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

30

Learning DPPs

โ€ข Basic setting: Maximum Likelihood

โ€ข Given ๐‘Œ๐‘ก ๐‘ก=1๐‘‡ subsets of ๐’ด. Parameterize ๐”-ensemble as ๐” ๐œƒ

argmax๐œƒ

logเท‘

๐‘ก

๐’ซ๐œƒ ๐‘Œ๐‘ก =

๐‘ก

log det ๐”๐‘Œ๐‘ก(๐œƒ) โˆ’ log det(๐” ๐œƒ + ๐•€)

โ€ข Can use gradient-based methods for optimizing ๐œƒ

โ€ข Can be extended to conditioning on a covariate ๐‘‹: ๐” ๐œƒ, ๐‘‹

โ€ข For each ๐‘‹ we have a DPP

โ€ข ๐‘‹ may be a query during search on which we want to condition the distribution over results

โ€ข See [KT12ยง4] for more details

Page 34: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

31

Applications

Image search

{Relevance vs Diversity}Extractive summarization

[KT12]

Page 35: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

32

Applications

โ€ข (Quasi) Monte-Carlo integration (Gautier et al., On two ways to use DPPs for Monte Carlo integration, 2019)

โ€ข Mini-batch sampling for SGD (Zhang et al., DPPs for Mini-Batch Diversification, 2017)

โ€ข Coresets (Tremblay et al., DPPs for Coresets, 2018)

Page 36: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

33

DPPs in Randomized LinAlg

๐‘คโˆ— = argmin๐‘ค

๐‘ฟ๐‘ค โˆ’ ๐‘ฆ 2 = ๐‘ฟโ€ ๐‘ฆ

โ€ข Consider a linear regression problem with a tall, full-rank matrix ๐‘ฟ โˆˆ โ„๐‘›ร—๐‘‘ with ๐‘› โ‰ซ ๐‘‘

โ€ข Sketching: approximating matrix เทฉ๐‘ฟ (subset of rows, low-rank)

โ€ข Usual bounds have (ํœ€,๐›ฟ)-PAC flavour

โ€ข If ๐‘† โˆผ ๐‘‘-DPP(๐‘ฟ๐‘ฟ๐‘‡), then ๐”ผ[๐‘ฟ๐‘†:โˆ’1๐‘ฆ] = ๐‘คโˆ— [leverage scores]

โ€ข If ๐‘† โˆผDPP1

๐œ†๐‘ฟ๐‘ฟ๐‘‡ , then ๐”ผ[๐‘ฟ๐‘†:

โ€  ๐‘ฆ] = argmin๐‘ค

๐‘ฟ๐‘ค โˆ’ ๐‘ฆ 2 + ๐œ† ๐‘ค 2 [ridge l.s.]

[DM20]

Page 37: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

34

Minibatch sampling for LinReg

โ€ข Previously we related sampling with properties of analytic solution

โ€ข What is the influence of non-iid sampling during stochastic optimization?

โ€ข Previous work by [Zhang, Kjellstrรถm, Mandt 17] for variance reduction

โ€ข Toy example: linear model

โ€ข Gradients are โ€˜constantโ€™ and correspond to points

โ€ข Redundant points lead to redundant sampled gradients

โ€ข Sample minibatches ๐‘† โˆผ ๐‘‘-DPP ๐‘ฟ๐‘ฟ๐‘‡ and run SGD with momentum

Page 38: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

35

Minibatch sampling for LinReg

๐’…-DPP

IID

Train Test

๐œ‚ = 1 ร— 10โˆ’1 ๐œ‚ = 2.5 ร— 10โˆ’1 ๐œ‚ = 3.5 ร— 10โˆ’1 ๐œ‚ = 4 ร— 10โˆ’1

[optim_demo]

Page 39: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

36

Overparameterized regime

๐’Œ-DPP

IID

Train

[optim_demo]

Page 40: Determinantal Point Processesย ยท โ€ข Special case: -DPP with =rank and =๐‘‰๐›ฌ๐‘‰ , has marginal kernel =๐‘‰๐‘‰ โ€ข A DPP is called elementary if the spectrum of its marginal

37

Determinantal Point Processes

are elegant, efficient and useful

models of repulsion