Topic Discovery Models

Post on 23-May-2020

Page 1: Topic Discovery Models - Artificial Intelligencevision.stanford.edu/teaching/cs231b_spring1213/... · Topic Discovery Models . Algorithm Overview •Input: Large collection of unlabeled

Topic Discovery Models

Algorithm Overview

• Input: Large collection of unlabeled images

• Generate multiple segmentations

– Normalized Cuts, varying K and image scale

• Obtain visual words for each segment

– Quantized SIFT descriptors

– Bag-of-words

• Topic discovery models
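
The visual-word step above can be sketched as follows. This is a minimal illustration, not the pipeline used in the slides: it assumes a codebook of cluster centres already exists (for example from k-means), and it uses random vectors as stand-ins for SIFT descriptors. The names `quantize`, `bag_of_words`, `codebook`, and `segment_descriptors` are hypothetical.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Map each descriptor to the index of its nearest codebook vector."""
    # Squared Euclidean distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bag_of_words(word_ids, n_words):
    """Count how often each visual word occurs in one segment."""
    return np.bincount(word_ids, minlength=n_words)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(50, 128))              # assumed 50-word vocabulary
segment_descriptors = rng.normal(size=(200, 128))  # stand-in for one segment's SIFT descriptors

word_ids = quantize(segment_descriptors, codebook)
histogram = bag_of_words(word_ids, n_words=len(codebook))
```

Each segment then plays the role of a "document" whose "words" are the entries of `word_ids`, which is the form of input a topic model expects.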

Topic Models

• Analyze collection of segments and discover ‘topics’

• Topics: groups of visually similar, frequently occurring segments

• Why topic models?

• Insight:

– segments corresponding to objects will be represented by coherent groups (topics)

– segments overlapping object boundaries will need to be explained by a mixture of several groups (topics)

• Originally designed for text analysis

– Discovering topics in documents

• Probabilistic Latent Semantic Analysis (pLSA)

• Latent Dirichlet Allocation (LDA)

• Following slides courtesy of David Blei

• 100-topic LDA over 17,000 Science articles

• Discovering objects in images

LDA

• Gibbs sampling

– obtain samples from probability distribution

– for each variable in turn, condition on the current values of all other variables and sample from the resulting conditional posterior

– collect sequence of samples and estimate probabilities

• Collapsed Gibbs sampling

– analytically compute the posterior of certain variables

– integrate out theta/beta parameters

– sample topic assignments z for each word using Gibbs sampling

– estimate parameters from these samples
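
The collapsed Gibbs steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation behind the slides: it uses symmetric Dirichlet hyperparameters (`alpha` for topic proportions, `eta` for topic-word distributions, both hypothetical names and values), and it estimates theta and beta from the counts of the final sample only rather than averaging over several samples.

```python
import numpy as np

def collapsed_gibbs_lda(docs, n_topics, n_words, alpha=0.1, eta=0.01,
                        n_iters=50, seed=0):
    """docs: list of lists of word ids. Returns (theta, beta, z)."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), n_topics))  # topic counts per document
    nkw = np.zeros((n_topics, n_words))    # word counts per topic
    nk = np.zeros(n_topics)                # total words per topic
    z = []
    # Random initial topic assignment for every word.
    for d, doc in enumerate(docs):
        zd = rng.integers(n_topics, size=len(doc))
        z.append(zd)
        for w, k in zip(doc, zd):
            ndk[d, k] += 1
            nkw[k, w] += 1
            nk[k] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                ndk[d, k] -= 1
                nkw[k, w] -= 1
                nk[k] -= 1
                # Conditional for z with theta and beta integrated out.
                p = (ndk[d] + alpha) * (nkw[:, w] + eta) / (nk + n_words * eta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1
                nkw[k, w] += 1
                nk[k] += 1
    # Point estimates from the final sample's counts.
    theta = (ndk + alpha) / (ndk + alpha).sum(axis=1, keepdims=True)
    beta = (nkw + eta) / (nkw + eta).sum(axis=1, keepdims=True)
    return theta, beta, z

# Toy corpus: words 0-2 and words 3-5 never co-occur in a document.
docs = [[0, 1, 2, 0, 1, 2], [3, 4, 5, 3, 4, 5],
        [0, 0, 1, 2, 2, 1], [5, 4, 3, 3, 5, 4]]
theta, beta, z = collapsed_gibbs_lda(docs, n_topics=2, n_words=6)
```

Only the assignments `z` are sampled; theta and beta never appear inside the loop, which is what "collapsed" refers to.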

pLSA

• Simple topic model

– topic proportions are treated as fixed parameters, not latent random variables

– learns topic mixtures only for the training documents

– prone to overfitting, because the number of parameters grows with the number of documents
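
A minimal sketch of pLSA fitted by EM makes the parameter-space point concrete. It assumes a dense document-by-word count matrix in which every document and every word has at least one count; the function name `plsa` and the toy data are hypothetical, and the small constant added inside the logarithm and the division is only a numerical guard.

```python
import numpy as np

def plsa(counts, n_topics, n_iters=50, seed=0):
    """counts: (n_docs, n_words) array. Returns P(z|d), P(w|z), log-likelihoods."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics)) + 0.01
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words)) + 0.01
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    log_liks = []
    for _ in range(n_iters):
        # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]  # (docs, words, topics)
        p_w_d = joint.sum(axis=2)                        # (docs, words)
        log_liks.append(float((counts * np.log(p_w_d + 1e-12)).sum()))
        resp = joint / (p_w_d[:, :, None] + 1e-12)
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, :, None] * resp
        p_w_z = weighted.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z, log_liks

rng = np.random.default_rng(1)
counts = rng.poisson(2.0, size=(6, 8)) + 1  # toy data, every entry positive
p_z_d, p_w_z, log_liks = plsa(counts, n_topics=3)
n_params = p_z_d.size + p_w_z.size          # grows with the number of documents
```

Here `p_z_d` holds one row of topic proportions per training document, so there is no row for an unseen document and `n_params` increases with every document added.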