Topic Discovery Models
Algorithm Overview
• Input: Large collection of unlabeled images
• Generate multiple segmentations
– Normalized Cuts, varying the number of segments K and the image scale
• Obtain visual words for each segment
– Quantized SIFT descriptors
– Bag-of-words
• Topic discovery models
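The visual-word step above can be sketched as follows. This is a minimal illustration, not the pipeline used in the slides: the codebook, the 2-D descriptors (stand-ins for 128-D SIFT vectors), and the helper names `nearest_word` and `bag_of_words` are all hypothetical.

```python
from collections import Counter

def nearest_word(descriptor, codebook):
    """Index of the codebook centre closest to the descriptor (squared Euclidean)."""
    def sq_dist(center):
        return sum((d - c) ** 2 for d, c in zip(descriptor, center))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))

def bag_of_words(descriptors, codebook):
    """Histogram of visual-word counts for one segment (order of words is ignored)."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]

# Toy 2-D stand-ins for SIFT descriptors; values are made up for illustration.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
segment_descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.1, 0.9)]
histogram = bag_of_words(segment_descriptors, codebook)
print(histogram)  # [2, 1, 1]
```

Each segment then plays the role of a "document" whose "words" are these quantized descriptors, which is what lets a text topic model be applied to images.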
Topic Models
• Analyze collection of segments and discover ‘topics’
• Topics: groups of visually similar, frequently occurring segments
Topic Models
• Why topic models?
• Insight:
– segments corresponding to objects will be represented by coherent groups (topics)
– segments overlapping object boundaries will need to be explained by a mixture of several groups (topics)
Topic Models
• Originally designed for text analysis
– Discovering topics in documents
• Probabilistic Latent Semantic Analysis (pLSA)
• Latent Dirichlet Allocation (LDA)
• Following slides courtesy of David Blei
Topic Models
• 100-topic LDA over 17,000 Science articles
Topic Models
• Discovering objects in images
LDA
• Gibbs sampling
– obtain samples from a probability distribution that is hard to sample from directly
– for each variable in turn, sample from its conditional distribution given the current values of all other variables
– collect the sequence of samples and estimate probabilities from it
• Collapsed Gibbs sampling
– analytically integrate out (marginalize) certain variables and sample only the remaining ones
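As a minimal sketch of plain Gibbs sampling, the example below draws from a standard bivariate normal with correlation `rho`. The target is a toy choice that does not appear in the slides; it is convenient because each conditional is a 1-D normal. The function name and the burn-in length are assumptions.

```python
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = (1.0 - rho ** 2) ** 0.5  # std. dev. of each 1-D conditional
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        # Sample each variable conditioned on the current value of the other:
        # x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y | x.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if step >= burn_in:  # discard early samples taken before the chain settles
            samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
mean_x = sum(x for x, _ in samples) / len(samples)
corr_estimate = sum(x * y for x, y in samples) / len(samples)  # E[xy] equals rho here
```

The estimates `mean_x` and `corr_estimate` should land near 0 and 0.8, but they are Monte Carlo averages over correlated samples, so they vary with the seed and chain length.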
LDA
• Collapsed Gibbs sampling
– integrate out the parameters theta (per-document topic proportions) and beta (per-topic word distributions)
– sample a topic assignment z for each word using Gibbs sampling
– estimate theta and beta from these samples
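The three steps above can be sketched in pure Python. This is a teaching sketch under stated assumptions, not the implementation behind the slides: symmetric Dirichlet hyperparameters `alpha` (on theta) and `eta` (on beta), a fixed iteration count, parameter estimates taken from the final sample only, and a made-up toy corpus of word ids.

```python
import random

def lda_collapsed_gibbs(docs, n_topics, vocab_size, alpha=0.1, eta=0.01,
                        n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; each doc is a list of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # tokens in doc d with topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # times word w has topic k
    topic_total = [0] * n_topics                              # tokens assigned to topic k

    # Random initial topic assignment z for every word token.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional of z given all other assignments; theta and beta
                # are integrated out, so only counts appear.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + eta)
                    / (topic_total[t] + vocab_size * eta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Estimate theta and beta from the final counts (smoothed by the priors).
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    beta = [[(c + eta) / (topic_total[k] + vocab_size * eta) for c in topic_word[k]]
            for k in range(n_topics)]
    return theta, beta, z

# Toy corpus (hypothetical): two docs use words 0-2, two use words 3-5.
docs = [[0, 1, 2, 0, 1], [0, 2, 1, 1], [3, 4, 5, 3], [4, 5, 3, 5, 4]]
theta, beta, z = lda_collapsed_gibbs(docs, n_topics=2, vocab_size=6)
```

Each row of `theta` and `beta` is a probability distribution. Averaging over several samples rather than using only the last one is a common refinement, omitted here for brevity.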
pLSA
• Simple topic model
– topic proportions are parameters, not latent random variables
– learns topic mixtures only for the training documents
– prone to overfitting: the number of parameters grows with the number of training documents
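For comparison, pLSA is usually fit with EM. The sketch below assumes a dense document-by-word count matrix, random initialization, and a small constant `eps` added purely as a numerical guard. The function name `plsa_em` and the toy counts are hypothetical.

```python
import random

def plsa_em(counts, n_topics, n_iters=50, seed=0, eps=1e-12):
    """EM for pLSA; counts[d][w] is how often word w occurs in document d."""
    rng = random.Random(seed)
    n_docs, vocab_size = len(counts), len(counts[0])

    def normalize(row):
        # eps keeps every entry positive and avoids division by zero.
        total = sum(row) + eps * len(row)
        return [(v + eps) / total for v in row]

    # P(z | d): one topic mixture per training document (a free parameter each).
    p_z_d = [normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_docs)]
    # P(w | z): one word distribution per topic.
    p_w_z = [normalize([rng.random() for _ in range(vocab_size)])
             for _ in range(n_topics)]

    for _ in range(n_iters):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * vocab_size for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(vocab_size):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                post = normalize([p_z_d[d][k] * p_w_z[k][w]
                                  for k in range(n_topics)])
                # M-step accumulators: expected counts per topic.
                for k in range(n_topics):
                    new_z_d[d][k] += n * post[k]
                    new_w_z[k][w] += n * post[k]
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z

# Toy counts (made up): 4 documents over a 4-word vocabulary.
counts = [[5, 5, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 4]]
p_z_d, p_w_z = plsa_em(counts, n_topics=2)
n_params = len(p_z_d) * 2 + 2 * len(counts[0])  # D*K + K*V table entries
```

`p_z_d` has one row per training document, so the parameter count grows linearly with the corpus. This is the large parameter space mentioned above, and it is what LDA avoids by treating topic proportions as latent variables drawn from a shared prior.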