MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification
J. Zhu, A. Ahmed and E.P. Xing, Carnegie Mellon University
ICML 2009
Presented by Haojun Chen
Sources: http://www.cs.cmu.edu/~junzhu/medlda.htm
Outline
• Motivation
• Supervised topic model (sLDA) and Support vector regression (SVR)
• Maximum entropy discrimination LDA (MedLDA)
  – MedLDA for Regression
  – MedLDA for Classification
• Experimental Results
• Conclusion
Motivation
• Learning latent topic models with side information, as in sLDA, has attracted increasing attention.
• In sLDA, maximum likelihood estimation is used for posterior inference and parameter estimation.
• Max-margin classification methods, such as SVMs, have demonstrated success in many applications.
• This paper proposes a general principle for learning max-margin discriminative supervised latent topic models for both regression and classification.
Supervised Topic Model (sLDA)
• Joint distribution for sLDA
• Variational MLE for sLDA
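For reference, the sLDA joint distribution and its variational bound are commonly written as below. This is a sketch in the usual notation, which is assumed here rather than taken from the slide: $\bar{z}_d$ is the average topic assignment of document $d$, $\delta^2$ is the response variance, and $\mathcal{L}^{s}$ names the bound.

```latex
% Joint distribution of sLDA over D documents with N words each
p(\theta, z, y, W \mid \alpha, \beta, \eta, \delta^2)
  = \prod_{d=1}^{D} p(\theta_d \mid \alpha)
    \Big( \prod_{n=1}^{N} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \Big)
    p(y_d \mid \eta^\top \bar{z}_d, \delta^2),
\qquad \bar{z}_d = \frac{1}{N} \sum_{n=1}^{N} z_{dn}

% Variational MLE: minimize an upper bound on the negative log-likelihood
\mathcal{L}^{s}(q)
  = -\mathbb{E}_q\big[\log p(\theta, z, y, W \mid \alpha, \beta, \eta, \delta^2)\big]
    - \mathcal{H}\big(q(z, \theta)\big)
```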
Support Vector Regression (SVR)
• Given a training set $\mathcal{D} = \{(x_d, y_d)\}_{d=1}^{D}$, linear SVR finds an optimal linear function $h(x) = \eta^\top x$ by solving the following constrained convex optimization problem
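The standard linear SVR primal with an $\epsilon$-insensitive tube is sketched below. The symbols are assumptions chosen to match common usage: $C$ is the regularization constant, $\epsilon$ the precision (tube width), and $\xi_d, \xi_d^*$ the slack variables for under- and over-prediction.

```latex
\min_{\eta,\, \xi,\, \xi^*}\;
  \frac{1}{2}\, \|\eta\|_2^2 + C \sum_{d=1}^{D} \big( \xi_d + \xi_d^* \big)
\quad \text{s.t.}\quad \forall d:\;
\begin{cases}
  y_d - \eta^\top x_d \le \epsilon + \xi_d \\
  -y_d + \eta^\top x_d \le \epsilon + \xi_d^* \\
  \xi_d \ge 0,\; \xi_d^* \ge 0
\end{cases}
```

Points whose residual lies inside the tube incur no loss, so only documents outside it influence the solution.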
Max-Entropy Discrimination LDA (MedLDA)
• Maximum entropy discrimination LDA (MedLDA): an integration of max-margin prediction models (e.g. SVR and SVM) and hierarchical Bayesian topic models (e.g. LDA and sLDA)
• Specifically, a distribution is learned in a max-margin manner in MedLDA.
• This paper develops MedLDA for both regression and classification.
MedLDA for Regression
• For regression, MedLDA is defined as an integration of Bayesian sLDA and SVR
where $q$ is the variational approximation to the posterior
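A hedged sketch of the regression problem, combining the Bayesian sLDA variational bound with SVR-style constraints. The notation is assumed from the usual presentation of MedLDA: $\mathcal{L}(q)$ is the variational upper bound on the negative log-likelihood, $\bar{Z}_d$ is the average topic assignment of document $d$, and $\sigma^2$ is the prior variance on $\eta$.

```latex
\min_{q,\, \alpha,\, \beta,\, \delta^2,\, \xi,\, \xi^*}\;
  \mathcal{L}(q) + C \sum_{d=1}^{D} \big( \xi_d + \xi_d^* \big)
\quad \text{s.t.}\quad \forall d:\;
\begin{cases}
  y_d - \mathbb{E}\big[\eta^\top \bar{Z}_d\big] \le \epsilon + \xi_d \\
  -y_d + \mathbb{E}\big[\eta^\top \bar{Z}_d\big] \le \epsilon + \xi_d^* \\
  \xi_d \ge 0,\; \xi_d^* \ge 0
\end{cases}

% Variational bound of Bayesian sLDA, with eta treated as a random variable
\mathcal{L}(q)
  = -\mathbb{E}_q\big[\log p(\theta, z, \eta, y, W \mid \alpha, \beta, \delta^2, \sigma^2)\big]
    - \mathcal{H}\big(q(\theta, z, \eta)\big)
```

Compared with plain SVR, the margin constraints are placed on the expected prediction under $q$, so the topic representation and the regressor are learned jointly.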
EM Algorithm for MedLDA Regression
• Variational EM Algorithm:
• The key difference between sLDA and MedLDA lies in updating
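To illustrate how a max-margin step enters the alternating procedure, here is a minimal sketch that holds the expected topic proportions fixed and fits the regression weights under the epsilon-insensitive loss. It is a simplification and not the paper's update: it learns a point estimate of eta by plain subgradient descent on the SVR primal instead of updating a distribution, and the names `fit_eta`, `eps_insensitive_objective`, and `zbar`, along with the synthetic data, are hypothetical.

```python
import numpy as np


def eps_insensitive_objective(eta, zbar, y, C=1.0, eps=0.1):
    """Linear SVR primal value: 0.5*||eta||^2 + C * sum of epsilon-insensitive losses."""
    residual = y - zbar @ eta
    # Loss is zero inside the tube |residual| <= eps, linear outside it.
    slack = np.maximum(0.0, np.abs(residual) - eps)
    return 0.5 * eta @ eta + C * slack.sum()


def fit_eta(zbar, y, C=1.0, eps=0.1, steps=2000, lr=0.01):
    """Subgradient descent on the SVR primal with topic proportions held fixed.

    Assumption: a point estimate of eta stands in for a learned distribution.
    """
    eta = np.zeros(zbar.shape[1])
    best_eta = eta.copy()
    best_val = eps_insensitive_objective(eta, zbar, y, C, eps)
    for t in range(1, steps + 1):
        residual = y - zbar @ eta
        # Only documents outside the tube contribute to the loss subgradient.
        active = np.abs(residual) > eps
        grad = eta - C * (np.sign(residual[active]) @ zbar[active])
        eta = eta - (lr / np.sqrt(t)) * grad
        val = eps_insensitive_objective(eta, zbar, y, C, eps)
        if val < best_val:  # keep the best iterate; subgradient steps are not monotone
            best_eta, best_val = eta.copy(), val
    return best_eta


# Hypothetical data: 50 documents, K = 3 topics, responses linear in topic proportions.
rng = np.random.default_rng(0)
zbar = rng.dirichlet(np.ones(3), size=50)
true_eta = np.array([1.0, -2.0, 0.5])
y = zbar @ true_eta + 0.01 * rng.standard_normal(50)
eta_hat = fit_eta(zbar, y)
```

In a full variational EM loop, a step like this would alternate with updates of the per-document variational parameters and the topic parameters.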
MedLDA for Classification
• Similar to the regression model, the integrated LDA and multi-class classification model is defined as follows:
where
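A hedged sketch of the classification problem in the notation usually used for MedLDA. The symbols are assumptions: $\mathcal{L}(q)$ is the variational bound of unsupervised LDA, $p_0(\eta)$ is a prior over the classifier weights, and $f(y, \bar{Z}_d)$ is a feature vector that places $\bar{Z}_d$ in the block belonging to class $y$ and zeros elsewhere.

```latex
\min_{q,\, q(\eta),\, \alpha,\, \beta,\, \xi}\;
  \mathcal{L}(q) + \mathrm{KL}\big( q(\eta) \,\|\, p_0(\eta) \big) + C \sum_{d=1}^{D} \xi_d
\quad \text{s.t.}\quad \forall d,\ \forall y \neq y_d:\;
  \mathbb{E}\big[\eta^\top \Delta f_d(y)\big] \ge 1 - \xi_d,\quad \xi_d \ge 0

% Feature difference between the true label y_d and a competing label y
\Delta f_d(y) = f(y_d, \bar{Z}_d) - f(y, \bar{Z}_d)
```

Each constraint asks the expected score of the true label to exceed that of every other label by a margin of one, up to the slack $\xi_d$.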
EM Algorithm for MedLDA Classification
• Similar to the EM algorithm for MedLDA regression
• Update equation for
Embedding Results
• 20 Newsgroups dataset
[Figure: document embeddings, panels MedLDA and LDA]
Example Topics Discovered
Classification Results
• 20 Newsgroups data
Relative ratio =
Regression Results
• Movie Review Data
Time Efficiency
Conclusion
• MedLDA: an integration of max-margin prediction models and hierarchical Bayesian topic models by optimizing a single objective function with a set of expected margin constraints