MedLDA: Maximum Margin Supervised Topic Models for Regression and Classification

J. Zhu, A. Ahmed and E.P. Xing, Carnegie Mellon University

ICML 2009

Presented by Haojun Chen

Sources: http://www.cs.cmu.edu/~junzhu/medlda.htm

Outline

• Motivation

• Supervised topic model (sLDA) and Support vector regression (SVR)

• Maximum entropy discrimination LDA (MedLDA)
  – MedLDA for Regression
  – MedLDA for Classification

• Experimental Results

• Conclusion

Motivation

• Learning latent topic models with side information, as in sLDA, has attracted increasing attention.

• Maximum likelihood estimation is used for posterior inference and parameter estimation in sLDA.

• Max-margin methods, such as SVM, for classification have demonstrated success in many applications.

• This paper proposes a general principle for learning max-margin discriminative supervised latent topic models for both regression and classification.

Supervised Topic Model (sLDA)

• Joint distribution for sLDA

• Variational MLE for sLDA
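The two formulas on this slide are not present in the transcript. As a reference sketch, the standard sLDA forms are reproduced below. The notation is an assumption taken from the usual presentation of sLDA rather than from the slide itself: $\theta_d$ are topic proportions, $z_{dn}$ topic assignments, $w_{dn}$ words, $y_d$ the response, $\bar{z}_d = \frac{1}{N}\sum_n z_{dn}$, and $q$ the variational distribution.

```latex
% Joint distribution for sLDA (assumed notation)
\begin{align*}
p(\theta, z, y, W \mid \alpha, \beta, \eta, \delta^2)
  &= \prod_{d=1}^{D} p(\theta_d \mid \alpha)
     \Bigl( \prod_{n=1}^{N} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \Bigr)
     p(y_d \mid \eta^{\top} \bar{z}_d, \delta^2)
\end{align*}

% Variational MLE: minimize an upper bound on the negative log-likelihood
\begin{align*}
-\log p(y, W \mid \alpha, \beta, \eta, \delta^2)
  \;\le\; \mathcal{L}(q)
  &= -\mathbb{E}_q\bigl[\log p(\theta, z, y, W \mid \alpha, \beta, \eta, \delta^2)\bigr]
     - \mathcal{H}\bigl(q(z, \theta)\bigr)
\end{align*}
```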

Support Vector Regression (SVR)

• Given a training set, the linear SVR finds an optimal linear function by solving a constrained convex optimization problem.
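The optimization problem itself is missing from the transcript. A minimal sketch of the usual $\epsilon$-insensitive linear SVR primal is given below, assuming a linear predictor $h(x) = \eta^{\top} x$ without a bias term. At the optimum the two slack variables for a point sum to $\max(0, |y_d - \eta^{\top} x_d| - \epsilon)$, so the objective can be evaluated directly. The function name `svr_primal_objective` and the toy data are illustrative only.

```python
import numpy as np

def svr_primal_objective(eta, X, y, C=1.0, eps=0.1):
    """Evaluate 0.5*||eta||^2 + C * sum_d (xi_d + xi_d^*) for linear SVR.

    The slacks are eliminated using their optimal values for a fixed eta:
    xi_d + xi_d^* = max(0, |y_d - eta^T x_d| - eps).
    """
    residual = y - X @ eta                            # y_d - eta^T x_d
    slack = np.maximum(0.0, np.abs(residual) - eps)   # epsilon-insensitive loss
    return 0.5 * float(eta @ eta) + C * float(slack.sum())

# Toy data (hypothetical): the third point lies outside the epsilon tube.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.5])
eta = np.array([1.0, 2.0])
objective = svr_primal_objective(eta, X, y, C=1.0, eps=0.1)
```

Points whose residual is within $\epsilon$ contribute nothing to the loss, which is what makes the SVR solution sparse in its dual variables.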

Max-Entropy Discrimination LDA (MedLDA)

• Maximum entropy discrimination LDA (MedLDA): an integration of max-margin prediction models (e.g. SVR and SVM) and hierarchical Bayesian topic models (e.g. LDA and sLDA)

• Specifically, a distribution is learned in a max-margin manner in MedLDA.

• Both variants, MedLDA for regression and MedLDA for classification, are considered in this paper.

MedLDA for Regression

• For regression, MedLDA is defined as an integration of Bayesian sLDA and SVR

is the variational approximation for the posterior
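The objective shown on this slide is not in the transcript. The sketch below follows the regression formulation as it is usually stated for MedLDA, and should be read as a reconstruction under assumed notation: $q$ is the variational approximation to the posterior, $\mathcal{L}(q)$ is the variational upper bound on the negative log-likelihood of Bayesian sLDA, $\bar{Z}_d$ is the average topic assignment of document $d$, and $\xi_d, \xi_d^{*}$ are SVR-style slack variables.

```latex
\begin{align*}
\min_{q,\, \alpha,\, \beta,\, \delta^2,\, \xi,\, \xi^{*}} \quad
  & \mathcal{L}(q) + C \sum_{d=1}^{D} \bigl( \xi_d + \xi_d^{*} \bigr) \\
\text{s.t. } \forall d: \quad
  & y_d - \mathbb{E}\bigl[\eta^{\top} \bar{Z}_d\bigr] \le \epsilon + \xi_d, \\
  & -y_d + \mathbb{E}\bigl[\eta^{\top} \bar{Z}_d\bigr] \le \epsilon + \xi_d^{*}, \\
  & \xi_d \ge 0, \quad \xi_d^{*} \ge 0 .
\end{align*}
```

Compared with plain SVR, the margin constraints are placed on the expected prediction under $q$ rather than on a point estimate, which is how the topic model and the max-margin learner are coupled in a single problem.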

EM Algorithm for MedLDA Regression

• Variational EM Algorithm:

• The key difference between sLDA and MedLDA lies in updating

MedLDA for Classification

• Similar to the regression model, the integrated LDA and multi-class classification model is defined as follows:

where
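The classification objective is likewise missing from the transcript. A sketch of the formulation as commonly stated for MedLDA classification is given below. The notation is assumed, not taken from the slide: $\mathcal{L}(q)$ is the variational bound for unsupervised LDA, $p_0(\eta)$ is a prior over the classifier weights, and $f(y, \bar{Z}_d)$ is a feature vector that places $\bar{Z}_d$ in the block belonging to class $y$ and zeros elsewhere.

```latex
\begin{align*}
\min_{q,\, q(\eta),\, \alpha,\, \beta,\, \xi} \quad
  & \mathcal{L}(q) + \mathrm{KL}\bigl( q(\eta) \,\|\, p_0(\eta) \bigr)
    + C \sum_{d=1}^{D} \xi_d \\
\text{s.t. } \forall d,\; y \ne y_d: \quad
  & \mathbb{E}\bigl[ \eta^{\top} \Delta f_d(y) \bigr] \ge 1 - \xi_d, \qquad \xi_d \ge 0, \\
\text{where} \quad
  & \Delta f_d(y) = f(y_d, \bar{Z}_d) - f(y, \bar{Z}_d) .
\end{align*}
```

Under this formulation the predicted label for a new document would be $y^{*} = \arg\max_{y} \mathbb{E}\bigl[\eta^{\top} f(y, \bar{Z})\bigr]$.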

EM Algorithm for MedLDA Classification

• Similar to the EM algorithm for MedLDA regression

• Update equation for

Embedding Results

Figure panels: MedLDA and LDA

• 20 Newsgroup dataset

Example Topics Discovered

Classification Results

• 20 Newsgroup Data

Relative ratio =

Regression Results

• Movie Review Data

Time Efficiency

Conclusion

• MedLDA: an integration of max-margin prediction models and hierarchical Bayesian topic models by optimizing a single objective function with a set of expected margin constraints