Transcript
Page 1: Hybrids of generative and discriminative methods for          machine learning

MSRC Summer School - 30/06/2009

Cambridge – UK

Hybrids of generative and discriminative methods for machine learning

Page 2: Hybrids of generative and discriminative methods for          machine learning

Motivation

Generative models
• prior knowledge
• handle missing data such as labels

Discriminative models
• perform well at classification

However
• no straightforward way to combine them

Page 3: Hybrids of generative and discriminative methods for          machine learning

Content

Generative and discriminative methods

A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data

Page 5: Hybrids of generative and discriminative methods for          machine learning

Generative methods

Answer: “what does a cat look like? and a dog?” => data and labels joint distribution

x : data

c : label

$\theta$ : parameters

Page 6: Hybrids of generative and discriminative methods for          machine learning

Generative methods

Objective function: $G(\theta) = p(\theta)\, p(X, C \mid \theta)$

$G(\theta) = p(\theta) \prod_n p(x_n, c_n \mid \theta)$

1 reusable model per class, can deal with incomplete data

Example: GMMs
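
A minimal sketch of how the log of this generative objective could be evaluated. It assumes one spherical Gaussian per class and a flat prior over the parameters (so the log-prior defaults to 0). The names `log_joint`, `log_generative_objective` and the layout of `theta` are illustrative choices, not part of the slides.

```python
import math

def log_spherical_gaussian(x, mean, var):
    """Log density of an isotropic Gaussian N(x | mean, var * I)."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)

def log_joint(x, c, theta):
    """log p(x, c | theta) = log p(c) + log p(x | c)."""
    class_prior, mean, var = theta[c]
    return math.log(class_prior) + log_spherical_gaussian(x, mean, var)

def log_generative_objective(X, C, theta, log_prior=0.0):
    """log G(theta) = log p(theta) + sum_n log p(x_n, c_n | theta)."""
    return log_prior + sum(log_joint(x, c, theta) for x, c in zip(X, C))

# Hypothetical parameters: class -> (class prior, mean, variance)
theta = {0: (0.5, (0.0, 0.0), 1.0), 1: (0.5, (3.0, 0.0), 1.0)}
X = [(0.1, -0.2), (2.9, 0.3)]
C = [0, 1]
print(log_generative_objective(X, C, theta))
```

Because each class has its own density, a point with a missing label can still contribute through its marginal over classes, which is the "incomplete data" advantage noted above.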

Page 7: Hybrids of generative and discriminative methods for          machine learning

Example of generative model

Page 8: Hybrids of generative and discriminative methods for          machine learning

Discriminative methods

Answer: “is it a cat or a dog?” => labels posterior distribution

x : data

c : label

$\theta$ : parameters

Page 9: Hybrids of generative and discriminative methods for          machine learning

Discriminative methods

The objective function is $D(\theta) = p(\theta)\, p(C \mid X, \theta)$

$D(\theta) = p(\theta) \prod_n p(c_n \mid x_n, \theta)$

Focus on regions of ambiguity, make faster predictions

Example: neural networks, SVMs

Page 10: Hybrids of generative and discriminative methods for          machine learning

Example of discriminative model

SVMs / NNs

Page 11: Hybrids of generative and discriminative methods for          machine learning

Generative versus discriminative

No effect of the double mode on the decision boundary

Page 12: Hybrids of generative and discriminative methods for          machine learning

Content

Generative and discriminative methods

A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data

Page 13: Hybrids of generative and discriminative methods for          machine learning

Semi-supervised learning

Few labelled data / lots of unlabelled data

Discriminative methods overfit; generative models only help classification if they are “good”

Need the modelling power of generative models while still discriminating well => hybrid models

Page 14: Hybrids of generative and discriminative methods for          machine learning

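Discriminative training (Bach et al., ICASSP 05)

Discriminative objective function: $D(\theta) = p(\theta) \prod_n p(c_n \mid x_n, \theta)$

Using a generative model: $D(\theta) = p(\theta) \prod_n [\, p(x_n, c_n \mid \theta) / p(x_n \mid \theta) \,]$

$D(\theta) = p(\theta) \prod_n \frac{p(x_n, c_n \mid \theta)}{\sum_c p(x_n, c \mid \theta)}$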
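
The normalisation over classes can be sketched as follows. This is an illustrative example, not code from the talk: it takes precomputed values of log p(x_n, c | parameters) for every class c and turns them into the log discriminative objective, again assuming a flat parameter prior by default. The function names are hypothetical.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_posterior(log_joints, c):
    """log p(c | x) from the list of log p(x, c') over all classes c'."""
    return log_joints[c] - log_sum_exp(log_joints)

def log_discriminative_objective(all_log_joints, C, log_prior=0.0):
    """log D = log p(theta) + sum_n [log p(x_n, c_n) - log sum_c p(x_n, c)]."""
    return log_prior + sum(
        log_posterior(lj, c) for lj, c in zip(all_log_joints, C)
    )

# Two data points, two classes; joint probabilities chosen for illustration.
all_log_joints = [
    [math.log(0.30), math.log(0.10)],
    [math.log(0.05), math.log(0.20)],
]
C = [0, 1]
print(math.exp(log_discriminative_objective(all_log_joints, C)))
```

Only the ratios between class joints matter here, so a generative model that fits the data poorly overall can still score well on this objective.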

Page 15: Hybrids of generative and discriminative methods for          machine learning

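Convex combination (Bouchard et al., COMPSTAT 04)

Generative objective function: $G(\theta) = p(\theta) \prod_n p(x_n, c_n \mid \theta)$

Discriminative objective function: $D(\theta) = p(\theta) \prod_n p(c_n \mid x_n, \theta)$

Convex combination: $\log L(\theta) = \alpha \log D(\theta) + (1 - \alpha) \log G(\theta)$

$\alpha \in [0, 1]$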
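
As a small illustration, the blend of the two log-objectives can be written as a single function. The weight is called `alpha` in this sketch, and the function name is an assumption rather than anything from the cited work.

```python
def convex_combination(log_d, log_g, alpha):
    """Blend log D and log G with a weight alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * log_d + (1.0 - alpha) * log_g

# Illustrative values for log D and log G at some fixed parameter setting.
log_d, log_g = -0.5, -5.0
for alpha in (0.0, 0.5, 1.0):
    print(alpha, convex_combination(log_d, log_g, alpha))
```

With a weight of 1 the criterion is purely discriminative, and with 0 it is purely generative. Intermediate values trade one off against the other.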

Page 16: Hybrids of generative and discriminative methods for          machine learning

A principled hybrid model

Page 17: Hybrids of generative and discriminative methods for          machine learning

A principled hybrid model

Page 18: Hybrids of generative and discriminative methods for          machine learning

A principled hybrid model

Page 19: Hybrids of generative and discriminative methods for          machine learning

A principled hybrid model

Page 20: Hybrids of generative and discriminative methods for          machine learning

A principled hybrid model

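$\theta$ - posterior distribution of the labels

$\theta'$ - marginal distribution of the data

$\theta$ and $\theta'$ communicate through a prior

Hybrid objective function:

$L(\theta, \theta') = p(\theta, \theta') \prod_n p(c_n \mid x_n, \theta) \prod_n p(x_n \mid \theta')$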

Page 21: Hybrids of generative and discriminative methods for          machine learning

A principled hybrid model

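$\theta = \theta'$ => $p(\theta, \theta') = p(\theta)\, \delta(\theta - \theta')$

$L(\theta, \theta') = p(\theta)\, \delta(\theta - \theta') \prod_n p(c_n \mid x_n, \theta) \prod_n p(x_n \mid \theta')$

$L(\theta) = G(\theta)$: generative case

$\theta \perp \theta'$ => $p(\theta, \theta') = p(\theta)\, p(\theta')$

$L(\theta, \theta') = [\, p(\theta) \prod_n p(c_n \mid x_n, \theta) \,]\, [\, p(\theta') \prod_n p(x_n \mid \theta') \,]$

$L(\theta, \theta') = D(\theta)\, f(\theta')$: discriminative case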

Page 22: Hybrids of generative and discriminative methods for          machine learning

A principled hybrid model

Anything in between – hybrid case

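Choice of prior: $p(\theta, \theta') = p(\theta)\, \mathcal{N}(\theta' \mid \theta, \sigma(\alpha))$

$\alpha \to 0$ => $\theta = \theta'$

$\alpha \to 1$ => $\sigma \to \infty$ => $\theta \perp \theta'$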
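
A minimal sketch of the hybrid log-objective with a Gaussian coupling prior. The slides do not give the mapping from the trade-off value to the prior variance, so `sigma_of_alpha` below is an assumed choice that merely has the right limits: variance near 0 as the trade-off goes to 0, and unbounded as it goes to 1. All function names are hypothetical, and the per-point log terms are passed in precomputed.

```python
import math

def sigma_of_alpha(alpha):
    """Assumed variance mapping: (alpha / (1 - alpha))^2 for alpha in (0, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return (alpha / (1.0 - alpha)) ** 2

def log_coupling_prior(theta, theta_prime, alpha):
    """log N(theta' | theta, sigma(alpha) * I), sigma treated as a variance."""
    var = sigma_of_alpha(alpha)
    d = len(theta)
    sq_dist = sum((a - b) ** 2 for a, b in zip(theta, theta_prime))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)

def log_hybrid_objective(log_post_terms, log_marg_terms,
                         theta, theta_prime, alpha, log_prior_theta=0.0):
    """log p(theta) + log N(theta' | theta, sigma) + sum_n log p(c_n | x_n, theta)
    + sum_n log p(x_n | theta')."""
    return (log_prior_theta
            + log_coupling_prior(theta, theta_prime, alpha)
            + sum(log_post_terms)
            + sum(log_marg_terms))

# Cost of a unit gap between the two parameter sets at different trade-offs.
for alpha in (0.1, 0.5, 0.9):
    gap = log_coupling_prior([0.0], [1.0], alpha)
    no_gap = log_coupling_prior([0.0], [0.0], alpha)
    print(alpha, gap - no_gap)
```

Under this assumed mapping, a small trade-off value penalises any disagreement between the two parameter sets heavily, approaching the generative case. A value near 1 makes the penalty negligible, approaching the discriminative case.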

Page 23: Hybrids of generative and discriminative methods for          machine learning

Why principled?

Consistent with the likelihood of graphical models

=> one way to train a system

Everything can now be modelled => potential to be Bayesian

Potential to learn

Page 24: Hybrids of generative and discriminative methods for          machine learning

Learning

EM / Laplace approximation / MCMC: either intractable or too slow

Conjugate gradients: flexible, easy to check, BUT sensitive to initialisation and slow

Variational inference

Page 25: Hybrids of generative and discriminative methods for          machine learning

Content

Generative and discriminative methods

A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data

Page 26: Hybrids of generative and discriminative methods for          machine learning

Toy example

Page 27: Hybrids of generative and discriminative methods for          machine learning

Toy example

2 elongated distributions

Only spherical gaussians allowed => wrong model

2 labelled points per class => strong risk of overfitting
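
A data set of this kind could be generated along the following lines. This is a sketch only: the class centres, standard deviations, sample sizes and seed are invented for illustration and are not the values used in the talk.

```python
import random

def sample_elongated(n, centre, rng, long_std=2.0, short_std=0.3):
    """Draw n 2-D points from an axis-aligned Gaussian stretched along y."""
    return [
        (rng.gauss(centre[0], short_std), rng.gauss(centre[1], long_std))
        for _ in range(n)
    ]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
class0 = sample_elongated(200, (-1.0, 0.0), rng)
class1 = sample_elongated(200, (1.0, 0.0), rng)

# Keep the labels of 2 points per class; treat the rest as unlabelled.
labelled = [(p, 0) for p in class0[:2]] + [(p, 1) for p in class1[:2]]
unlabelled = class0[2:] + class1[2:]
print(len(labelled), len(unlabelled))
```

A spherical Gaussian cannot represent the stretched shape of either class, which is what makes the model "wrong" in the sense used above.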

Page 28: Hybrids of generative and discriminative methods for          machine learning

Toy example

Page 29: Hybrids of generative and discriminative methods for          machine learning

Decision boundaries

Page 30: Hybrids of generative and discriminative methods for          machine learning

Content

Generative and discriminative methods

A principled hybrid framework
• Study of the properties on a toy example
• Influence of the amount of labelled data

Page 31: Hybrids of generative and discriminative methods for          machine learning

A real example

Images are a special case, as they contain several features each

2 levels of supervision: at the image level, and at the feature level
• Image label only => weakly labelled
• Image label + segmentation => fully labelled

Page 32: Hybrids of generative and discriminative methods for          machine learning

The underlying generative model

gaussian

multinomial

multinomial

Page 33: Hybrids of generative and discriminative methods for          machine learning

The underlying generative model

weakly – fully labelled

Page 34: Hybrids of generative and discriminative methods for          machine learning

Experimental set-up

3 classes: bikes, cows, sheep

: 1 Gaussian per class => poor generative model

75 training images for each category

Page 35: Hybrids of generative and discriminative methods for          machine learning

HF framework

Page 36: Hybrids of generative and discriminative methods for          machine learning

HF versus CC

Page 37: Hybrids of generative and discriminative methods for          machine learning

Results

When increasing the proportion of fully labelled data, the trend is: generative → hybrid → discriminative

Weakly labelled data has little influence on the trend

With sufficient fully labelled data, HF tends to perform better than CC

Page 38: Hybrids of generative and discriminative methods for          machine learning

Experimental set-up

3 classes: lions, tigers and cheetahs

: 1 Gaussian per class => poor generative model

75 training images for each category

Page 39: Hybrids of generative and discriminative methods for          machine learning

HF framework

Page 40: Hybrids of generative and discriminative methods for          machine learning

HF versus CC

Page 41: Hybrids of generative and discriminative methods for          machine learning

Results

Hybrid models consistently perform better

However, generative and discriminative models haven’t reached saturation

No clear difference between HF and CC

Page 42: Hybrids of generative and discriminative methods for          machine learning

Conclusion

Principled hybrid framework

Possibility to learn the best trade-off

Helps for ambiguous datasets when labelled data is scarce

Problem of optimisation

Page 43: Hybrids of generative and discriminative methods for          machine learning

Future avenues

Bayesian version (posterior distribution of $\alpha$) under study

Replace by a diagonal matrix to allow flexibility => need for the Bayesian version

Choice of priors

Page 44: Hybrids of generative and discriminative methods for          machine learning

Thank you!

