
Probabilistic Modelling in Machine Learning

Peter Tiňo

School of Computer Science

University of Birmingham, UK

Probabilistic Modelling in Machine Learning – p.1/126


Probabilistic modelling and ML...

many different flavors

Start with data D.

Think - Q1: How could the data have been generated? (the source, i.e. the model)

Think - Q2: What interesting aspects of the data would you like to capture? This will also inform the model structure!


The model...

Model (parametrized): p(D|w)

Or you may want to be smart and formulate the model in a fully Bayesian framework - "parameter free"...

... but at some point there will be some parameters (of the prior), or you may rely solely on hierarchical Bayes...


Generative probabilistic model - advantages?

– principled formulation

– transparent and interpretable model structure

– principled model selection

– consistent coping with missing data

– consistent building of hierarchies

– ...


Model structure and model fitting

Probabilistic modelling involves two main steps/tasks:

1. Design the model structure by considering Q1 and Q2.

2. Fit your model to the data.

- Sometimes the two tasks are interleaved - e.g. when model fitting involves both parameters and model structure (e.g. infinite mixtures...)


Model structure and model fitting

Model fitting: ML, MAP, Type II ML, full Bayesian treatment...

We will put more emphasis on task 1.

Experience the way probabilistic modelers in ML think - examples of different data structures and different "questions on the data".


Start with the data... (simple example)

[Figure: scatter plot of a 2-D data set]

Think: How could this data have been generated?


Mixture models

Mixture of Gaussians

Given a set of data points D = {x_1, x_2, ..., x_N}, coming from what looks like a mixture of 3 Gaussians G_1, G_2, G_3, fit a 3-component Gaussian mixture model to D.

p(x) = ∑_{j=1}^{3} P(j, x) = ∑_{j=1}^{3} P(j) · p(x|j),

P(j) - mixing coefficient (prior probability) of Gaussian G_j

p(x|j) - 'probability mass' given to the data item x by Gaussian G_j.

Generative process: repeat two steps
1. generate j from P(j)
2. generate x from p(x|j)
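The two-step generative process can be sketched in a few lines of NumPy. The mixing coefficients, means and (shared spherical) covariance below are made-up illustrative values, not anything from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-component mixture: mixing coefficients P(j),
# component means, and a shared spherical covariance for G_1, G_2, G_3.
priors = np.array([0.5, 0.3, 0.2])
means = np.array([[-1.0, -1.0], [1.0, 1.0], [1.5, -1.0]])
cov = 0.05 * np.eye(2)

def sample(n):
    """Generate n points by the two-step process: j ~ P(j), then x ~ p(x|j)."""
    js = rng.choice(3, size=n, p=priors)  # step 1: pick a component
    xs = np.array([rng.multivariate_normal(means[j], cov) for j in js])  # step 2
    return js, xs

js, xs = sample(500)
print(xs.shape)  # (500, 2)
```

Running this and scatter-plotting `xs` produces exactly the kind of clustered cloud shown on the previous slide.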


Fitting the model is easy!

[Figure: the same scatter plot, with the three clusters clearly separated]

It is fairly "obvious" which point comes from which Gaussian!


Three Gaussians ...

Given that we know which data point comes from which Gaussian, things are easy:

- Collect the data points from D that are known to be generated from G_1 in D_1.

- Estimate the free parameters (mean, covariance matrix) of G_1 (e.g. ML):

μ_1 = ( 1 / |D_1| ) ∑_{x∈D_1} x,

Σ_1 = ( 1 / |D_1| ) ∑_{x∈D_1} (x − μ_1)(x − μ_1)^T.

Do the same for Gaussians G_2 and G_3.

- P(j): proportions of the sizes |D_j|.
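With known assignments, the whole fit is just the counting recipe above. A sketch on synthetic labelled data (the labels, means and noise level are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical labelled data: 'labels' plays the role of the known assignments.
X = np.vstack([rng.normal(m, 0.1, size=(100, 2))
               for m in ([-1, -1], [1, 1], [1.5, -1])])
labels = np.repeat([0, 1, 2], 100)

# ML estimates per component, exactly the counting recipe on the slide.
for j in range(3):
    Dj = X[labels == j]
    mu_j = Dj.mean(axis=0)             # mean over D_j
    Sigma_j = np.cov(Dj.T, bias=True)  # (1/|D_j|) sum (x - mu)(x - mu)^T
    P_j = len(Dj) / len(X)             # mixing proportion |D_j| / N
    print(j, mu_j.round(2), P_j)
```

Note `bias=True`, which divides by |D_j| rather than |D_j| − 1, matching the ML estimate in the formula.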


Formulation through indicator variables

We can represent the knowledge about which data point came from which Gaussian through indicator variables z_ij, i = 1, 2, ..., N (data points), j = 1, 2, 3 (mixture components - Gaussians):

z_ij = 1, iff x_i was generated by G_j;

z_ij = 0, otherwise.

Then,

μ_j = ( 1 / ∑_{i=1}^{N} z_ij ) ∑_{i=1}^{N} z_ij · x_i,

Σ_j = ( 1 / ∑_{i=1}^{N} z_ij ) ∑_{i=1}^{N} z_ij · (x_i − μ_j)(x_i − μ_j)^T.


Now that you are an expert, model this...

[Figure: a new 2-D data set in which the clusters are not clearly separated]

oooops!


The ground truth

[Figure: the same data set, coloured by the true generating components]

But how can I know that? - You cannot!


Don’t know values of the indicator variables...

The assignments of points to Gaussians z_ij are hidden/latent/non-observable variables!

Collect all the indicator variables in a compound (vector) latent variable Z = (z_ij) ∈ {0, 1}^{N·3}.

Each setting of Z represents one particular situation of assigning points x_i to Gaussians G_1, G_2 and G_3.

This can be anything from trivial settings, such as all points coming from G_1, to very mixed situations.


Probabilistic way of dealing with uncertain Z

Assume we already have some guess about where the Gaussians should be (μ_j) and what their shape is (Σ_j).

Since we don't know Z, we simply need to consider all possibilities (cases) for the assignments Z.

Obviously, looking at the data, not all assignments Z will be equally likely.

This is expressed through the posterior P(Z | D, G_1, G_2, G_3), which evaluates how likely, given the data and the current positions/shapes of Gaussians G_1, G_2, G_3, a particular assignment scheme Z is.


Guessing Z, given the current model and data

z_ij ∈ {0, 1} is unobserved, so calculate its mean value instead -

"how much z_ij wants to be set to 1"

E_{P(Z|D,G_1,G_2,G_3)}[z_ij] = 0 · P(z_ij = 0 | D, G_1, G_2, G_3)
                             + 1 · P(z_ij = 1 | D, G_1, G_2, G_3)
                             = P(z_ij = 1 | D, G_1, G_2, G_3)
                             = R_ij

R_ij is the 'responsibility' of Gaussian j for the data point x_i.

Substitute z_ij in the crisp-case calculations with the 'softer' probabilistic R_ij.


Dealing with uncertainty in Z

Instead of

μ_j = ( 1 / ∑_{i=1}^{N} z_ij ) ∑_{i=1}^{N} z_ij · x_i

we have

μ_j = ( 1 / ∑_{i=1}^{N} R_ij ) ∑_{i=1}^{N} R_ij · x_i.

Instead of

Σ_j = ( 1 / ∑_{i=1}^{N} z_ij ) ∑_{i=1}^{N} z_ij · (x_i − μ_j)(x_i − μ_j)^T

we have

Σ_j = ( 1 / ∑_{i=1}^{N} R_ij ) ∑_{i=1}^{N} R_ij · (x_i − μ_j)(x_i − μ_j)^T.
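The responsibility-weighted updates are a direct vectorization of those sums. A minimal sketch, assuming a responsibility matrix R of shape N × 3 (here filled with random values just so the code runs):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical inputs: data X (N x 2) and responsibilities R (N x 3, rows sum to 1).
X = rng.normal(size=(6, 2))
R = rng.dirichlet(np.ones(3), size=6)

def weighted_updates(X, R):
    """mu_j and Sigma_j with z_ij replaced by the soft R_ij."""
    Nk = R.sum(axis=0)                    # effective counts: sum_i R_ij
    mus = (R.T @ X) / Nk[:, None]         # mu_j = sum_i R_ij x_i / sum_i R_ij
    Sigmas = []
    for j in range(R.shape[1]):
        d = X - mus[j]                    # centred data for component j
        outer = np.einsum('ni,nj->nij', d, d)          # per-point outer products
        Sigmas.append((R[:, j, None, None] * outer).sum(axis=0) / Nk[j])
    return mus, np.array(Sigmas)

mus, Sigmas = weighted_updates(X, R)
```

Setting R to hard 0/1 indicators recovers the crisp estimates of the earlier slide, which is the whole point of the substitution.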


What next?

Having refined our model (positions and shapes of the Gaussians), refine our ideas about the possible assignments Z of points x_i to Gaussians G_1, G_2, G_3. Of course, we still cannot be certain about the exact values of Z - so still a probabilistic formulation!

R_ij = P(G_j | x_i) = P(x_i | G_j) · P(j) / ∑_{q=1}^{3} P(x_i | G_q) · P(q)

Repeat the parameter estimation and assignment refinement steps until 'convergence'.
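The Bayes-rule computation of R_ij for a single point can be sketched directly. The current parameter guesses below (means, a shared isotropic variance, mixing coefficients) are invented placeholders:

```python
import numpy as np

# Hypothetical current parameters for the 3 Gaussians and the priors P(j).
mus = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, 1.5]])
var = 0.2                                  # assumed shared isotropic variance
P = np.array([0.4, 0.4, 0.2])
x = np.array([0.1, 0.2])                   # a single data point x_i

# p(x_i | G_j) for an isotropic 2-D Gaussian.
lik = np.exp(-((x - mus) ** 2).sum(axis=1) / (2 * var)) / (2 * np.pi * var)

# R_ij = P(x_i|G_j) P(j) / sum_q P(x_i|G_q) P(q): sums to 1 over components.
R = lik * P / (lik * P).sum()
```

The normalizing denominator guarantees that each point distributes exactly one unit of responsibility across the components.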


Apply to our data - iteration 1

[Figure: plot of the data and covariance ellipses after iteration 1]


Apply to our data - iteration 2

[Figure: plot of the data and covariance ellipses after iteration 2]


Apply to our data - iteration 30

[Figure: plot of the data and covariance ellipses after iteration 30]


Another latent variable model

Hidden Markov Model

Stationary emissions conditional on hidden (unobservable) states.

Hidden states represent basic operating "regimes" of the process.

[Figure: schematic of three bags (Bag 1, Bag 2, Bag 3) - the hidden states - with transitions between them]


Temporal structure - Hidden Markov Model

We have M bags of balls of different colors (Red - R, Green - G).

We are standing behind a curtain and at each point in time we select a bag j, draw (with replacement) a ball from it and show the ball to an observer. The color of the ball shown at time t is C_t ∈ {R, G}. We do this for T time steps.

The observer can only see the balls; they have no access to the information about how we select the bags.

Assume: we select the bag at time t based only on our selection at the previous time step t−1 (1st-order Markov assumption).


If only we knew ...

If we knew which bags were used at which time steps, things would be very easy! ... just counting

Hidden variables z_jt:

z_jt = 1, iff bag j was used at time t;

z_jt = 0, otherwise.

P(bag_j → bag_k) = ∑_{t=1}^{T−1} z_jt · z_{k,t+1} / ( ∑_{q=1}^{M} ∑_{t=1}^{T−1} z_jt · z_{q,t+1} )   [state transitions]

P(color = c | bag_j) = ∑_{t=1}^{T} z_jt · δ(c = C_t) / ( ∑_{g∈{R,G}} ∑_{t=1}^{T} z_jt · δ(g = C_t) )   [emissions]
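With the state sequence observed, both formulas really are just counting and normalizing. A sketch on a made-up fully observed run (the state and color sequences are invented for illustration):

```python
import numpy as np

# Hypothetical fully observed run: states[t] = bag used at time t,
# colors[t] = color of the ball shown at time t.
states = np.array([0, 0, 1, 1, 2, 0, 1, 2, 2, 0])
colors = np.array(list("RGRRGGRGRR"))
M = 3

# Transition counts sum_t z_jt * z_{k,t+1}, then row-normalize.
A = np.zeros((M, M))
for j, k in zip(states[:-1], states[1:]):
    A[j, k] += 1
A /= A.sum(axis=1, keepdims=True)

# Emission counts sum_t z_jt * delta(c = C_t), normalized per bag (columns: R, G).
E = np.zeros((M, 2))
for j, c in zip(states, colors):
    E[j, 0 if c == "R" else 1] += 1
E /= E.sum(axis=1, keepdims=True)
```

Each row of A is the transition distribution out of one bag, and each row of E is that bag's color distribution.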


But we don’t ...

We need to estimate probabilities for hidden events such as:

z_jt · z_{k,t+1} = 1
(at time t - bag j, at the next time step - bag k)

z_jt · δ(c = C_t) = 1
(at time t - bag j, ball of color c)

Again, the probability estimates need to be based on the observed data D and our current model of state transition and emission probabilities.


Estimating values of the hidden variables

P(z_jt · z_{k,t+1} = 1 | D, current model) = R^{j→k}_t

P(z_jt · δ(c = C_t) = 1 | D, current model) = R^{j,c}_t

I will not deal with the crucial question of how to compute those posteriors over hidden variables, given the observed data and current model parameters.

This can be done efficiently - the Forward-Backward algorithm.


Re-estimate the model

[state transitions] The crisp counting estimate

P(bag_j → bag_k) = ∑_{t=1}^{T−1} z_jt · z_{k,t+1} / ( ∑_{q=1}^{M} ∑_{t=1}^{T−1} z_jt · z_{q,t+1} )

becomes

P(bag_j → bag_k) = ∑_{t=1}^{T−1} R^{j→k}_t / ( ∑_{q=1}^{M} ∑_{t=1}^{T−1} R^{j→q}_t )

[emissions] Likewise,

P(color = c | bag_j) = ∑_{t=1}^{T} z_jt · δ(c = C_t) / ( ∑_{g∈{R,G}} ∑_{t=1}^{T} z_jt · δ(g = C_t) )

becomes

P(color = c | bag_j) = ∑_{t=1}^{T} R^{j,c}_t / ( ∑_{g∈{R,G}} ∑_{t=1}^{T} R^{j,g}_t )


Learning - can we do better than hand-waving?

Observed data: D

Parameterized model of data items x: p(x|w)

Log-likelihood of w: log p(D|w)

Train via Maximum Likelihood:

w_ML = argmax_w log p(D|w).


Complete data

Observed data: D

Unobserved data: Z (realization of the compound hidden variable Z)

Complete data: (D, Z)

By marginalization ("integrate out the uncertainty in Z"):

p(D|w) = ∑_Z p(D, Z|w)


Concave function

[Figure: a concave function F, with the chord aF(u') + (1−a)F(u'') lying below the graph value F(au' + (1−a)u''), 0 ≤ a ≤ 1]

For any concave function F: ℝ → ℝ and any u', u'' ∈ ℝ, a ∈ [0, 1]:

F(a·u' + (1−a)·u'') ≥ a·F(u') + (1−a)·F(u'').

More generally,

F( ∑_i a_i·u_i ) ≥ ∑_i a_i·F(u_i),   a_i ≥ 0, ∑_i a_i = 1.
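A quick numeric sanity check of this inequality for the concave function F = log, with arbitrary made-up weights and points:

```python
import math

# Jensen's inequality for the concave F = log:
# log(sum a_i u_i) >= sum a_i log(u_i).
a = [0.2, 0.3, 0.5]   # non-negative weights summing to 1
u = [1.0, 4.0, 9.0]   # arbitrary positive points

lhs = math.log(sum(ai * ui for ai, ui in zip(a, u)))  # F(sum a_i u_i)
rhs = sum(ai * math.log(ui) for ai, ui in zip(a, u))  # sum a_i F(u_i)
print(lhs >= rhs)  # True
```

This is exactly the step used on the next slide to pull the sum over Z outside the logarithm.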


A lower bound on log-likelihood

Pick 'any' distribution Q for the hidden variable Z:

∑_Z Q(Z) = 1, Q(Z) > 0.

log(·) is a concave function, so

log p(D|w) = log ( ∑_Z p(D,Z|w) )
           = log ( ∑_Z Q(Z) · p(D,Z|w) / Q(Z) )
           ≥ ∑_Z Q(Z) log ( p(D,Z|w) / Q(Z) )
           = F(Q,w)


Max F(Q,w) - lower bound on log-likelihood

Do ‘coordinate-wise’ ascent onF(Q,w).

[Figure: alternating maximization of F(Q,w) along the Q and w directions]


Maximize F(Q,w) w.r.t. Q (w fixed)

F(Q,w) = ∑_Z Q(Z) log ( p(D,Z|w) / Q(Z) )
       = ∑_Z Q(Z) log ( p(Z|D,w) · p(D|w) / Q(Z) )
       = ∑_Z Q(Z) log ( p(Z|D,w) / Q(Z) ) + ∑_Z Q(Z) · log p(D|w)
       = −∑_Z Q(Z) log ( Q(Z) / p(Z|D,w) ) + log p(D|w) · ∑_Z Q(Z)
       = −D_KL[Q(Z) || P(Z|D,w)] + log p(D|w).


Maximize F(Q,w) w.r.t. Q (w fixed)

Since log p(D|w) is constant w.r.t. Q, we only need to maximize −D_KL[Q(Z) || P(Z|D,w)], which is equivalent to minimizing

D_KL[Q(Z) || P(Z|D,w)] ≥ 0.

This is achieved when D_KL[Q(Z) || P(Z|D,w)] = 0, i.e. when

Q*(Z) = P(Z|D,w)

Note: F(Q,w) = log p(D|w) - no longer a loose lower bound!
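A tiny discrete example makes the tightness claim concrete. The joint values p(D, Z|w) below are made up; the point is only that the bound is exact at the posterior and strictly looser elsewhere:

```python
import math

# Hypothetical joint p(D, Z=z | w) over three settings of Z.
joint = [0.10, 0.25, 0.15]
pD = sum(joint)                 # p(D|w) by marginalizing out Z
post = [j / pD for j in joint]  # the posterior P(Z|D, w)

def F(Q):
    """F(Q, w) = sum_Z Q(Z) log( p(D,Z|w) / Q(Z) )."""
    return sum(q * math.log(j / q) for q, j in zip(Q, joint))

print(F(post), math.log(pD))    # equal: the bound is tight at Q* = posterior
print(F([1/3, 1/3, 1/3]))       # strictly smaller for this other Q
```

At Q = posterior each term inside the log equals p(D|w), so the equality drops out algebraically, matching the KL decomposition above.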


E-step

The optimal way of guessing the values of the hidden variables Z is to set the distribution of Z to the posterior over Z, given the observed data D and the current parameter settings w.


Maximize F(Q,w) w.r.t. w (Q is fixed to Q*)

F(Q*,w) = ∑_Z Q*(Z) log ( p(D,Z|w) / Q*(Z) )
        = ∑_Z Q*(Z) log p(D,Z|w) − ∑_Z Q*(Z) log Q*(Z)
        = E_{Q*(Z)}[log p(D,Z|w)] + H(Q*).

Since the entropy of Q*, H(Q*), is constant (Q* is fixed), we only need to maximize E_{Q*(Z)}[log p(D,Z|w)]:

w* = argmax_w E_{Q*(Z)}[log p(D,Z|w)].


M-step

The optimal way of estimating the parameters w is to select the parameter values w* that maximize the expected value of the complete data log-likelihood p(D,Z|w), where the expectation is taken w.r.t. the posterior distribution over the hidden data Z, P(Z|D,w) (our best guess).

Find a single parameter vector w* for all hidden variable settings Z (since we don't know the true values of Z), but while doing this, weight the importance of each particular setting Z by the posterior probability P(Z|D,w).


E-M Algorithm

Given the current parameter setting w_old do:

E-step:

Estimate P(Z | D, w_old), the posterior distribution over Z, given the observed data D and the current parameter settings w_old.

M-step: Obtain new parameter values w_new by maximizing

E_{P(Z|D,w_old)}[log p(D, Z|w)].

Set w_old := w_new and go to the E-step.
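Putting the two steps together for the earlier mixture example gives a complete EM loop. A minimal 1-D, two-component sketch with invented data (means ±2, std 0.5) and invented starting values:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 1-D data from two Gaussians; EM should recover means near -2 and 2.
D = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])

P = np.array([0.5, 0.5])      # mixing coefficients P(j)
mu = np.array([-1.0, 1.0])    # assumed initial means
var = np.array([1.0, 1.0])    # assumed initial variances

def gauss(x, m, v):
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

for _ in range(50):
    # E-step: responsibilities, i.e. the posterior P(Z | D, w_old).
    lik = np.stack([P[j] * gauss(D, mu[j], var[j]) for j in range(2)])
    R = lik / lik.sum(axis=0)
    # M-step: responsibility-weighted updates maximize E[log p(D, Z|w)].
    Nk = R.sum(axis=1)
    mu = (R * D).sum(axis=1) / Nk
    var = (R * (D - mu[:, None]) ** 2).sum(axis=1) / Nk
    P = Nk / len(D)
```

Each iteration provably does not decrease log p(D|w), which is exactly the lower-bound argument of the preceding slides at work.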


Manifold learning (for ants)

Imagine you are an ant and you are faced with this 2-dim data set:

but you can comprehend only 1-dim patterns

We should be able to understand the data - it corresponds to a 'noisy' 1-dim manifold.


Build a latent space model of the data distribution

I think I know how the data might have been generated:

[Figure: the latent interval [−1, +1] is compressed, stretched and bent by a non-linear map, then some noise is added - everything can be "explained" using a line segment (latent space = computer screen)]


Project the data

[Figure: the data points are projected onto the fitted curve, which is then stretched back to a straight line (computer screen)]


Constrained mixture models

A chain of noise models (Gaussians) along a 1-dim "mean data manifold"

A constrained mixture of noise models - spherical Gaussians.

Still p(t) = ∑_{j=1}^{M} P(j) · p(t|j), but now the Gaussians are forced to have their means organized along a smooth 1-dim manifold.


Smooth embedding of continuous latent space

[Figure: the continuous low-dim latent space [−1, +1] is smoothly, non-linearly embedded in the high-dim data space as a constrained mixture]


Smooth embedding of continuous latent space

[Figure: RBF network - latent centres x_1, x_2, ..., x_M pass through B Gaussian basis functions φ_1(x), ..., φ_B(x) of common width σ; the weighted outputs give the images f(x_1), ..., f(x_M) in the data space, f(x) = W·φ(x)]


Data (roughly) along a 2-D manifold

Generalize the notion of a 'bicycle chain' of codebook vectors: take advantage of the two-dimensional structure of the computer screen. Cover it with a 2-dimensional grid of nodes.

[Figure: a 2-D grid of nodes on the computer screen mapped into the data space, with noise parameter β]


Generative Topographic Mapping

[Figure: GTM schematic - latent space centres x_i mapped by an RBF net onto the projection manifold in the data space containing the points t_n]


GTM - model formulation

Models the probability distribution in the (observable) high-dim data space ℜ^D by means of low-dim latent variables. Data is visualized in the latent space H ⊂ ℜ^L (e.g. [−1, 1]^2).

The latent space H is covered with an array of C latent space centers x_c ∈ H, c = 1, 2, ..., C.

The non-linear GTM map f: H → ℜ^D is defined using a kernel regression - B fixed basis functions φ_j: H → ℜ (Gaussians of the same width σ), collected in φ, and a D × B matrix of weights W:

f(x) = W_{D×B} · φ(x)
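The map f(x) = W·φ(x) is just an RBF regression applied to each latent centre. A sketch with assumed toy sizes (L = 1 latent dimension, C = 20 centres, B = 5 basis functions, D = 2 data dimensions, random weights):

```python
import numpy as np

# Assumed toy sizes: C latent centres, B Gaussian basis functions of width
# sigma, D data dimensions; W is filled randomly just to make the map concrete.
C, B, D, sigma = 20, 5, 2, 0.3
xs = np.linspace(-1, 1, C)        # latent centres x_c in H = [-1, 1]
centres = np.linspace(-1, 1, B)   # basis-function centres in H

def phi(x):
    """The B Gaussian basis functions of common width sigma, evaluated at x."""
    return np.exp(-(x - centres) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(4)
W = rng.normal(size=(D, B))       # the D x B weight matrix

F = np.stack([W @ phi(x) for x in xs])  # images f(x_c): the mixture centres
print(F.shape)  # (20, 2)
```

Because φ is smooth in x, the images f(x_c) automatically trace out a smooth curve in the data space - the constraint that distinguishes GTM from a free mixture.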


GTM - model formulation - Cont’d

GTM creates a generative probabilistic model in the data space by placing radially-symmetric Gaussians P(t | x_c, W, β) (noise of zero mean, inverse variance β) around f(x_c), c = 1, 2, ..., C.

Defining a uniform prior over the x_c, the GTM density model is

P(t | W, β) = (1/C) ∑_{c=1}^{C} P(t | x_c, W, β)

The data is modelled as a constrained mixture of Gaussians. GTM can be trained using an EM algorithm.

It is a mixture of Gaussians into which we sneaked a non-linear model (a low-dim manifold) on which the Gaussian centers must lie.


GTM - Data Visualization

Posterior probability that the c-th Gaussian generated t_n:

r_{c,n} = P(t_n | x_c, W, β) / ∑_{j=1}^{C} P(t_n | x_j, W, β).

The latent space representation of the point t_n, i.e. the projection of t_n, is taken to be the mean

∑_{c=1}^{C} r_{c,n} · x_c

of the posterior distribution on H.
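The posterior-mean projection for one data point can be sketched directly. The mixture centres f(x_c) and the inverse variance β below are assumed toy values, not a trained model:

```python
import numpy as np

# Assumed toy GTM: C latent centres on [-1, 1], images f(x_c) lying on a
# parabola in 2-D data space, isotropic noise with inverse variance beta.
C, beta = 10, 4.0
xs = np.linspace(-1, 1, C)           # latent centres x_c
F = np.column_stack([xs, xs ** 2])   # assumed images f(x_c)

t = np.array([0.5, 0.3])             # a data point t_n

# log P(t | x_c, W, beta) up to a constant, for an isotropic Gaussian.
logp = -0.5 * beta * ((F - t) ** 2).sum(axis=1)
r = np.exp(logp - logp.max())
r /= r.sum()                         # responsibilities r_{c,n}

projection = (r * xs).sum()          # posterior mean over the latent space H
```

Subtracting `logp.max()` before exponentiating is the usual log-sum-exp trick; the shared normalizing constants of the Gaussians cancel in the ratio.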


Differential geometry on projection manifold

Unlike many other manifold learning methods, GTM provides explicitparametrized model for the data manifold!

Latent space - a global co-ordinate chart.

Magnification Factors: We can measure stretch in the sheet. This can be used to detect the gaps between data clusters.

Directional Curvatures: We can also measure the directional curvature of the 2-D sheet embedded in the high-dim data space. Visualize the magnitude and direction of the local largest curvatures to see where and how the manifold is most folded.


Magnification Factors (detect clusters)

[Figure: magnification factors - regions of the latent space [−1, +1] contract or expand when mapped into the data space, revealing gaps between data clusters in the projections]


Directional Curvatures (detect foldings)

[Figure: directional curvature - a line through x_0 in the latent space (co-ordinates x_1, x_2) maps to a curve μ(b) on the projection manifold f(H) in the data space; its curvature is measured against the tangent plane T_{x_0}]


Toy Example

[Figure: toy example - data manifold and fitted projection manifold (top); magnification factors and directional curvatures (bottom)]


Directional Curvature

The tangent vector μ'(0) to μ at μ(0) lies in T_{x_0} (dashed rectangle), the tangent plane of the manifold f(H) at μ(0).

Let Γ^{(1)}_r be a (column) vector of partial derivatives of the function

f = (f_1, f_2, ..., f_D)^T,

with respect to the r-th latent space variable at x_0 ∈ H.

Let Γ^{(1)} be the D × L matrix

Γ^{(1)} = [Γ^{(1)}_1, Γ^{(1)}_2, ..., Γ^{(1)}_L].

The range of the matrix Γ^{(1)} is the tangent plane T_{x_0} of the projection manifold Ω at f(x_0) = μ(0).


Directional Curvature

Orthogonal projection onto T_{x_0} is a linear operator described by the projection matrix

Π = Γ^{(1)} (Γ^{(1)})^+.

Decompose the second directional derivative μ''(0) of f into two orthogonal components, one lying in the tangent space T_{x_0}, the other lying in its orthogonal complement T^⊥_{x_0}:

μ''(0) = μ''_∥(0) + μ''_⊥(0),   μ''_∥(0) ∈ T_{x_0},   μ''_⊥(0) ∈ T^⊥_{x_0}.

μ''_∥(0) - changes in the first-order derivatives due to "varying speed of parameterization"

μ''_⊥(0) - changes in the first-order derivatives that are responsible for the curving of the projection manifold Ω
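The decomposition is a one-line pseudoinverse computation. A sketch with an assumed Jacobian Γ^{(1)} (D = 3, L = 2) and an assumed second derivative vector, both invented for illustration:

```python
import numpy as np

# Assumed D x L matrix of first derivatives of f at x_0, and an assumed
# second directional derivative mu2 of the curve mu at 0.
Gamma1 = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.5, 0.2]])
mu2 = np.array([0.3, -0.1, 0.7])

Pi = Gamma1 @ np.linalg.pinv(Gamma1)  # projection matrix onto T_{x_0}
mu2_par = Pi @ mu2                    # component in the tangent plane
mu2_perp = mu2 - mu2_par              # component responsible for curvature

print(np.dot(mu2_perp, Gamma1[:, 0]))  # ~0: orthogonal to the tangent plane
```

The norm of `mu2_perp` is what the directional-curvature plots visualize: how strongly the sheet bends away from its own tangent plane in a given direction.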


Missing Data

A data point t_n is divided into an observed component t^o_n and a missing component t^m_n.

Having a generative probabilistic model of the data can help us to deal with missing values in a principled manner - treat them as latent variables!

[Figure: a data point split into its observed (t^o_n) and missing (t^m_n) coordinates]


Toy Example – Missing Data

[Figure: two snapshots of the mixture after 15 iterations of training]

o - complete data points
+ - centers of the Gaussian mixture components
* - filled-in missing values
discs - 2 standard deviations of the noise model.


Hierarchy of GTMs

[Figure: top-level (Level 1) visualization with "This looks interesting!" regions drilled down into Level 2 and Level 3 child plots]


3 types of hidden variables!

1. We do not know the exact assignments of points t_n to GTMs N at level ℓ, but we do have model responsibilities P(N | t_n) from the previous step. P(N | t_n) is the posterior probability that GTM N generated t_n.

2. Assuming that GTM N at level ℓ generated t_n, we do not know the exact assignments of its children M at level ℓ+1 to t_n. We can calculate parent-conditional responsibilities P(M | N, t_n).

3. Assuming that GTM M at level ℓ+1 generated t_n, we do not know the exact assignments of its latent space centers x^M_i to t_n. We can evaluate R^M_{i,n}.


Toy Example – HGTM

[Figure: toy example - 3-D training set (left); the data together with the child projection manifolds (right)]


Toy Example – HGTM plots

[Figure: (a) projections of Classes 1–4 onto the four child GTMs; (b) child-modulated parent plot.]


Oil data

The data set arises from a physics-based simulation of a non-invasive monitoring system, used to determine the quantity of oil in a multi-phase pipeline containing a mixture of oil, water and gas.

1000 points in 12-dimensional space.

Points in the data set are classified into 3 different multi-phase flow configurations – homogeneous, annular and laminar.


Hierarchical GTM - Oil data

[Figure: hierarchical visualization plots 1–3 with their child plots; classes: Homogeneous, Annular, Laminar.]


Understanding the plot

[Figure: class legend – Homogeneous, Annular, Laminar.]


Magnification factors

[Figure: magnification factor plots across the hierarchy; colour scale from −2 to 4.]


Directional Curvatures

[Figure: directional curvature plots across the hierarchy; colour scale from 5 to 20.]


Let’s get crazy - GTM in the model space!

Other data types =⇒ other noise models

[Figure: GTM schematic – latent space centres x_i mapped through an RBF net onto a projection manifold in the data space of points t_n.]


Other data types =⇒ other noise models

The probabilistic framework is convenient – we can deal with arbitrary data types as long as an appropriate noise model can be formulated.

The principle will be demonstrated on sequential data, but extensions to other data types (graphs) are possible.

For sequential data we need noise models that take into account temporal correlations within sequences, e.g. Markov chains, HMMs, etc.


Latent Trait HMM (LTHMM)

GTM with HMM as the noise model!

For each HMM (latent centre) we need to parameterize several multinomials:

initial state probabilities

transition probabilities

emission probabilities (discrete observations)

Multinomials are parameterized through natural parameters.


LTHMM

Alphabet of S symbols, S = {1, 2, ..., S}.

Consider a set of symbolic sequences s^(n) = (s^(n)_t)_{t=1:T_n}, n = 1, 2, ..., N.

With each latent point x ∈ H, we associate a generative distribution (HMM with K hidden states) over sequences, p(s|x):

p(s|x) = \sum_{h \in \{1,...,K\}^{T_n}} p(h_1|x) \prod_{t=2}^{T_n} p(h_t|h_{t-1}, x) \prod_{t=1}^{T_n} p(s_t|h_t, x)

Assuming independently generated sequences, the likelihood is

L = \prod_{n=1}^{N} p(s^{(n)}) = \prod_{n=1}^{N} \frac{1}{C} \sum_{c=1}^{C} p(s^{(n)} | x_c).
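The inner probability p(s|x) is computed with the standard forward recursion rather than the exponential sum over all hidden state paths. A sketch under the assumed array layout T[k, l] = p(h_t = k | h_{t-1} = l, x) (the layout is a convention chosen here, not stated on the slides):

```python
import numpy as np

def hmm_likelihood(pi, T, B, seq):
    """p(s | x) for one local HMM via the forward recursion.

    pi[k]   : initial state probabilities p(h_1 = k | x)
    T[k, l] : transition probabilities p(h_t = k | h_{t-1} = l, x)
    B[s, k] : emission probabilities p(s_t = s | h_t = k, x)
    seq     : observed symbol indices s_1, ..., s_T
    """
    alpha = pi * B[seq[0]]              # alpha_1(k) = p(h_1 = k) p(s_1 | h_1 = k)
    for s in seq[1:]:
        alpha = B[s] * (T @ alpha)      # alpha_t(k) = p(s_t | k) sum_l T[k, l] alpha_{t-1}(l)
    return alpha.sum()

def mixture_likelihood(hmms, sequences):
    """L = prod_n (1/C) sum_c p(s^(n) | x_c) for the uniform mixture of C local HMMs."""
    C = len(hmms)
    L = 1.0
    for seq in sequences:
        L *= sum(hmm_likelihood(pi, T, B, seq) for (pi, T, B) in hmms) / C
    return L

# sanity check: a single-state HMM emitting 0/1 uniformly gives p(seq) = 0.5 ** len(seq)
pi1 = np.array([1.0])
T1 = np.array([[1.0]])
B1 = np.array([[0.5], [0.5]])
p = hmm_likelihood(pi1, T1, B1, [0, 1, 0])
```

In practice the recursion is run in log space (or with per-step scaling) to avoid underflow on long sequences.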


Smooth manifold in the k-state HMM space

LTHMM parameters are obtained through a parameterized smooth non-linear mapping from the latent space (global coordinate chart) into the HMM natural parameter space (another global coordinate chart).

g(.) is the softmax function (the canonical inverse link function of multinomial distributions):

g_k((a_1, a_2, ..., a_ℓ)^T) = \frac{\exp a_k}{\sum_{i=1}^{ℓ} \exp a_i}, k = 1, 2, ..., ℓ.

Free parameters: A^{(π)} ∈ R^{K×B}, A^{(T_l)} ∈ R^{K×B} and A^{(B_k)} ∈ R^{S×B}.


From latent points to (local) noise models

π(x) = (p(h_1 = k | x))_{k=1:K} = (g_k(A^{(π)} φ(x)))_{k=1:K}

T(x) = (p(h_t = k | h_{t-1} = l, x))_{k,l=1:K} = (g_k(A^{(T_l)} φ(x)))_{k,l=1:K}

B(x) = (p(s_t = s | h_t = k, x))_{s=1:S, k=1:K} = (g_s(A^{(B_k)} φ(x)))_{s=1:S, k=1:K}
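A sketch of this latent-point-to-HMM mapping, with a hypothetical Gaussian RBF basis φ and randomly initialised parameter matrices (the shapes and values are illustrative only):

```python
import numpy as np

def softmax(a):
    a = a - a.max()                     # numerical stability
    e = np.exp(a)
    return e / e.sum()

def local_hmm(x, phi, A_pi, A_T, A_B):
    """Map a latent point x to the multinomials of its local HMM.

    A_pi    : K x B matrix  -> initial state probabilities pi(x)
    A_T[l]  : K x B matrix  -> transition column p(h_t = . | h_{t-1} = l, x)
    A_B[k]  : S x B matrix  -> emission column p(s_t = . | h_t = k, x)
    """
    f = phi(x)
    pi = softmax(A_pi @ f)
    T = np.stack([softmax(A @ f) for A in A_T], axis=1)   # T[k, l]
    B = np.stack([softmax(A @ f) for A in A_B], axis=1)   # B[s, k]
    return pi, T, B

rng = np.random.default_rng(0)
K, S, Bdim = 2, 3, 4                    # hidden states, symbols, basis functions
centres = rng.uniform(-1, 1, size=(Bdim, 2))
phi = lambda x: np.exp(-np.sum((x - centres) ** 2, axis=1))  # assumed RBF basis
A_pi = rng.normal(size=(K, Bdim))
A_T = rng.normal(size=(K, K, Bdim))
A_B = rng.normal(size=(K, S, Bdim))
pi, T, B = local_hmm(np.array([0.2, -0.4]), phi, A_pi, A_T, A_B)
```

Because the softmax is applied column-wise, every latent point x yields valid multinomials (columns summing to one), and moving x smoothly in the latent space deforms the HMM smoothly.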


Riemannian manifold of HMM

[Figure: latent space V with coordinates (x1, x2); nearby latent points x and x + dx map to nearby noise models p(·|x) and p(·|x+dx) on the manifold M, embedded in H.]

2-dim manifold M of local noise models (HMMs) p(·|x), parameterized by the latent space through a smooth non-linear mapping. M is embedded in the manifold H of all noise models of the same form.


Riemannian metric

Latent coordinates x are displaced to x + dx. How different are the corresponding noise models (HMMs)? We need to answer this in a parameterization-free manner...

The local Kullback-Leibler divergence can be estimated by

D[p(s|x) ‖ p(s|x + dx)] ≈ dx^T J(x) dx,

where J(x) is the Fisher Information Matrix

J_{i,j}(x) = −E_{p(s|x)} [ ∂² log p(s|x) / (∂x_i ∂x_j) ],

which acts like a metric tensor on the Riemannian manifold M.
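A one-dimensional sanity check of the Fisher-information formula on a model where J(x) is known in closed form: a Bernoulli observation with p(s = 1 | x) = σ(x), for which J(x) = σ(x)(1 − σ(x)). This toy model is an illustration of the expectation of the negative log-likelihood curvature, not the HMM case of the slides:

```python
import numpy as np

def fisher_info_bernoulli(x, n_samples=100_000, seed=0):
    """Fisher information of a Bernoulli model p(s=1|x) = sigma(x), estimated
    as the Monte-Carlo average of -d^2 log p(s|x) / dx^2 over sampled s."""
    rng = np.random.default_rng(seed)
    p = 1.0 / (1.0 + np.exp(-x))
    s = rng.random(n_samples) < p       # samples from p(s|x)
    h = 1e-4

    def loglik(xx):
        pp = 1.0 / (1.0 + np.exp(-xx))
        return np.where(s, np.log(pp), np.log(1.0 - pp))

    # central finite difference of the second derivative, averaged over samples
    d2 = (loglik(x + h) - 2.0 * loglik(x) + loglik(x - h)) / h ** 2
    return -d2.mean()

# analytic value for this model: J(x) = sigma(x) (1 - sigma(x))
J_hat = fisher_info_bernoulli(0.3)
p = 1.0 / (1.0 + np.exp(-0.3))
```

For richer models (such as the HMMs here) the same recipe applies, but the curvature varies with s, so the Monte-Carlo average over sampled sequences genuinely matters.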


LTHMM - Fisher Information Matrix

The HMM is itself a latent variable model, so J(x) cannot be determined analytically.

There are several approximation schemes and an efficient algorithm for calculating the observed Fisher Information Matrix.


Induced metric in data space

Structured data types – be careful with the notion of a metric in the data space.

The LTHMM naturally induces a metric in the structured data space.

Two data items (sequences) are considered to be close (or similar) if both of them are well-explained by the same underlying noise model (e.g. HMM) from the 2-dimensional manifold of noise models.

The distance between structured data items is implicitly defined by the local noise models that drive topographic map formation.

If the noise model changes, the perception of what kinds of data items are considered similar changes as well.


LTHMM - training

The constrained mixture of HMMs is fitted by Maximum likelihood using an E-M algorithm.

Two types of hidden variables:

which HMM generated which sequence (responsibility calculations, as in mixture models)

within an HMM, what is the state sequence responsible for generating the observed sequence (forward-backward-like calculations)


Two illustrative examples

Toy Data
400 binary sequences of length 40 generated from 4 HMMs (2 hidden states) with identical emission structure (the HMMs differed only in transition probabilities). Each of the 4 HMMs generated 100 sequences.

Melodic Lines of Chorales by J.S. Bach
100 chorales. Pitches are represented in the space of one octave, i.e. the observation symbol space consists of 12 different pitch values.


Toy data

[Figure: latent space plots over [−1, 1]² – state transitions, emissions, state-transition probabilities and emission probabilities (colour scales 0.1–0.9), with sequences from Classes 1–4; bottom – information matrix plot, capped at 250 (colour scale 50–200).]


Bach chorales

[Figure: latent space visualization over [−1, 1]². Annotated regions: flats; infrequent flats/sharps; sharps; g−#f−g patterns; repeated #f notes in g−#f−g patterns (g−#f−#f−#f−g); tritones (c−#a−#d−d−c).]


Topographic formulation regularizes the model

[Figure: negative log-likelihood per symbol (2.05–2.55) against training epoch (0–35) for the training and test sets.]

Evolution of negative log-likelihood per symbol measured on the training (o) and test (*) sets (Bach chorales).


Topographic organization of eclipsing binaries

The line of sight of the observer is aligned with the orbit plane of a two-star system to such a degree that the component stars undergo mutual eclipses.

Even though the light of the component stars does not vary, eclipsing binaries are variable stars – this is because of the eclipses.

The light curve is characterized by periods of constant light with periodic drops in intensity.

If one of the stars is larger than the other (the primary star), one will be obscured by a total eclipse while the other will be obscured by an annular eclipse.


Eclipsing Binary Star - normalized flux

[Figure: flux against time (0 to T) for the original lightcurve and the shifted lightcurve (flux 0.4–1), and flux against phase (0.0–1.0) for the phase-normalised lightcurve, with the primary eclipses marked.]


Eclipsing Binary Star - the model

[Figure: schematic of the binary star system.]

Parameters:
primary mass: m (1–100 solar masses)
mass ratio: q (0–1)
eccentricity: e (0–1)
inclination: i (0°–90°)
argument of periastron: ap (0°–180°)
log period: π (2–300 days)


Empirical priors on parameters

p(m, q, e, i, ap, π) = p(m)p(q)p(π)p(e|π)p(i)p(ap)

Primary mass density:

p(m) = a × m^b

a = 0.6865, b = −1.4, if 0.5×Msun ≤ m ≤ 1.0×Msun
a = 0.6865, b = −2.5, if 1.0×Msun < m ≤ 10.0×Msun
a = 3.9,    b = −3.3, if 10.0×Msun < m ≤ 100.0×Msun
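The piecewise power-law prior can be coded directly (the density is treated as unnormalised here, since the slides give only the constants a and b):

```python
MSUN = 1.0  # masses measured in solar masses

def primary_mass_prior(m):
    """Empirical prior density p(m) = a * m**b, with the piecewise
    constants (a, b) taken from the three mass ranges above."""
    if 0.5 * MSUN <= m <= 1.0 * MSUN:
        a, b = 0.6865, -1.4
    elif 1.0 * MSUN < m <= 10.0 * MSUN:
        a, b = 0.6865, -2.5
    elif 10.0 * MSUN < m <= 100.0 * MSUN:
        a, b = 3.9, -3.3
    else:
        return 0.0          # outside the supported mass range
    return a * m ** b
```

The steepening exponent b makes high-mass primaries rapidly less probable, matching the empirical mass function the slides assume.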


Empirical priors on parameters

Mass ratio density: p(q) = p1(q) + p2(q) + p3(q), where

p_i(q) = A_i × exp( −0.5 (q − q_i)² / s_i² )

with A1 = 1.30, A2 = 1.40, A3 = 2.35; q1 = 0.30, q2 = 0.65, q3 = 1.00; s1 = 0.18, s2 = 0.05, s3 = 0.10.

Log-period density:

p(π) = 1.93337π³ + 5.7420π² − 1.33152π + 2.5205, if π ≤ log10 18
p(π) = 19.0372π − 5.6276,                        if log10 18 < π ≤ log10 300

etc.
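The mass-ratio prior is a sum of three Gaussian bumps; a direct sketch with the slide's constants:

```python
import math

# (A_i, q_i, s_i) for the three components p_1, p_2, p_3
COMPONENTS = [(1.30, 0.30, 0.18), (1.40, 0.65, 0.05), (2.35, 1.00, 0.10)]

def mass_ratio_prior(q):
    """p(q) = p1(q) + p2(q) + p3(q), p_i(q) = A_i exp(-0.5 (q - q_i)^2 / s_i^2)."""
    return sum(A * math.exp(-0.5 * (q - qi) ** 2 / si ** 2)
               for A, qi, si in COMPONENTS)
```

The tallest, narrowest bump sits at q = 1, so equal-mass binaries carry the most prior weight.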


Flux GTM

A smooth parameterized mapping F from the 2-dim latent space into the space where the 6 parameters of the eclipsing binary star model live.

Model light curves are contaminated by additive observational noise (Gaussian). This gives a local noise model in the (time, flux)-space.

Each point on the computer screen corresponds to a local noise model and "represents" observed eclipsing binary star light curves that are well explained by the local model.

MAP estimation of the mapping F via E-M.


Outline of the model (1)

[Figure: a coordinate vector [x1, x2] in the latent space V (the computer screen) is mapped by the smooth mapping Γ_M(x) into a parameter vector θ in the parameter space Ω_M, to which the physical model is applied.]


Outline of the model (2)

[Figure: applying the physical model gives phase-flux curves Γ_fJ(x) in the regression model space Ω_J; applying Gaussian observational noise then gives local distributions p(O|x) in the distribution space Ω_H.]


Artificial fluxes - projections


Artificial fluxes - model


Real fluxes - clustering mode

[Figure: lightcurves (flux 0.4–1.1 against phase bins 0–100) for clusters 1–6.]


Off-the-shelf methods may produce nonsense!

[Figure: two lightcurves in the dataset and the corresponding codebook vector (flux against phase ρ).]


Real fluxes - projections + model

[Figure: latent projections coloured by Primary mass, Mass ratio, Eccentricity, Inclination, Argument and Period.]


3-color cDNA microarrays

Traditional dual-color cDNA microarrays – 2 different fluorescence dyes corresponding to two samples (e.g. "normal" and "disease").

3-color cDNA microarrays – a 3rd dye associated with yet another sample hybridized to a single microarray.

Assess effects of a drug – an assay hybridizing three samples: normal (dyed red), disease (dyed green), drug-treated (dyed blue).

Intensities R, G and B reflect expression levels of the genes in the normal (healthy), disease and drug-treated samples, respectively.


hexaMplot [Zhao06]

A 2-dim representation of the R, G and B intensities, suited for assessing the drug effect on assayed genes.

hexaMplot coordinates: log ratios of intensity pairs, x1 = log2(B/G) and x2 = log2(G/R).

Genes in the upper and lower half-plane are up- and down-regulated, respectively, by the disease.

Genes in the left and right half-plane are up- and down-regulated, respectively, by the drug treatment, compared with the disease sample.

Slant axis x2 = −x1 ⇒ log2(B/R) = 0: expression levels of genes in the normal and drug-treated samples are the same.
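The coordinates are just log-ratios of the three channel intensities; a minimal sketch (the intensity values are hypothetical):

```python
import math

def hexamplot_coords(R, G, B):
    """hexaMplot coordinates of one gene: x1 = log2(B/G), x2 = log2(G/R)."""
    return math.log2(B / G), math.log2(G / R)

# a gene whose normal and drug-treated expression coincide (B = R)
# lands on the slant axis x2 = -x1, since x1 + x2 = log2(B/R) = 0
x1, x2 = hexamplot_coords(R=2.0, G=4.0, B=2.0)
```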


hexaMplot

[Figure: hexaMplot plane (log G/R against log B/G), divided into Quadrants 1–4 and the sectors G>R>B, G>B>R, B>G>R, R>G>B, R>B>G, B>R>G. The slant axis B=R marks the drug neutralizing the effect of the disease; Quadrants 1 and 3 mark undesirable drug effects.]


Assessing drug effects through hexaMplot

The drug neutralizes the effect of the disease on the assayed genes – gene representations cluster around the slant axis.

Deviations from the slant axis within the 4th and 2nd quadrants (x1 > 0, x2 < 0 and x1 < 0, x2 > 0, respectively) – these still represent drug effects in the right direction.

Genes in the 1st and 3rd quadrants (x1, x2 > 0 and x1, x2 < 0, respectively) – undesirable effect of the drug: enhancing the up-regulation, or suppressing the down-regulation, of the gene by the disease.


Past work

[Zhao06] – the correlation coefficient of hexaMplot gene representations is calculated and assessed for statistical significance.

[Zhao07] – a more involved analysis: detect groups of genes with similar expression patterns relative to the disease and the drug.

- Each such group is aligned along a line ray starting in the hexaMplot origin.
- The direction of the ray signifies whether the drug has a positive or negative effect.
- The angle measures the drug effect level.

The lines were detected through the Hough Transform.


H Transform

[Figure: a point (x0, y0) in the (x, y) image space lying on the line y = ax + b.]


H Transform

[Figure: in the (a, b) Hough space, each point (x_i, y_i) maps to the dual line b = −x_i a + y_i; the dual lines b = −x0 a + y0 and b = −x1 a + y1 intersect at the parameters of the line through both points.]
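The classical accumulation scheme the two figures describe can be sketched in a few lines; the grid ranges and resolutions below are arbitrary choices for the illustration:

```python
import numpy as np

def hough_lines(points, a_grid, b_grid):
    """Accumulate votes for lines y = a x + b: each point (x, y) votes
    along its dual line b = -x a + y in the (a, b) Hough space."""
    acc = np.zeros((len(a_grid), len(b_grid)), dtype=int)
    db = b_grid[1] - b_grid[0]
    for x, y in points:
        for i, a in enumerate(a_grid):
            b = -x * a + y                        # dual-line equation
            j = int(round((b - b_grid[0]) / db))  # nearest accumulator cell
            if 0 <= j < len(b_grid):
                acc[i, j] += 1
    return acc

# three collinear points on y = 2x + 1 vote for the same accumulator cell
pts = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
a_grid = np.linspace(-3, 3, 61)   # slope grid, step 0.1
b_grid = np.linspace(-3, 3, 61)   # intercept grid, step 0.1
acc = hough_lines(pts, a_grid, b_grid)
i, j = np.unravel_index(acc.argmax(), acc.shape)
```

The peak of the accumulator recovers (a, b) = (2, 1). The quantization step of the grids is exactly the smoothing knob criticised in Problem 3 below.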


Problem 1

The HT is applied to the differentially expressed genes only. Their detection is done through fitting a global bi-variate Gaussian on hexaMplot gene representations and then applying a probability density threshold.

- "Hard" separation of genes into equally vs. differentially expressed genes is not optimal – typically there will be a high density of genes around the separating confidence ellipse.

- Results can be sensitive to the particular choice of the confidence value defining what is differentially expressed and what is not.


Problem 2

The HT implicitly imposes a noise model that does not fit the nature of hexaMplot representations well.

- The induced noise model depends on the line parametrization used.

- The (x1, x2) hexaMplot representations are negatively correlated, and there is no direct way of representing this fact in the standard HT.


Problem 3

Determination of the quantization level in the Hough space shouldreflect the amount of “measurement” noise in the hexaMplotfeatures.

The quantization level determines the amount of smoothing in theHough accumulator, which in turn has an effect on the number ofdistinct peaks (detected lines) in the Hough space.


Problem 4

Given a detected line, there is no principled way of quantifying the strength of association of the points with that line.


Our proposal – probabilistic HT

Address these shortcomings in the framework of a principled probabilistic model-based formulation.

All assayed genes are considered. The weaker and stronger contributions of equally and differentially expressed genes are obtained naturally in a "soft" manner from the probabilistic formulation of the model behind the hexaMplot.

The model explicitly takes into account the size and the negatively correlated nature of the noise associated with hexaMplot gene representations.

Both the strength of association of individual genes with a particular group (line ray in the hexaMplot) and the support for the group by the selected genes can be quantified in a principled manner through posterior probabilities over the line angles, given the observations.


Probabilistic HT

[Figure: (a) observations x1, x2, x3 in the hexaMplot plane with candidate line rays at angles α; (b) the induced posteriors p(α|x1), p(α|x2), p(α|x3) over the angle space H.]


The model

A line ray in R² (the hexaMplot space) starting in the origin at an angle α ∈ [−π/4, 7π/4).

Bi-variate zero-mean Gaussian measurement noise with covariance matrix Σ_X. The density of possible measurements x = (x1, x2)^T ∈ R² corresponding to the point (r cos α, r sin α) on the line is given by

p(x|α, r) = \frac{1}{2π |Σ_X|^{1/2}} \exp\left( −\frac{1}{2} (x − (r cos α, r sin α)^T)^T Σ_X^{−1} (x − (r cos α, r sin α)^T) \right),

where r > 0 is the (Euclidean) distance of the point on the line from the origin.


The model - contd’

Prior knowledge about the parameter values (α, r) ∈ [−π/4, 7π/4) × [0, ∞) is summarized in the form of a prior distribution p(α, r).

Given an observation x, the induced uncertainty in the parameter space is given by the posterior

p(α, r|x) = \frac{p(x|α, r) p(α, r)}{\int_{[−π/4, 7π/4) × [0, ∞)} p(x|α′, r′) p(α′, r′) dα′ dr′}.

To obtain the amount of support for the angle parameter α given the observation x, we integrate r out of the posterior:

p(α|x) = \int_{[0, ∞)} p(α, r|x) dr.
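A numerical sketch of p(α|x) on a grid, assuming (for simplicity) a flat prior p(α, r) over the grid cells; the slides leave the prior general, and the grid resolutions here are arbitrary:

```python
import numpy as np

def posterior_alpha(x, Sigma, alphas, rs):
    """p(alpha | x) on a grid: evaluate the Gaussian noise kernel at every
    (alpha, r) cell and integrate r out (normalising constants cancel)."""
    Sinv = np.linalg.inv(Sigma)
    # mean of the noise model for every (alpha, r) cell: shape [A, R, 2]
    mu = np.stack([rs[None, :] * np.cos(alphas)[:, None],
                   rs[None, :] * np.sin(alphas)[:, None]], axis=-1)
    d = x - mu
    like = np.exp(-0.5 * np.einsum('ari,ij,arj->ar', d, Sinv, d))
    post = like.sum(axis=1)            # integrate r out on the uniform grid
    return post / post.sum()           # normalise over the alpha grid

sigma2 = 0.05 ** 2
Sigma = 2 * sigma2 * np.array([[1.0, -0.5], [-0.5, 1.0]])   # hexaMplot noise (see below on slide 114)
alphas = np.linspace(-np.pi / 4, 7 * np.pi / 4, 361)
rs = np.linspace(0.01, 3.0, 300)
x = np.array([-1.0, 1.0])              # a point lying on the ray alpha = 3*pi/4
post = posterior_alpha(x, Sigma, alphas, rs)
alpha_hat = alphas[post.argmax()]
```

With small noise the posterior concentrates sharply at the true ray angle; inflating σ spreads it out, exactly the behaviour shown in the toy example further below.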


The model - contd’

Given a set of observations D = {x1, x2, ..., xN}, xi ∈ R², i = 1, 2, ..., N, accumulate the evidence contributions in the Hough space H:

H(α; D) = \frac{1}{N} \sum_{i=1}^{N} p(α|xi).

Given that a line candidate with inclination angle α has been detected by inspecting the peaks of the Hough accumulator H(α; D), one can ask which points from D are strongly associated with it. Consult the posteriors p(α|xi), i = 1, 2, ..., N, and select points above some threshold value θ.

To enhance the threshold interpretability, we discretize the angle space H into a regular grid G = {α1, α2, ..., αM} and turn the densities p(α|x) into probabilities P(αj|x) over G.
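Given the per-point posteriors on the discretised angle grid, the accumulator and a thresholded point-selection rule are one-liners. A sketch with a toy posterior matrix (hypothetical numbers, columns summing to one):

```python
import numpy as np

def hough_accumulator(P):
    """H(alpha_j; D) = (1/N) sum_i P(alpha_j | x_i), where P has shape [M, N]:
    per-point posteriors over a grid of M discretised angles."""
    return P.mean(axis=1)

def supporting_points(P, j, kappa):
    """Indices of points with P(alpha_j | x_i) >= theta = kappa / M."""
    M = P.shape[0]
    theta = kappa / M
    return np.flatnonzero(P[j] >= theta)

# toy posteriors: M = 4 angles, N = 3 points
P = np.array([[0.70, 0.10, 0.60],
              [0.10, 0.70, 0.20],
              [0.10, 0.10, 0.10],
              [0.10, 0.10, 0.10]])
H = hough_accumulator(P)
sel = supporting_points(P, j=0, kappa=2.0)   # theta = 2/4 = 0.5
```

Here the accumulator peaks at the first grid angle, supported by points 0 and 2; the threshold θ = κ/M is the one introduced on the next slide.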


The model - contd’Calculate the probability thresholdθ ∈ (0, 1) asθ = κ/M ,κ ∈ (0,M), meaning that only observations with posteriors at leastκtimes greater than the uninformative distribution1/M will beconsidered.

Given a probability thresholdθ and a (discretized) angleα, the set ofselected points that support the line rayα reads:

Sθ(α) = x | x ∈ D, P (α|x) ≥ θ.

Check how much the set as a whole supports that line ray through theposterior

P (α|Sθ(α)) =p(Sθ(α)|α) P (α)

α′∈G p(Sθ(α)|α′) P (α′),

whereP (α′) is the prior distribution over the gridG.


The model - contd’

Assuming independence of observations,

p(S_θ(α)|α′) = \prod_{x ∈ S_θ(α)} p(x|α′) = \prod_{x ∈ S_θ(α)} \int_0^∞ p(x|r, α′) p(r|α′) dr.

Here, p(x|r, α′) is the noise model and p(r|α′) is the conditional prior on r.


Noise model

It is usual to assume that the log intensities are normally distributed.

x = (x1, x2)^T = (log(B/G), log(G/R))^T.

Consider 3 random variables (log intensities) Y1, Y2 and Y3, representing log B, log G and log R, respectively. hexaMplot representations (x1, x2) correspond to two random variables X1 = Y1 − Y2 and X2 = Y2 − Y3, coupled through Y2.

Even if we assume that the individual measurement errors of the three log intensities Y1, Y2 and Y3 are independent, the implied noise in the hexaMplot coordinates X1, X2 will be negatively correlated. This simply follows from the fact that while Y2 contributes negatively to X1, its contribution to X2 is positive.


Noise model

Assuming that the measurement noise of the log intensity Y_i is a zero-mean Gaussian with variance σ_i², i = 1, 2, 3, (X1, X2) will be Gaussian distributed with covariance matrix

Σ_X = [ σ1² + σ2²   −σ2²
        −σ2²        σ2² + σ3² ].   (1)

We assume equal levels of measurement noise across the three colors, σ² = σ1² = σ2² = σ3², giving

Σ_X = 2σ² [ 1     −1/2
            −1/2   1 ].   (2)
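The covariance (2) and its negative off-diagonal can be checked directly by simulation (a sketch; the σ² value is arbitrary):

```python
import numpy as np

def hexamplot_noise_cov(s1, s2, s3):
    """Covariance of (X1, X2) = (Y1 - Y2, Y2 - Y3) for independent
    log-intensity errors Y_i with variances s_i; the shared Y2 creates
    the negative off-diagonal term -s2."""
    return np.array([[s1 + s2, -s2],
                     [-s2, s2 + s3]])

# equal noise levels give Sigma_X = 2 sigma^2 [[1, -1/2], [-1/2, 1]]
sigma2 = 0.04
Sigma = hexamplot_noise_cov(sigma2, sigma2, sigma2)

# Monte-Carlo check of the negative correlation
rng = np.random.default_rng(1)
Y = rng.normal(scale=np.sqrt(sigma2), size=(3, 200_000))
X = np.stack([Y[0] - Y[1], Y[1] - Y[2]])
emp = np.cov(X)
```

The empirical covariance of the simulated (X1, X2) reproduces the −σ² off-diagonal, which is exactly the structure the standard HT cannot represent.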


Toy example

x1 = (0.1, −0.1)^T, x2 = (−1, 0)^T and x3 = (−1.75, 1.75)^T.

[Figure: (a), (c) – posteriors p(alpha | x) for sigma = 0.05 and sigma = 0.2; (b), (d) – joint posteriors p(alpha, r | x) over alpha and r.]


Real data

Analyse the action of the drug Rg1 (a dominant compound of the extract of ginsenosides in ginseng) on homocysteine-treated human umbilical vein endothelial cells (HUVEC).

1128 genes assayed in four microarrays (4 repeats under the same experimental conditions).

Usual data normalization.


hexaMplot

[Figure: hexaMplot of the normalized expression data.]


Need a dedicated grouping mechanism...

GMM model + model evidence

[Figure: Gaussian mixture model (GMM) fitted to the hexaMplot data.]
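The "GMM + model evidence" idea can be sketched in miniature. The code below is a generic 1-D EM-fitted mixture with BIC standing in as a crude large-sample proxy for model evidence; the deck's actual model is 2-D and its evidence computation is defined on earlier slides, so this is illustrative only:

```python
import numpy as np

def em_gmm_1d(x, K, iters=200):
    """Minimal 1-D EM for a K-component Gaussian mixture.
    Returns the final log-likelihood and the number of free parameters."""
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)   # deterministic spread-out init
    var = np.full(K, x.var())
    pi = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: responsibilities of each component for each point
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(1, keepdims=True)
        # M-step: re-estimate mixing weights, means and variances
        nk = resp.sum(0)
        pi, mu = nk / len(x), (resp * x[:, None]).sum(0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(0) / nk
    dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return np.log(dens.sum(1)).sum(), 3 * K - 1

def bic(loglik, n_params, n):
    # BIC: rough approximation to negative log model evidence (lower is better)
    return n_params * np.log(n) - 2 * loglik

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 0.3, 300), rng.normal(2, 0.3, 300)])
scores = {K: bic(*em_gmm_1d(x, K), len(x)) for K in (1, 2)}
print(min(scores, key=scores.get))   # the bimodal sample favors K = 2
```

On clearly bimodal data the two-component model wins by a wide margin, which is the kind of evidence-based comparison the slides rely on.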


Probabilistic HT - detected line rays

σ = 0.05, Sθ(α) chosen using M = 126 and κ = 25

[Figure: hexaMplot with the detected line rays overlaid.]


Probabilistic HT - accumulator

[Figure: accumulator ht(alpha) over the angle alpha, panels (a)–(c).]
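The accumulator idea can be illustrated as soft voting: each observed point contributes a vote over candidate ray angles alpha, weighted by how close the point lies to the ray at that angle given the noise level sigma. This is a hand-rolled sketch of a probabilistic Hough accumulator, not the deck's exact per-point model (which is defined on earlier slides):

```python
import numpy as np

def hough_accumulator(points, alphas, sigma=0.05):
    """Soft Hough voting for line rays through the origin: each point adds a
    Gaussian vote based on its perpendicular distance to the ray at angle alpha."""
    ht = np.zeros_like(alphas)
    for px, py in points:
        r = np.hypot(px, py)
        theta = np.arctan2(py, px)
        d = (alphas - theta + np.pi) % (2 * np.pi) - np.pi  # wrapped angular offset
        perp = np.abs(r * np.sin(d))                        # distance to the ray
        # cos(d) > 0 restricts votes to the half-line (ray), not the full line
        ht += np.where(np.cos(d) > 0, np.exp(-0.5 * (perp / sigma) ** 2), 0.0)
    return ht / ht.sum()

alphas = np.linspace(0, 2 * np.pi, 721)
pts = np.array([(np.cos(1.0) * s, np.sin(1.0) * s) for s in (0.5, 1.0, 1.5)])
print(alphas[np.argmax(hough_accumulator(pts, alphas))])   # peaks near 1.0
```

Points lying on a common ray pile their votes up at that ray's angle, which is exactly the peak the accumulator plots above display.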


Detected line rays

The shortest intervals (α−(σ), α+(σ)) containing the estimated line angles and 95% of the posterior mass P(·|Sθ(α)) around them. The intervals are shown for three levels of observational noise: σ = 0.05, σ = 0.3 and σ = 1.0.

line   α       (α−(0.05), α+(0.05))   (α−(0.3), α+(0.3))   (α−(1.0), α+(1.0))
1     -0.685   (-0.735, -0.684)       (-0.736, -0.683)     (-0.885, -0.535)
2     -0.585   (-0.635, -0.584)       (-0.635, -0.535)     (-0.785, -0.435)
3     -0.285   (-0.335, -0.284)       (-0.385, -0.235)     (-0.685, -0.235)
4      1.965   ( 1.914,  1.966)       ( 1.665,  2.265)     (-0.035,  3.015)
5      2.215   ( 2.165,  2.216)       ( 2.015,  2.365)     ( 1.565,  2.765)
6      2.465   ( 2.415,  2.466)       ( 2.265,  2.615)     ( 1.865,  3.015)
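The intervals above are the shortest windows carrying 95% of the posterior mass. A generic way to extract such an interval from a discretised posterior (a sketch; the actual posteriors P(α|Sθ(α)) come from the model on the surrounding slides, here replaced by an illustrative Gaussian bump) is:

```python
import numpy as np

def shortest_mass_interval(grid, post, mass=0.95):
    """Shortest contiguous window of `grid` holding at least `mass`
    of the (discretised, possibly unnormalised) posterior `post`."""
    p = post / post.sum()
    csum = np.concatenate([[0.0], np.cumsum(p)])
    best, width = (grid[0], grid[-1]), grid[-1] - grid[0]
    for i in range(len(p)):
        # smallest right edge j whose window [i, j] reaches the target mass
        j = np.searchsorted(csum, csum[i] + mass) - 1
        if j >= len(p):
            break
        if grid[j] - grid[i] < width:
            best, width = (grid[i], grid[j]), grid[j] - grid[i]
    return best

# illustrative posterior: Gaussian bump around alpha = 2.215 (cf. line 5)
grid = np.linspace(0.0, 5.0, 1001)
post = np.exp(-0.5 * ((grid - 2.215) / 0.05) ** 2)
lo, hi = shortest_mass_interval(grid, post)
print(lo, hi)   # roughly 2.215 +/- 2 * 0.05
```

For a symmetric unimodal posterior the shortest window coincides with the central credible interval; for skewed or multimodal posteriors the shortest-window criterion is what makes the tabulated intervals well defined.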


Support for the detected groups

Posteriors P(α|Sθ(α)) of the six detected lines (gene groups) for two levels of observational noise: σ = 0.05 (a) and σ = 0.3 (b).

[Figure: posteriors P(alpha | S) over alpha for sigma = 0.05 (a) and sigma = 0.3 (b).]


GO analysis

line   |Sθ(α)|   GO term ID    # genes   # genes from Sθ(α)   p-value
1      80        GO:0002675    87        37                   0.00173
                 GO:0002525    85        36                   0.00242
                 GO:0006953    85        36                   0.00242
                 GO:0002527    86        36                   0.00322
                 GO:0002543    86        36                   0.00322
                 GO:0002539    60        27                   0.00441
                 GO:0002540    60        27                   0.00441
2      64        GO:0044424    176       41                   0.00000
                 GO:0044444    175       41                   0.00000
                 GO:0044446    168       39                   0.00294
                 GO:0030117    165       38                   0.00437
                 GO:0045265    162       37                   0.00536
3      71        GO:0005488    178       51                   0.00000
                 GO:0050794    163       55                   0.00295
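A p-value of this kind assesses over-representation of a GO term within a detected group. A standard choice is the one-sided hypergeometric tail; the slides do not state the exact test, background set, or correction used, so the sketch below (using the counts from line 1, term GO:0002675) illustrates the principle and need not reproduce the tabulated values:

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes from M assayed genes,
    of which n carry the GO term (one-sided enrichment p-value)."""
    total = sum(comb(n, x) * comb(M - n, N - x)
                for x in range(k, min(n, N) + 1))
    return total / comb(M, N)

# counts from the table: 1128 assayed genes, 87 annotated with GO:0002675,
# group size |S| = 80, of which 37 carry the term
p = hypergeom_sf(37, 1128, 87, 80)
print(p)   # far below any usual significance threshold
```

Under random draws of 80 genes from 1128, only about 6 would be expected to carry a term held by 87 genes, so observing 37 is overwhelming evidence of enrichment.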


GO analysis

The first three lines with angles in (−π/4, 0) represent genes with (R, G, B) intensities satisfying G < R < B.

The disease decreases expression of a gene, compared with its normal expression level R, i.e. G < R. The drug eliminates this effect by overexpressing the genes, B > R.

Genes in the group corresponding to the 1st line are related to acute inflammatory response (GO:0002675, GO:0002525), increasing for example the concentration of non-antibody proteins in the plasma (GO:0006953), or increasing the intra- or extra-cellular levels of prostaglandin (GO:0002539) and leukotriene (GO:0002540).


GO analysis

The 3rd line groups genes that are related to binding mechanisms (GO:0005488) and breakdown of neutral lipids (GO:0046461), membrane lipids (GO:0046466) and glycerolipids (GO:0046503).

The disease also down-regulates genes related to pathways of the complement cascade, which allow for the direct killing of microbes as well as regulation of other immune processes (GO:0001867, GO:0006957). The drug Rg1 corrects this situation by stimulating the pathways.


GO analysis

The 4th and 5th lines with angles in (π/2, 3π/4) represent genes with (R, G, B) intensities satisfying R < B < G.

Compared with its normal expression level, the expression of a gene is increased by the disease (G > R). The drug partially eliminates this effect by reducing the expression level to B, leaving B still above the normal expression R.

The 6th line with α ∈ (3π/4, π) groups genes with (R, G, B) intensities satisfying B < R < G.

The disease causes increased expression of a gene (G > R) and the drug compensates for this effect by driving the gene expression below the normal level (B < R).

While genes grouped together by the 4th line are associated with immune and chronic inflammatory response, the genes corresponding to the 5th and 6th lines are again related to cellular components and mechanisms affected by the disease.
