Infinite Hierarchical Hidden Markov Models

Katherine A. Heller, Yee Whye Teh and Dilan Görür
Presented by Lu Ren, ECE@Duke University, Nov 23, 2009
AISTATS 2009


Page 1: Infinite Hierarchical Hidden Markov Models

Infinite Hierarchical Hidden Markov Models

Katherine A. Heller, Yee Whye Teh and Dilan Görür

Lu Ren ECE@Duke University

Nov 23, 2009

AISTATS 2009

Page 2: Infinite Hierarchical Hidden Markov Models

Outline

• Hierarchical structure learning for sequential data

• Hierarchical hidden Markov model (HHMM)

• Infinite hierarchical hidden Markov model (IHHMM)

• Inference and learning

• Experiment results and demonstrations

• Related work and extensions

Page 3: Infinite Hierarchical Hidden Markov Models

Multi-scale Structure

• Goal: infer correlations among observations that are far apart in a long observation sequence.
• Potential applications: multi-resolution structure learning in language, video structure discovery, activity detection, etc.

[Figure: the generated sequential data, and the sampled “states” used to generate the data]

Page 4: Infinite Hierarchical Hidden Markov Models

Hierarchical HMM (HHMM)

1. Hierarchical hidden Markov models (HHMMs) are multiscale models of sequences in which each level of the model is a separate HMM that emits lower-level HMMs in a recursive manner.

[Figure: the generative process of an example HHMM [2]]

Page 5: Infinite Hierarchical Hidden Markov Models

2. The full parameter set. With a fixed model structure, the model is characterized by the following parameters [1]:

• horizontal transition probabilities between the states within each level;
• vertical transition probabilities from a parent state to its child states at the level below;
• output (emission) probabilities attached to the production states.

3. Representing the HHMM as a DBN [2]
• Assume for simplicity that all production states are at the bottom, and let Q_t^l denote the state of the HMM at level l and time t.
• The vector (Q_t^1, ..., Q_t^L) specifies the complete “path” from the root to the leaf state.
• An indicator variable F_t^l controls completion of the HHMM at level l and time t.

Hierarchical HMM (HHMM)

Page 6: Infinite Hierarchical Hidden Markov Models

[Figure: an HHMM represented as a DBN [2]]

Hierarchical HMM (HHMM)

Page 7: Infinite Hierarchical Hidden Markov Models

Infinite Hierarchical HMM (IHHMM)

IHHMM: allows the HHMM hierarchy to have a potentially infinite number of levels.

Notation: y_t is the observation at time t and s_t^l is the state at level l and time t. A state-transition indicator variable z_t^l is also introduced:
• z_t^l = 1 indicates a completion of the HHMM at level l right before time t, i.e. the presence of a state transition from s_{t-1}^l to s_t^l.
• The conditional probability of z_t^l is P(z_t^l = 1 | z_t^{l-1} = 1) = β_l and P(z_t^l = 1 | z_t^{l-1} = 0) = 0, with z_t^0 = 1 always at the observation level.
• There is thus an opportunity to transition at level l only if there was a transition at level l-1.
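A minimal sketch of this indicator mechanism (the names z, beta, and the function below are illustrative assumptions, since the slide's original symbols were lost in transcription):

```python
import random

def sample_indicators(num_levels, T, beta, seed=0):
    """Sample transition indicators z[t][l] for an IHHMM-style hierarchy.

    z[t][l] = 1 means the chain at level l transitions right before time t.
    A transition at level l is possible only if level l-1 transitioned;
    level 0 (the observation level) always "transitions".
    """
    rng = random.Random(seed)
    z = []
    for t in range(T):
        row = [1]  # level 0 always transitions
        for l in range(1, num_levels + 1):
            # transition at level l with probability beta, but only if the
            # level below also transitioned at this time step
            if row[l - 1] == 1 and rng.random() < beta:
                row.append(1)
            else:
                row.append(0)
        z.append(row)
    return z
```

With beta near 1, transitions propagate high up the hierarchy; with beta near 0, higher levels almost never transition, so their states persist for long stretches.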

Page 8: Infinite Hierarchical Hidden Markov Models

Infinite Hierarchical HMM (IHHMM)

Properties implied by the structure:

1. The number of transitions at level l-1 before a transition at level l occurs is geometrically distributed with mean 1/β_l. This implies that, when all β_l equal a common β, the expected number of time steps for which a state at level l persists in its current value is β^{-l}. The states at higher levels persist longer.

2. The first non-transitioning level at time t, L_t = min{l : z_t^l = 0}, is geometrically distributed with parameter 1-β if all the β_l are equal to β.

The IHHMM allows for a potentially infinite number of levels.
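Under the simplifying assumption of a single transition parameter β shared across levels (notation reconstructed here, since the slide's symbols were lost in transcription), the persistence property follows directly:

```latex
P(z_t^{\ell} = 1)
  \;=\; \prod_{k=1}^{\ell} P\!\left(z_t^{k}=1 \mid z_t^{k-1}=1\right)
  \;=\; \beta^{\ell},
\qquad
\mathbb{E}[\text{time steps a level-}\ell\text{ state persists}]
  \;=\; \frac{1}{P(z_t^{\ell} = 1)}
  \;=\; \beta^{-\ell}.
```

Likewise, the first non-transitioning level L_t = min{ℓ : z_t^ℓ = 0} satisfies P(L_t = ℓ) = β^{ℓ-1}(1-β), a geometric distribution with parameter 1-β.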

Page 9: Infinite Hierarchical Hidden Markov Models

Infinite Hierarchical HMM (IHHMM)

The generative process for s_t^l given z_t^l is similar to the HHMM: for the levels with z_t^l = 1, proceeding from the highest transitioning level down to level 1, the state is generated according to the transition probabilities conditioned on the state at the level above; for the levels with z_t^l = 0, the state is copied, s_t^l = s_{t-1}^l.

The emission matrix then generates the observation y_t from the bottom-level state.
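One time step of this generative process can be sketched as follows (the function, the indexing convention with level 0 at the bottom, and the shape of the transition tables are illustrative assumptions):

```python
import random

def step_states(prev_states, z_row, trans, emit, rng):
    """One IHHMM time step (sketch): resample the state at every level that
    transitioned (z_row[l] == 1), conditioning on the state one level up;
    non-transitioning levels keep their previous state.

    trans[l][parent][old] is a distribution over new states at level l
    (a simplifying assumption about the parameterization).
    emit[state] is a distribution over observation symbols.
    """
    L = len(prev_states)            # number of levels, 0 = bottom
    states = list(prev_states)
    for l in range(L - 1, -1, -1):  # from the top level down to the bottom
        if z_row[l]:
            parent = states[l + 1] if l + 1 < L else 0
            probs = trans[l][parent][prev_states[l]]
            states[l] = rng.choices(range(len(probs)), weights=probs)[0]
    # emit an observation from the bottom-level state
    obs = rng.choices(range(len(emit[states[0]])), weights=emit[states[0]])[0]
    return states, obs
```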

Page 10: Infinite Hierarchical Hidden Markov Models

Inference and Learning

Inference in the IHHMM is performed using Gibbs sampling and a modified forward-backtrack algorithm. It iterates between the following two steps:

1. Sampling state values with fixed parameters for each level:

Compute forward messages from t = 1 to T, then resample the states s_t^l and indicators z_t^l along the backward pass from t = T to 1:
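The per-level building block of the forward pass is the standard HMM forward recursion, sketched below (the function name and per-step normalization are assumptions; the slide's message equations were lost in transcription):

```python
def forward_messages(obs, init, trans, emit):
    """Standard HMM forward recursion: alpha[t][j] ∝ P(state_t = j, y_1..t).

    init[j]     : initial state distribution
    trans[i][j] : transition probability i -> j
    emit[j][v]  : probability of observing symbol v in state j
    Messages are normalized at each step for numerical stability.
    """
    K = len(init)
    alpha = []
    prev = [init[j] * emit[j][obs[0]] for j in range(K)]
    norm = sum(prev)
    alpha.append([p / norm for p in prev])
    for t in range(1, len(obs)):
        cur = []
        for j in range(K):
            s = sum(alpha[-1][i] * trans[i][j] for i in range(K))
            cur.append(s * emit[j][obs[t]])
        norm = sum(cur)
        alpha.append([c / norm for c in cur])
    return alpha
```

A backward pass can then sample states in reverse order, weighting each alpha message by the transition into the already-sampled successor state.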

Page 11: Infinite Hierarchical Hidden Markov Models

Inference and Learning

When the top level is reached, a new level above it is created by setting all of its states to 1; if the level below the current top level has no state transitions, it becomes the new top level.

2. Sampling parameters given the current states:

Parameters are initialized as draws from their Dirichlet priors; posteriors are calculated from the counts of state transitions and emissions obtained in the previous step.
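Because the Dirichlet prior is conjugate to the multinomial transition counts, each row of a transition matrix can be resampled from its posterior as sketched here (the function name is an assumption):

```python
import random

def sample_transition_row(counts, alpha, rng=random):
    """Draw one transition-matrix row from its Dirichlet posterior.

    With a Dirichlet(alpha) prior and observed transition counts, the
    posterior is Dirichlet(alpha + counts); a draw is obtained by
    normalizing independent Gamma(alpha_k + counts_k, 1) variates.
    """
    g = [rng.gammavariate(a + c, 1.0) for a, c in zip(alpha, counts)]
    total = sum(g)
    return [x / total for x in g]
```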

Predicting new observations given the current state of the IHHMM:

1. Assume the top level learned by the IHHMM is L; then calculate the following recursions from level L down to level 1:

Page 12: Infinite Hierarchical Hidden Markov Models

Inference and Learning

2. Compute the probability of observing y_{T+1} from the resulting bottom-level messages:
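For a single HMM level, this prediction step reduces to propagating the final forward message one step and marginalizing over states, roughly as follows (a sketch under that one-level simplification; the multi-level recursion on the slide was lost in transcription):

```python
def predict_next(alpha_T, trans, emit):
    """Predictive distribution over the next observation symbol given the
    final normalized forward message alpha_T of a single-level HMM.

    P(y_{T+1} = v) = sum_j [ sum_i alpha_T[i] * trans[i][j] ] * emit[j][v]
    """
    K = len(alpha_T)
    V = len(emit[0])
    pred = [0.0] * V
    for j in range(K):
        pj = sum(alpha_T[i] * trans[i][j] for i in range(K))  # P(state_{T+1}=j)
        for v in range(V):
            pred[v] += pj * emit[j][v]
    return pred
```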

Page 13: Infinite Hierarchical Hidden Markov Models

Experiment Results

1. Generated data: [Figure: the generated sequential data and the sampled “states” used to generate them, for three samples]

Page 14: Infinite Hierarchical Hidden Markov Models

Experiment Results

2. Demonstrating that the model captures hierarchical structure:

The first data set consists of repeats of integers increasing from 1 to 7, followed by repetitions of integers decreasing from 5 to 1, repeated twice. The second data set is the first one concatenated with another series of repeated increasing and decreasing integer sequences. Seven states are used in the model at all levels.
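A generator for this kind of synthetic sequence can be sketched as follows (the exact repeat counts are not given on the slide, so they are left as parameters here):

```python
def make_sequence(n_inc, n_dec, n_outer=2):
    """Build a synthetic sequence of the kind described above: n_inc repeats
    of 1..7 increasing, then n_dec repeats of 5..1 decreasing, with the whole
    block repeated n_outer times. (Repeat counts are assumed, not from the
    slide.)
    """
    block = list(range(1, 8)) * n_inc + list(range(5, 0, -1)) * n_dec
    return block * n_outer
```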


Page 15: Infinite Hierarchical Hidden Markov Models

Experiment Results

The predictive log probability of the next integer: HMM 0.25; IHHMM 0.31; HHMM 0.30 (for 2-4 levels).

3. Spectral data from Handel’s Hallelujah chorus

Page 16: Infinite Hierarchical Hidden Markov Models

Experiment Results

4. Alice in Wonderland letters data set.

The difference in log predictive likelihood between the IHHMM and an HMM learned by EM

The difference in log predictive likelihood between the IHHMM and a one-level HMM learned by Gibbs sampling

• The mean differences in both plots are positive, showing that the IHHMM gives superior performance on this data.
• The long tails signify that some letters are better predicted with the higher hierarchical levels.

Page 17: Infinite Hierarchical Hidden Markov Models

Final Discussion

1. Relation to the HHMM: the IHHMM is a nonparametric extension of the HHMM to an unbounded hierarchy depth; the completion of an internal HHMM is governed by an independent process.

2. Other related work: probabilistic context-free grammars with multi-scale structure learning; the infinite HMM and the infinite factorial HMM.

3. Future work: make the number of states at each level infinite, as in the infinite HMM; higher-order Markov chains; more efficient inference algorithms.

Page 18: Infinite Hierarchical Hidden Markov Models

Cited References

[1] S. Fine, Y. Singer, and N. Tishby. The hierarchical hidden Markov model: Analysis and applications. Machine Learning, 32: 41-62, 1998.

[2] K. Murphy and M.A. Paskin. Linear time inference in hierarchical HMMs. In Neural Information Processing Systems, 2001.