1 hidden process models rebecca hutchinson joint work with tom mitchell and indra rustandi

48
1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

Post on 20-Dec-2015

230 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

1

Hidden Process Models

Rebecca Hutchinson

Joint work with Tom Mitchell and Indra Rustandi

Page 2: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

2

Talk Outline

• fMRI (functional Magnetic Resonance Imaging) data

• Prior work on analyzing fMRI data

• HPMs (Hidden Process Models)

• Preliminary results

• HPMs and BodyMedia

Page 3: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

3

functional MRI

Page 4: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

4

fMRI Basics

• Safe and non-invasive

• Temporal resolution ~ 1 3D image every second

• Spatial resolution ~ 1 mm– Voxels: 3mm x 3mm x 3-5mm

• Measures the BOLD response: Blood Oxygen Level Dependent– Indirect indicator of neural activity

Page 5: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

5

The BOLD response

• Ratio of deoxy-hemoglobin to oxy-hemoglobin (different magnetic properties).

• Also called hemodynamic response function (HRF).

• Common working assumption: responses sum linearly.

Page 6: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

6

More on BOLD response

Sig

nal

Am

plitu

de

Time (seconds)

• At left is a typical BOLD response to a brief stimulation.

• (Here, subject reads a word, decides whether it is a noun or verb, and pushes a button in less than 1 second.)

Page 7: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

7

Page 8: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

8

Lots of features!• 10,000-15,000 voxels per image

Page 9: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

9

Study: Pictures and Sentences

• 13 normal subjects.

• 40 trials per subject.

• Sentences and pictures describe 3 symbols: *, +, and $, using ‘above’, ‘below’, ‘not above’, ‘not below’.

• Images are acquired every 0.5 seconds.

Read Sentence

View Picture Read Sentence

View PictureFixation

Press Button

4 sec. 8 sec.t=0

Rest

Page 10: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

10

The star is not below the plus.

Page 11: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

11

Page 12: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

12

+

---

*

Page 13: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

13

.

Page 14: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

14

fMRI Summary

• High-dimensional time series data.

• Considerable noise on the data.

• Typically small number of examples (trials) compared with features (voxels).

• BOLD responses sum linearly.

Page 15: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

15

Talk Outline

• fMRI (functional Magnetic Resonance Imaging) data

• Prior work on analyzing fMRI data

• HPMs (Hidden Process Models)

• Preliminary results

• HPMs and BodyMedia

Page 16: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

16

It’s not hopeless!

• Learning setting is tough, but we can do it!• Feature selection is key.

• Learn fMRI(t,t+8)->{Picture,Sentence}

Read Sentence

View Picture Read Sentence

View PictureFixation

Press Button

4 sec. 8 sec.t=0

Rest

Page 17: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

17

ResultsA B C D E F G

71.3% 91.2% 76.2% 96.3% 85.0% 66.2% 71.3%

• Gaussian Naïve Bayes Classifier.

• 95% confidence intervals per subject are +/- 10%-15%.

• Accuracy of default classifier is 50%.

• Feature selection: Top 240 most active voxels in brain.

Subject:

Accuracy:

H I J K L M Avg.95.0% 81.2% 90.0% 85.0% 65.0% 90.0% 81.8%

Subject:

Accuracy:

Page 18: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

18

Why is this interesting?• Cognitive architectures like ACT-R and

4CAPS predict cognitive processes involved in tasks, along with cortical regions associated with the processes.

• Machine learning can contribute to these architectures by linking their predictions to empirical fMRI data.

Page 19: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

19

Other Successes

• We can distinguish between 12 semantic categories of words (e.g. tools vs. buildings).

• We can train classifiers across multiple subjects.

Page 20: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

20

What can’t we do?

• Take into account that the responses for Picture and Sentence overlap.

• What does the response for Decide look like and when does it start?

Read Sentence

View Picture Read Sentence

View PictureFixation

Press Button

4 sec. 8 sec.t=0

Rest

Page 21: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

21

Talk Outline

• fMRI (functional Magnetic Resonance Imaging) data

• Prior work on analyzing fMRI data

• HPMs (Hidden Process Models)

• Preliminary results

• HPMs and BodyMedia

Page 22: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

22

Motivation

• Overlapping processes– The responses to Picture and Sentence could

overlap in space and/or time.

• Hidden processes– Decide does not directly correspond to the

known stimuli.

• Move to a temporal model.

Page 23: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

23

Hidden Markov Models?

• Can’t do overlapping processes – states are mutually exclusive.

• Markov assumption: given statet-1, statet is independent of everything before t-1.

• BOLD response: Not Markov!

t-1 t t+1 t+2

CogProc{Picture, Sentence,Decide}

fMRI

Page 24: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

24

factorial HMMs?

• Have more flexibility than we need.– Picture state sequence should not be {0 1 0 1

0 1 0 1…}

• Still have Markov assumption problem.

t-1 t t+1 t+2

Picture = {0,1}

Sentence = {0,1}

Decide = {0,1}

fMRI

Page 25: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

25

Hidden Process Models

Process ID = 3

Process ID = 2Process Instances:

Observed fMRI: cortical region 1:

cortical region 2:

Processes:

Name: Read sentenceProcess ID: 1Response:

Name: View PictureProcess ID: 2Response:

Name: Decide whetherconsistentProcess ID: 3Response:

Process ID = 1 Process ID = 1

Page 26: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

26

HPM Parameters• Set of processes, each of which has:

– a process ID– a maximum response duration R– emission weights for each voxel v [W(v,1),

…,W(v,t),…,W(v,R)] – a multinomial distribution over possible start

times within a trial [1,…,t,…,T]

• Set of standard deviations – one for each voxel 1,…,v,...,V]

Page 27: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

27

Interpreting data with HPMs

• Data Interpretation (int)– Set of process instances, each of which has:

• a process ID• a start time S

• To predict fMRI data using an HPM and int:– For each active process, add the response

associated with its processID to the prediction.

Page 28: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

28

Synthetic Data ExampleProcess 1: Process 2: Process 3:

Process responses:

Process instances:

Predicted data

ProcessID=1, S=1

ProcessID=2, S=17

ProcessID=3, S=21

Page 29: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

29

Our Assumptions

• Processes, not states.– One hidden variable – process start time.

• Known number of processes in the model.– e.g. Picture, Sentence, Decide – 3 processes

• Known number of instantiations of those processes.– e.g. numTrials*3 processes

• Each process has a unique signature.• Contributions of overlapping processes to the

same output variable sum linearly.

Page 30: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

30

The generative model

• Together HPM and interpretation (int) define a probability distribution over sequences of fMRI images:

where

P(yv,t|hpm,int) = N(v,t,v)

v,t = Wi.procID(v,t – start(i))i active process instances

Page 31: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

31

Inference

• Given:– An HPM– A set of data interpretations (int) of

processIDs and start times– Priors over the interpretations

• P(int=i|Y) P(Y|int=i)P(int=i)

Choose the interpretation i with the highest probability.

Page 32: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

32

Synthetic Data ExampleInterpretation 1:

Observed data

ProcessID=1, S=1

ProcessID=2, S=17

ProcessID=3, S=21

Interpretation 2:

ProcessID=2, S=1

ProcessID=1, S=17

ProcessID=3, S=23

Prediction 1

Prediction 2

Page 33: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

33

Learning the Model

• EM (Expectation-Maximization) algorithm

• E-step– Estimate a conditional distribution over the

start times of the process instances given the observed data, P(S|fMRI).

• M-step– Use the distribution from the E step to get

maximum-likelihood estimates of the HPM parameters {, W, }.

Page 34: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

34

More on the E-step

• The start times of the process instances are not necessarily conditionally independent given the data.– Must consider joint configurations.– With no constraints, TnInstances configurations. – 2000120 configurations for typical experiment.

• Can we consider a smaller set of start time configurations?

Page 35: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

35

Reducing complexity

• Prior knowledge– Landmarks

• Events with known timing that “trigger” processes. • One per process instance.

– Offsets• The interval of possible delays from a landmark to

a process instance onset. • One vector of n offsets per process.

• Conditional independencies– Introduced when no process instance could

be active.

Page 36: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

36

Before Prior Knowledge

Decide whether consistent

Read sentence

View pictureCognitive processes:

Observed fMRI:

cortical region 1:

cortical region 2:

Page 37: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

37

Decide whether consistent

Read sentence

View pictureCognitive processes:

Observed fMRI:

cortical region 1:

cortical region 2:

Landmarks:(Stimuli)

Sentence Presentation

PicturePresentation

Sentence offsets = {0,1} Picture

offsets = {0,1}Decide offsets = {0,1,2,3}

Landmarks go to process instances.

Offset values aredetermined byprocess IDs.

Prior Knowledge

Page 38: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

38

Conditional Independencies

Decide whether consistent

Read sentence

View picture

Observed fMRI:

cortical region 1:

cortical region 2:

Landmarks:(Stimuli)

Sentence Presentation

PicturePresentation

Sentence offsets = {0,1} Picture

offsets = {0,1}Decide offsets = {0,1,2,3}

Read sentence

View picture

Sentence Presentation

Sentence offsets = {0,1}

HERE

Page 39: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

39

More on the M-step

• Weighted least squares procedure – exact, but may become intractable for large

problems– weights are the probabilities computed in the

E-step

• Gradient ascent procedure – approximate, but may be necessary when

exact method is intractable– derivatives of the expected log likelihood of

the data with respect to the parameters

Page 40: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

40

Talk Outline

• fMRI (functional Magnetic Resonance Imaging) data

• Prior work on analyzing fMRI data

• HPMs (Hidden Process Models)

• Preliminary results

• HPMs and BodyMedia

Page 41: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

41

View PictureOr

Read Sentence

Read SentenceOr

View PictureFixation

Press Button

4 sec. 8 sec.t=0

Rest

picture or sentence? picture or sentence?

16 sec.

GNB:

picture or sentence?

picture or sentence?

HPM:

Preliminary Results

Page 42: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

42

GNB vs. HPM Classification

• GNB: non-overlapping processes• HPM: simultaneous classification of

multiple overlapping processes

• Average improvement of 15% in classification error using HPM vs GNB

• E.g., for one subject– GNB classification error: 0.14– HPM classification error: 0.09

Page 43: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

43trial 25

Learned models

Comprehend sentence

Comprehend picture

Page 44: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

44

Model selection experiments

• Model with 2 or 3 cognitive processes?– How would we know ground truth?– Cross validated data likelihood P(testData |

HPM)• Better with 3 processes than 2

– Cross validated classification accuracy• Better with 3 processes than 2

Page 45: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

45

Current work and challenges

• Add temporal and/or spatial smoothness constraints.

• Feature selection for HPMs.• Process libraries, hierarchies.• Process parameters (e.g. sentence

negated or not).• Model process interactions.• Scaling parameters for response

amplitudes to model habituation effects.

Page 46: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

46

Talk Outline

• fMRI (functional Magnetic Resonance Imaging) data

• Prior work on analyzing fMRI data

• HPMs (Hidden Process Models)

• Preliminary results

• HPMs and BodyMedia

Page 47: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

47

One idea…

Sensor 1:

Sensor 2:

Processes:

Name: Riding busProcess ID: 1Response:

Name: Eating Process ID: 2Response:

Name: WalkingconsistentProcess ID: 3Response:

Process instances:

Observed data:

ProcessID=3

ProcessID=2

ProcessID=1

Page 48: 1 Hidden Process Models Rebecca Hutchinson Joint work with Tom Mitchell and Indra Rustandi

48

Some questions

• What processes are interesting?

• What granularity/duration would processes have?

• What would landmarks be?

• Variable process durations needed?

• Better way to parameterize process signatures?