Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Post on 15-Jul-2015

TRANSCRIPT

Page 1: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

You thought what?! The promise of real-time brain decoding

Ted Willke

Intel Labs

Page 2: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Alvarez & Oliva, 2006

BUILDINGS PEOPLE

Page 3: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

What is attention?

“Every one knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought... It implies withdrawal from some things in order to deal effectively with others...”

– William James (1890)

A simple but important distinction:
• Overt attention: moving your eyes
• Covert attention: moving your mind’s eye

Courtesy of Nick Turk-Browne, Princeton

Page 4: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

The great controller

[Diagram: Attention as the controller of Perception, Memory, and Learning]

Courtesy of Nick Turk-Browne, Princeton

Page 5: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

The brain: The black box at the end of our necks

• Facts:
  - Only 2% of body weight but uses up to 20% of energy
  - ~200B neurons
  - Neurons fire up to ~10 kHz
  - 1K to 10K connections per neuron

• The cerebral neocortex (the “mammalian brain” associated with higher reasoning):
  - ~20B neurons
  - ~125 trillion synapses

There are more ways to organize the neocortex’s ~125 trillion synapses than stars in the known universe.

Page 6: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

stimulus (task)

mind brain dataset?

what is present in the mind as the task is performed?

Adapted from Francisco Pereira, Botvinick Lab, Princeton

computational model?

what is attended to in the mind as the task is performed?

Page 7: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Non-invasive neuroimaging

Electrical phenomena Metabolic phenomena

Positron Emission Tomography

Functional Magnetic Resonance Imaging (fMRI)

Magneto-Encephalography (MEG)

Consumer EEG (<sensors)

Near-Infrared Spectroscopy (fNIRS)

Lab/Medical EEG (>sensors)

[Figure axis: Better spatial resolution]

Varying portability, temporal & spatial resolution. fMRI is the workhorse of brain research despite disadvantages of non-portability & expense

Page 8: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Real-Time Functional MRI (rtfMRI)

metabolic brain

anatomical brain

Adapted from graphic by Jeremy Manning, Princeton

Page 9: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

stimulus (task)

mind brain rtfMRI

classifier

conclusions from structure of the learnt model

conclusions from feature choice

weights on features hidden layers

voxel location voxel behavior time within trial

dependent on prediction model

dependent on experiment

Adapted from Francisco Pereira, Botvinick Lab, Princeton

Page 10: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Studying attention | dueling categories

[Plot: % BOLD change (vertical axis) over Time (horizontal axis)]

Face attention

Scene (place) attention

Fusiform face area (FFA)

Parahippocampal place area (PPA)

e.g., O’Craven et al., 1999, Nature

Page 11: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Studying attention | coupling hypothesis

[Diagram: correlation (r) between V4 in occipital cortex and FFA / PPA in ventral temporal cortex]

Al-Aidroos et al., 2012, Proc Natl Acad Sci

Page 12: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Studying attention | coupling hypothesis

Al-Aidroos et al., 2012, Proc Natl Acad Sci

Face attention Scene attention

N = 7, *p < .05, **p < .01

Page 13: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Standard types of fMRI analysis. (A) Univariate activation refers to the average amplitude of BOLD activity evoked by events of an experimental condition.

N. B. Turk-Browne, Science 2013;342:580-584

*BOLD: blood oxygenation level–dependent contrast imaging

Page 14: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Standard types of fMRI analysis. (A) Univariate activation refers to the average amplitude of BOLD activity evoked by events of an experimental condition.

N. B. Turk-Browne, Science 2013;342:580-584

*MVPA: Multivariate Pattern Analysis
*FCMA: Full Correlation Matrix Analysis

Advanced Analysis: MVPA, FCMA

Basic (i.e. common) Analysis

Page 15: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Offline fMRI image analysis experiment

data acquisition preprocessing

classifier testing analyze results

Processing time …

6 to 55 hours

voxel analysis

Courtesy of Nick Turk-Browne, Princeton

Page 16: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

real-time brain decoding for causal experimentation

Page 17: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Studying attention | real-time neurofeedback

Attend to scene

MORE scene evidence

LESS scene evidence

Rewarded with stronger stimulus and easier task

Punished with degraded stimulus and harder task

Starting stimulus

Courtesy of Nick Turk-Browne, Princeton

Page 18: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

data acquisition real-time preprocessing

classifier testing update stimulus display

Processing time …

6 to 55 hours

real-time voxel analysis

Closed-loop rtfMRI neurofeedback system

Page 19: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Studying attention | training and scoring

Neurofeedback

Use multivariate pattern analysis (MVPA) over whole-brain activity to decode attention to faces vs. scenes

Mean cross-validation accuracy = 78% ***

Norman et al. (2006), LaConte (2011)

Regularized logistic regression (penalty = 1), *** p < 0.001

Page 20: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Subject

Scanner

Page 21: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Scoring sequence – your brain on scenes?

Page 22: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

This was done with MVPA. We’d also like to try FCMA to include connectivity information, but...

Page 23: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

A Big Data/HPC challenge

Some facts:

• To keep up with the rtfMRI scanner, we must process a full brain scan and provide feedback in <1 sec (for a 2 sec TR)
• Raw image data for 1 subject: ~480 Gbytes
• Some studies train on 100’s of subjects
• Running correlations across all subjects involves a lot of data movement

Processing is expensive:

• N ~ 100K voxels per time slice
• O(N^2) for basic preprocessing (minutes today)
• O(N^3) to process the full correlation matrix (hours today)

[Figure: raw fMRI data and patterns of correlated voxels. Image sources: Princeton Neuroscience Institute and Wikipedia]

“Train classifier on 100’s of subjects during coffee break, classify a subject’s patterns in <1sec.”
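For a sense of scale, the quadratic growth can be worked out directly. The C sketch below (function names are hypothetical, and 4-byte single-precision storage is an assumption) counts distinct voxel pairs and the bytes needed for one dense correlation matrix. For N = 100,000 it gives about 5 billion pairs and 40 GB per matrix.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct voxel pairs among n voxels: n * (n - 1) / 2. */
uint64_t voxel_pairs(uint64_t n)
{
    assert(n < (UINT64_C(1) << 32)); /* keeps n * (n - 1) from overflowing */
    return n * (n - 1) / 2;
}

/* Bytes needed for a dense n-by-n correlation matrix stored as
 * 4-byte single-precision values. */
uint64_t full_corr_matrix_bytes(uint64_t n)
{
    assert(n < (UINT64_C(1) << 31)); /* keeps 4 * n * n from overflowing */
    return n * n * UINT64_C(4);
}
```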

Page 24: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Machine Learning Workload Convergence

Usages: Education, Health, Banking, Manufacturing, Transportation

Workloads: Machine Learning

Building Blocks:
• Algorithms (High-level Libraries)
• Primitives (Low-level Libraries)
• Hardware Platforms: Xeon, Xeon Phi, Xeon FPGA, Xeon Gfx, Add-in card, New ISA

Intel can help accelerate a wide range of machine learning through a focus on key building blocks.

Page 25: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Intel® Math Kernel Library (Intel® MKL)

Random Number Gen.

• Congruential

• Wichmann-Hill

• Mersenne Twister

• Sobol

• Niederreiter

• Non-deterministic

Summary Statistics

• Kurtosis

• Variation coefficient

• Quantiles

• Order statistics

• Min/max

• Variance-covariance

Data Fitting

• Spline-based

• Interpolation

• Cell search

Linear Algebra

• BLAS, Sparse BLAS

• LAPACK solvers

• Sparse Solvers (DSS, PARDISO)

• Iterative solver (RCI)

• ScaLAPACK, PBLAS

Fast Fourier Transforms

• Multidimensional

• FFTW interfaces

• Cluster FFT

• Trig. Transforms

• Poisson solver

• Convolution via VSL

Vector Math

• Trigonometric

• Hyperbolic

• Exponential, Logarithmic

• Power / Root

Page 26: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Unveiling Details of Knights Landing (Next Generation Intel® Xeon Phi™ Products)

2nd half ’15 1st commercial systems

3+ TFLOPS In One Package Parallel Performance & Density

On-Package Memory:

up to 16GB at launch

5X Bandwidth vs DDR4

Compute: Energy-efficient IA cores

Microarchitecture enhanced for HPC

3X Single Thread Performance vs Knights Corner

Intel Xeon Processor Binary Compatible

1/3X the Space

5X Power Efficiency

Integrated Fabric

Intel® Silvermont Arch. Enhanced for HPC

Processor Package

Conceptual—Not Actual Package Layout

Platform Memory: DDR4 Bandwidth and Capacity Comparable to Intel® Xeon® Processors

Jointly Developed with Micron Technology

Page 27: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

FCMA Correlation Computation

[Figure: two scan data matrices combine into a voxels × voxels matrix of Correlations]

Need Pearson’s correlation coefficient for each pair of voxels

34,470 voxels => over 500 million pairs

Functionality provided by Intel’s libraries

If scan data is normalized (mean-centered and unit norm) then Pearson correlation becomes matrix multiplication

Can use single-precision general matrix multiplication (SGEMM) built into Intel Math Kernel Library (MKL)

Current work is to improve SGEMM performance when computing with small numbers of scans (e.g. 12)

Thanks to Mike Anderson, Intel Labs

Page 28: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

FCMA Z-Score Computation

[Figure: voxels × voxels correlation matrices]

Need to complete Z-score procedure across all correlation matrices produced by a single subject

Fisher transformation of each correlation coefficient x => 0.5 * ln((1 + x) / (1 - x))

Then, at each location in the correlation matrix, subtract the mean and divide by the standard deviation across all correlation matrices
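Written out, the procedure for one matrix location looks like the following. The symbols are introduced here for illustration: r_b is the correlation at that location in the b-th of a subject's B matrices, and the population form of the variance is an assumption (it matches the code shown later in the deck).

```latex
\begin{align}
  z_b &= \tfrac{1}{2}\ln\frac{1 + r_b}{1 - r_b} = \operatorname{artanh}(r_b),
      & b &= 1,\dots,B \\
  \mu &= \frac{1}{B}\sum_{b=1}^{B} z_b,
      & \sigma^2 &= \frac{1}{B}\sum_{b=1}^{B} z_b^2 - \mu^2 \\
  \hat{z}_b &= \frac{z_b - \mu}{\sigma}
\end{align}
```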

Acceleration using Single Instruction Multiple Data (SIMD) instructions

Correlation coefficients are grouped into contiguous vectors and processed using SIMD instructions to exploit data parallelism

Loop annotated with #pragma simd

Natural logarithm can also be vectorised using Intel Short Vector Math Library (SVML) to accelerate Fisher transformation

Thanks to Mike Anderson, Intel Labs

Page 29: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Putting it all together: FCMA Z-score example

#pragma omp parallel for
for (int v = 0; v < step*nSubs; v++) {
    int s = v % nSubs;  // subject id
    int i = v / nSubs;  // voxel id
    // View this voxel's block of correlation vectors as [trial][row]
    float (*mat)[row] = (float(*)[row])&(voxels->corr_vecs[i*nTrials*row]);
    #pragma simd
    for (int j = 0; j < row; j++) {
        float mean = 0.0f;
        float std_dev = 0.0f;
        // Pass 1: Fisher-transform in place; accumulate sum and sum of squares
        for (int b = s*nPerSub; b < (s+1)*nPerSub; b++) {
            _mm_prefetch((char*)&(mat[b][j+32]), _MM_HINT_ET1);
            _mm_prefetch((char*)&(mat[b][j+16]), _MM_HINT_T0);
            float num = 1.0f + mat[b][j];
            float den = 1.0f - mat[b][j];
            // Clamp so that |r| = 1 cannot produce log(0) or a division by zero
            num = (num <= 0.0f) ? 1e-4f : num;
            den = (den <= 0.0f) ? 1e-4f : den;
            mat[b][j] = 0.5f * logf(num/den);
            mean += mat[b][j];
            std_dev += mat[b][j] * mat[b][j];
        }
        mean = mean / (float)nPerSub;
        std_dev = std_dev / (float)nPerSub - mean*mean;  // variance at this point
        float inv_std_dev = (std_dev <= 0.0f) ? 0.0f : 1.0f / sqrtf(std_dev);
        // Pass 2: z-score across this subject's correlation matrices
        for (int b = s*nPerSub; b < (s+1)*nPerSub; b++) {
            mat[b][j] = (mat[b][j] - mean) * inv_std_dev;
        }
    }
}
}  // closes the enclosing function (header not shown on the slide)

Several MPI processes running the above code

OpenMP divides independent voxels (dim1) and subjects across 60 Xeon Phi Cores

#pragma simd directive assigns consecutive voxels (dim2) to vector lanes

Thanks to Mike Anderson, Intel Labs

Page 30: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

FCMA SVM

[Figure: correlation with voxel v_i, across subjects and trials]

Key is to find the most predictive voxels in the correlation matrix
• Rows of the correlation matrix are the feature vectors

A very large number of SVMs are trained
• One for each voxel - O(35000)
• Each trained SVM is cross validated and the top few voxels are chosen for predictive analyses

Acceleration using custom SVM code
• Kernel matrix precomputed as #data points << #dimensions
• Ported parallel GPUSVM code to run on Xeon and Xeon Phi platforms
• Uses thread level and SIMD parallelism
• Faster than libSVM
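Precomputing the kernel pays off when there are far fewer samples than feature dimensions, as when each feature vector is a roughly 35,000-entry row of the correlation matrix and the samples are a few hundred subject-trial pairs. The sketch below assumes a linear kernel and a hypothetical function name; it is not the custom SVM code mentioned on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Fill the n-by-n linear kernel (Gram) matrix k, k[i*n + j] = x_i . x_j,
 * for n samples of dimension d stored row-major in x. Only the upper
 * triangle is computed; the lower triangle is mirrored from it. */
void linear_kernel(const float *x, size_t n, size_t d, float *k)
{
    assert(n > 0 && d > 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i; j < n; j++) {
            float s = 0.0f;
            for (size_t c = 0; c < d; c++) s += x[i * d + c] * x[j * d + c];
            k[i * n + j] = s;
            k[j * n + i] = s;
        }
    }
}
```

Once the small n-by-n matrix exists, SVM training only looks up kernel entries and never touches the long feature vectors again.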

Thanks to Narayanan Sundaram, Intel Labs

Page 31: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

FCMA – Effect of Optimizations

[Bar chart: runtime in seconds (for 17 subjects), axis 0 to 7, for the Correlation, Z-score, SVM, and Total stages on Xeon and Xeon Phi, before and after optimizations]

• 1.7X speedup on Xeon
• 5.8X speedup on Xeon Phi
• Xeon Phi 2.1X faster than Xeon

Thanks to Yida Wang, Princeton, and Narayanan Sundaram

Page 32: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Model-based approaches

Page 33: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

stimulus (task)

mind brain rtfMRI

classifier

conclusions from structure of the learnt model

conclusions from feature choice

weights on features hidden layers

voxel location voxel behavior time within trial

dependent on prediction model

dependent on experiment

Adapted from Francisco Pereira, Botvinick Lab, Princeton

Page 34: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

stimulus (task)

mind brain rtfMRI

classifier

Adapted from Francisco Pereira, Botvinick Lab, Princeton

Page 35: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

stimulus (task)

mind brain rtfMRI

model

Adapted from Francisco Pereira, Botvinick Lab, Princeton

predicted stimulus or task

Page 36: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

stimulus (task)

mind brain rtfMRI

model

Adapted from Francisco Pereira, Botvinick Lab, Princeton

predicted rtfMRI data

Page 37: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Modeling | Topographic Factor Analysis

Manning JR, Ranganath R, Norman KA, Blei DM (2014) Topographic Factor Analysis: A Bayesian Model for Inferring Brain Networks from Neural Data. PLoS ONE 9(5): e94914. doi:10.1371/journal.pone.0094914

Page 38: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Modeling | Topographic Factor Analysis

Manning JR, Ranganath R, Norman KA, Blei DM (2014) Topographic Factor Analysis: A Bayesian Model for Inferring Brain Networks from Neural Data. PLoS ONE 9(5): e94914. doi:10.1371/journal.pone.0094914

Page 39: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Modeling | Topographic Factor Analysis

Manning JR, Ranganath R, Norman KA, Blei DM (2014) Topographic Factor Analysis: A Bayesian Model for Inferring Brain Networks from Neural Data. PLoS ONE 9(5): e94914. doi:10.1371/journal.pone.0094914

N trials; V voxels; voxel activations y; K shared sources (µ, λ); weights w

Page 40: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Modeling | Topographic Factor Analysis

Manning JR, Ranganath R, Norman KA, Blei DM (2014) Topographic Factor Analysis: A Bayesian Model for Inferring Brain Networks from Neural Data. PLoS ONE 9(5): e94914. doi:10.1371/journal.pone.0094914

• number of sources?
• specification of sources?
• hyperparameter values?
• initialization of sources?

Page 41: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Modeling | Topographic Factor Analysis

Manning JR, Ranganath R, Norman KA, Blei DM (2014) Topographic Factor Analysis: A Bayesian Model for Inferring Brain Networks from Neural Data. PLoS ONE 9(5): e94914. doi:10.1371/journal.pone.0094914

“mental state” m_n during the nth trial gives rise to behavioral data b_n and neural data y_n

Page 42: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Decoding your thoughts...

... is a work in progress....

• more basic neuroscience research
• more machine learning speed and accuracy
• a look at other model-based methods

Page 43: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Conclusions

Closed-loop rtfMRI amplifies and externalizes internal states that are difficult to access

Holds promise for people that suffer from mental disorders or simply want to improve brain performance

Intel is helping put the rt into rtfMRI and unlock the potential of this research

Page 44: Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC

Thanks Princeton Neuroscience Institute!

Jon Cohen — PNI Co-Founder, Professor of Neuroscience and Psychology

Matt Botvinick — Professor of Neuroscience and Psychology

Ken Norman — Professor of Neuroscience and Psychology

Nick Turk-Browne — Professor of Neuroscience and Psychology

Kai Li — Professor of Computer Science and Co-Founder of Data Domain Corporation
