Empirical models of spiking in neural populations
Jakob H. Macke, Lars Büsing,
John P. Cunningham, Byron M. Yu,
Krishna V. Shenoy and Maneesh Sahani
NIPS 2011, December 15, 2011
Characterizing neural function
- Single neuron recordings
- Simultaneous recordings from populations of neurons
[Figure: population recording; x-axis: Time, y-axis: Neuron index]
→ need for appropriate statistical models
Models of population activity: direct couplings or latent variables?
model class I: generalized linear models (GLM)
- [Diagram: external input, direct couplings, observed neurons]
- e.g. [Chornoboy et al., Biol Cybern, 1988], [Paninski et al., Neuro Comp, 2004]
model class II: latent dynamical system (DS)
- [Diagram: external input, dynamical system, observed neurons]
- e.g. [Gat et al., Network, 1997], [Yu et al., NIPS, 2006]
Statistical models offer insights into neural computation
model class I
- GLMs improve decoding of visual stimuli from retina recordings [Pillow et al., Nature, 2008]
model class II
- DS facilitate single-trial analysis of cortical multi-electrode recordings [Churchland et al., Nature Neurosci., 2010]
- Brain-computer interfaces [Wu et al., IEEE Trans. Neural Systems, 2000]
→ What are good models for cortical multi-electrode recordings?
Data: Multi-electrode recordings from primate cortex
[Figure, top: example raster (single trial); x-axis: time, from −250 to 1250, with "Target on" and "Go cue" marked; y-axis: neuron index. Bottom: population PSTH, rate (Hz), with the data used in analyses marked]
- Data from Shenoy's lab
- Array with 96 electrodes in motor / premotor areas
- 92 units (= dimensions), 120 s of recordings, 10 ms binning
Model class I: Generalized linear models (GLMs)
- Instantaneous firing rate $\lambda_i(t)$ of neuron $i$:
\[
\lambda_i(t) = \exp\Big(\big[\underbrace{b(t)}_{\text{det. innovations}} + \underbrace{D}_{\text{couplings}} \cdot \underbrace{s(t)}_{\text{spike history}}\big]_i\Big)
\]
- Poisson observations:
\[
y_i(t) \mid s(t) \sim \mathrm{Poisson}(\lambda_i(t))
\]
- We used up to five different decaying exponentials as basis functions for the spike history (time constants 0.1-80 ms)
- We used L1-regularization of the coupling matrix $D$ to avoid over-fitting
e.g. [Chornoboy et al., Biol Cybern, 1988], [Paninski et al., Neuro Comp, 2004]
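As a rough illustration of the GLM rate equation on this slide, the sketch below builds spike-history features from causal decaying exponentials and evaluates exp(b + D s). It is a minimal NumPy sketch, not the authors' code: the helper names `history_features` and `glm_rates`, the constant drive `b`, the random coupling matrix `D`, the toy spike counts, and the three time constants are all assumptions made for illustration. Only the 10 ms bin width and the 0.1-80 ms range of time constants come from the slides.

```python
import numpy as np

def history_features(spikes, taus_ms, dt_ms=10.0):
    """Filter each neuron's spike train with causal decaying exponentials.

    spikes: (T, N) array of spike counts per bin.
    Returns s with shape (T, N * K); s[t] depends only on spikes before bin t.
    """
    T, N = spikes.shape
    decay = np.exp(-dt_ms / np.asarray(taus_ms, dtype=float))  # (K,) per-bin decay
    K = decay.size
    s = np.zeros((T, N, K))
    for t in range(1, T):
        # Recursive filter: add the previous bin's spikes to the trace, then decay it.
        s[t] = decay * (s[t - 1] + spikes[t - 1][:, None])
    return s.reshape(T, N * K)

def glm_rates(b, D, s):
    """lambda_i(t) = exp([b(t) + D s(t)]_i); b is (N,) or (T, N), D is (N, N*K)."""
    return np.exp(b + s @ D.T)

rng = np.random.default_rng(0)
T, N = 200, 4
taus_ms = [10.0, 40.0, 80.0]             # hypothetical choice within the 0.1-80 ms range
spikes = rng.poisson(0.1, size=(T, N))   # toy data, not the recordings
s = history_features(spikes, taus_ms)
D = 0.05 * rng.standard_normal((N, N * len(taus_ms)))  # hypothetical couplings
b = np.full(N, np.log(0.1))              # constant drive, assumed for illustration
lam = glm_rates(b, D, s)
y = rng.poisson(lam)                     # conditional Poisson draw (not fed back into the history)
```

An L1 penalty on `D`, as mentioned on the slide, would be added to the negative Poisson log-likelihood during fitting; fitting is not shown here.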
Model class II: Latent dynamical system (DS) models
- Latent variable $x(t)$ evolves linearly, with Gaussian innovations:
\[
x(t+1) = \underbrace{A\,x(t)}_{\text{dynamics}} + \underbrace{b(t) + \eta(t)}_{\text{innovations}}
\]
- Gaussian linear dynamical system (GLDS):
\[
y(t) \mid x(t) \sim \mathcal{N}(C x(t) + d,\, R)
\]
- Poisson linear dynamical system (PLDS):
\[
y_i(t) \mid x(t) \sim \mathrm{Poisson}\big(\exp(C_{i,:}\, x(t) + d_i + D_i\, s_i(t))\big)
\]
- Parameters were learned by the EM algorithm
- We used a global Laplace approximation to the posterior for the PLDS
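To make the generative side of the PLDS concrete, here is a minimal NumPy sketch that samples from the latent dynamics and the Poisson observation model written on this slide. It is an illustration under stated assumptions, not the authors' implementation: the function name `simulate_plds`, the zero initial state, the innovation covariance `Q`, and all parameter values are hypothetical, and the single-neuron history term $D_i s_i(t)$ is omitted for brevity. Inference (EM with a Laplace approximation) is not shown.

```python
import numpy as np

def simulate_plds(A, C, d, Q, T, rng, b=None):
    """Sample x(t+1) = A x(t) + b(t) + eta(t) and y_i(t) ~ Poisson(exp(C[i] x(t) + d[i])).

    A: (k, k) dynamics, C: (N, k) loadings, d: (N,) offsets, Q: (k, k) innovation covariance.
    The history term D_i s_i(t) of the full model is left out (assumption of this sketch).
    """
    n_latent, n_neurons = A.shape[0], C.shape[0]
    b = np.zeros((T, n_latent)) if b is None else b
    x = np.zeros((T, n_latent))                  # x[0] = 0 is an assumed initial state
    y = np.zeros((T, n_neurons), dtype=int)
    for t in range(T):
        if t > 0:
            eta = rng.multivariate_normal(np.zeros(n_latent), Q)  # Gaussian innovations
            x[t] = A @ x[t - 1] + b[t - 1] + eta
        y[t] = rng.poisson(np.exp(C @ x[t] + d))  # Poisson counts given the latent state
    return x, y

rng = np.random.default_rng(1)
n_latent, n_neurons, T = 5, 92, 500              # 5 latent dimensions, 92 units as on the slides
A = 0.95 * np.eye(n_latent)                      # hypothetical stable dynamics
Q = 0.01 * np.eye(n_latent)
C = 0.3 * rng.standard_normal((n_neurons, n_latent))
d = np.full(n_neurons, np.log(0.1))
x, y = simulate_plds(A, C, d, Q, T, rng)
```

Replacing the Poisson draw with a Gaussian draw around `C @ x[t] + d` with covariance `R` would give the GLDS variant.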
Measuring performance: predicting neural firing from the population
- On test trials, for every neuron $i$:
  - Compute the cross-prediction $E[y_{i,:} \mid y_{\setminus i,:}]$ of neuron $y_{i,:}$ given all other neurons $y_{\setminus i,:}$
  - Compute the MSE between prediction and true trajectory
- We report improvement over a constant predictor (i.e. higher = better)
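The result plots in this talk label the performance axis "Var−MSE". The following sketch shows one way such a score could be computed once cross-predictions are available. It assumes the constant predictor is each neuron's mean on the test data, so that its MSE equals the variance; the function name `var_minus_mse` is hypothetical, and the model-based cross-prediction itself is taken as given input.

```python
import numpy as np

def var_minus_mse(y_true, y_pred):
    """Per-neuron improvement over a constant (mean) predictor.

    y_true, y_pred: (T, N) arrays. Returns an (N,) array; higher is better,
    and 0 means no better than predicting the mean of y_true.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    var = y_true.var(axis=0)                      # MSE of the constant predictor
    mse = ((y_true - y_pred) ** 2).mean(axis=0)   # MSE of the cross-prediction
    return var - mse

# Toy check: a perfect prediction recovers the variance, the mean predictor scores 0.
y_true = np.array([[0.0, 1.0], [2.0, 1.0], [4.0, 4.0]])
perfect = var_minus_mse(y_true, y_true)
constant = var_minus_mse(y_true, np.tile(y_true.mean(axis=0), (3, 1)))
```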
Cross-prediction: Dynamical systems outperform GLMs
[Figure: Var−MSE (×10^−3) against percentage of zero-entries in coupling matrix (0 to 100); curves: GLM 10ms, GLM 100ms, GLM2 100ms, GLM 150ms, PLDS dim=5]
- GLM performance as a function of the L1-regularization parameter, which controls the zero-entries in the coupling matrix
- PLDS with 5-dimensional state space
- For our recordings, PLDS outperforms GLMs with different history filters
Cross-prediction: Poisson observations improve performance
[Figure: Var−MSE (×10^−3) against dimensionality of latent space (2 to 20); curves: GLDS, PLDS, PLDS 100ms]
- Poisson observation model improves performance over Gaussian models
- Single neuron history filters result in a slight increase in performance
Matching statistics of the data I: temporal cross-correlations
[Figure, two panels: correlation against time lag (ms), −100 to 100. Left, latent dynamical systems (DS): GLDS, PLDS, PLDS 100ms, recorded data. Right, generalized linear models (GLM): GLM 10ms, GLM 100ms, GLM 150ms, recorded data]
- Data: peaks at zero time lag and broad flanks
- DS: matches cross-correlations well
- GLM: has difficulty reproducing the peak, due to the lack of instantaneous couplings
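A statistic of the kind compared on this slide can be estimated from binned counts as follows. The slides do not say exactly how the cross-correlations were computed, so this is a sketch under assumptions: each neuron is z-scored, and correlations are averaged over all ordered pairs of distinct neurons. The function name `mean_cross_correlation` and the toy shared-input data are hypothetical.

```python
import numpy as np

def mean_cross_correlation(y, max_lag):
    """Average pairwise correlation between distinct neurons at lags -max_lag..max_lag (in bins)."""
    y = np.asarray(y, dtype=float)
    T, N = y.shape
    z = (y - y.mean(axis=0)) / y.std(axis=0)   # z-score each neuron (assumes nonzero variance)
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.zeros(lags.size)
    off_diag = ~np.eye(N, dtype=bool)
    for k, lag in enumerate(lags):
        a = z[: T - abs(lag)]
        b = z[abs(lag):]
        # corr[i, j] estimates the correlation of z_i(t) with z_j(t + |lag|).
        # Averaging over all ordered pairs i != j gives the same value for +lag and -lag.
        corr = a.T @ b / (T - abs(lag))
        cc[k] = corr[off_diag].mean()
    return lags, cc

rng = np.random.default_rng(2)
T, N = 5000, 5
common = rng.standard_normal(T)                     # shared input, toy stand-in for a latent state
y = common[:, None] + rng.standard_normal((T, N))   # toy data, not the recordings
lags, cc = mean_cross_correlation(y, max_lag=10)
```

In this toy example the shared input is white in time, so the curve peaks at zero lag and is near zero elsewhere; temporally correlated shared input would add flanks around the peak.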
Matching statistics of the data II: population spike counts
[Figure: frequency against number of spikes in 10ms bin (0 to 40); curves: GLM, GLM2, GLDS, PLDS, real data]
- Gaussian DS: under-estimates large population events
- Poisson DS: good match to the data
- GLM: no regularization; GLM2: optimal regularization
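The population spike-count statistic on this slide is the distribution of the total number of spikes across all neurons in a bin. A minimal sketch of how it could be tabulated from binned counts, for recorded data and for samples drawn from a fitted model alike, is given below; the function name `population_count_distribution` is hypothetical.

```python
import numpy as np

def population_count_distribution(y):
    """Empirical frequency of the total spike count across neurons per bin.

    y: (T, N) array of non-negative integer counts. Returns freq where
    freq[k] is the fraction of bins with exactly k spikes in total.
    """
    totals = np.asarray(y).sum(axis=1).astype(int)
    return np.bincount(totals) / totals.size

# Toy example with 4 bins and 2 neurons: totals per bin are 1, 0, 3, 1.
toy = np.array([[1, 0], [0, 0], [2, 1], [1, 0]])
freq = population_count_distribution(toy)
```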
How much variance of the data can we explain using cross-prediction?
[Figure, two panels, both against bin size (ms), 0 to 400. Left, % of explained variance: percentage of variance explained for PLDS with d=1, 2, 5, 10, 20, and a Poisson curve. Right, comparison with PSTH predictor: improvement over PSTH predictor for PLDS with d=1, 2, 5, 10, 20]
- For small bins, more variance can be explained than would be possible if firing conditioned on the latent state were Poisson
- PLDS models can explain up to twice as much variance as a model with independent activity conditioned on the PSTH
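One way to read the Poisson ceiling mentioned in the first bullet, offered as a sketch rather than as the authors' derivation: suppose a neuron's count $y$ in a bin is Poisson with rate $\lambda$ given the latent state, where $\lambda$ is a hypothetical per-bin conditional rate. The law of total variance then splits the count variance into a rate part and a Poisson-noise part, so even a predictor that knew $\lambda$ exactly could explain only a bounded fraction of the variance. How the Poisson curve in the figure was actually computed is not stated on the slides.

```latex
\[
\operatorname{Var}[y]
  = \operatorname{Var}\big[\mathbb{E}[y \mid \lambda]\big] + \mathbb{E}\big[\operatorname{Var}[y \mid \lambda]\big]
  = \operatorname{Var}[\lambda] + \mathbb{E}[\lambda],
\qquad
\text{explained fraction} \;\le\; \frac{\operatorname{Var}[\lambda]}{\operatorname{Var}[\lambda] + \mathbb{E}[\lambda]} .
\]
```

Under this reading, explaining more variance than the bound allows would suggest that the variability conditioned on the latent state is lower than Poisson.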
Summary & conclusions
- Dynamical systems match the investigated cortical recordings better than GLMs
  - Cross-prediction
  - Various statistics, e.g. cross-correlation structure
- GLM modelling assumptions (direct couplings) might not be appropriate for this type of data:
  - Multi-electrode arrays sample a neural population very sparsely
  - Most inputs presumably come from outside the sampled population
- Latent dynamical systems are arguably the more natural models for multi-electrode recordings from cortex
Acknowledgements
- Jakob Macke (Gatsby Unit, UCL)
- Maneesh Sahani (Gatsby Unit, UCL)
- Krishna Shenoy & lab (Stanford)
- John Cunningham (Cambridge)
- Byron Yu (Carnegie Mellon)
Funding:
- Gatsby Charitable Foundation
- EU-FP7 Marie Curie Fellowship
- DARPA: REPAIR N66001-10-C-2010
- EPSRC EP/H019472/1
- NIH CRCNS R01-NS054283
- NIH Pioneer 1DP1OD006409
Additional material
Cross-prediction: area under ROC
[Figure, two panels: area under ROC (AUC). Left: AUC against percentage of zero-entries in coupling matrix; curves: GLM 10ms, GLM 100ms, GLM2 100ms, GLM 150ms, PLDS dim=5. Right: AUC against dimensionality of latent space; curves: GPFA, GLDS, PLDS, PLDS 100ms]
- Same qualitative results as measured by MSE
- Benefits of history filters for PLDS are more pronounced
Correlations in data are well reflected by DS models
[Figure, two panels: correlation matrices, neuron index × neuron index, color scale −0.1 to 0.1. Left: data binned at 10ms. Right: 5-dimensional PLDS]