Latent Variable / Hierarchical Models in Computational Neural Science
Ying Nian Wu, UCLA Department of Statistics
March 30, 2011


Page 1:

Latent Variable / Hierarchical Models in Computational Neural Science

Ying Nian Wu, UCLA Department of Statistics
March 30, 2011

Page 2:

Outline
• Latent variable models in statistics
• Primary visual cortex (V1)
• Modeling and learning in V1
• Layered hierarchical models
• Joint work with Song-Chun Zhu and Zhangzhang Si

Page 3:

Latent variable models

[Diagram: hidden variables generating observed variables]

Learning: from examples
Inference: of the hidden variables given the observed

Page 4:

Latent variable models
• Mixture model
• Factor analysis

Page 5:

Latent variable models (hidden and observed variables)

Learning: maximum likelihood from examples, via EM or gradient ascent
Inference / explaining away: the E-step / imputation
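As a concrete illustration of this learning/inference loop, here is a minimal EM sketch for a one-dimensional two-component Gaussian mixture; the data, component count, and iteration count are assumptions for the example, not from the talk:

```python
# Minimal EM for a 1-D Gaussian mixture: the E-step imputes the hidden
# component labels, the M-step maximizes the expected log-likelihood.
import numpy as np

def em_gmm(y, K=2, iters=100):
    n = len(y)
    pi = np.full(K, 1.0 / K)          # mixing weights
    mu = np.random.choice(y, K)       # component means
    sig = np.full(K, y.std())         # component standard deviations
    for _ in range(iters):
        # E-step: posterior p(z = k | y) -- inference / imputation
        dens = np.exp(-0.5 * ((y[:, None] - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))
        r = pi * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters by weighted maximum likelihood
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r * y[:, None]).sum(axis=0) / nk
        sig = np.sqrt((r * (y[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sig

y = np.concatenate([np.random.normal(0, 1, 300), np.random.normal(4, 1, 200)])
print(em_gmm(y))
```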

Page 6:

Computational neural science

Z: internal representation by neurons (hidden)
Y: sensory data from the outside environment (observed)
Connection weights: the parameters

Hierarchical extension: model Z by another layer of hidden variables, which explain Z just as Z explains Y.

Inference / explaining away

Page 7:

Visual cortex: layered hierarchical architecture

[Figure: the visual pathway. Source: Scientific American, 1999]

V1: primary visual cortex (simple cells, complex cells)
Bottom-up / top-down

Page 8:

Simple V1 cells (Daugman, 1985)

Gabor wavelets: localized sine and cosine waves,

$$G(x_1, x_2) \propto \exp\left\{-\frac{1}{2}\left[\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2}\right]\right\} e^{i x_1},$$

together with translations, rotations, and dilations of the above function.
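A small numpy sketch of this wavelet family, showing the even (cosine) component; the size, sigmas, and frequency are assumed values for illustration:

```python
import numpy as np

def gabor(size=17, sigma1=5.0, sigma2=3.0, freq=0.6, theta=0.0):
    """Gabor filter: Gaussian envelope times a cosine wave along the rotated axis."""
    r = (size - 1) / 2
    y, x = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing='ij')
    x1 = x * np.cos(theta) + y * np.sin(theta)     # rotated coordinates
    x2 = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-0.5 * (x1**2 / sigma1**2 + x2**2 / sigma2**2))
    return env * np.cos(freq * x1)                 # even component; np.sin gives the odd one

# Translation/rotation/dilation family: a small bank over 8 orientations.
bank = [gabor(theta=k * np.pi / 8) for k in range(8)]
```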

Page 9:

V1 simple cells: Gabor wavelets $B_{x,s,\alpha}$ at position $x$, scale $s$, and orientation $\alpha$ respond to edges. The response is a projection of the image pixels,

$$\langle I, B_{x,s,\alpha}\rangle = \sum_{x'} I(x')\, B_{x,s,\alpha}(x').$$

Page 10:

Complex V1 cells (Riesenhuber and Poggio, 1999)

$$A(x, \alpha) = \max_{x' \in N(x)} \left|\langle I, B_{x',s,\alpha}\rangle\right|^2$$

Image pixels → V1 simple cells → V1 complex cells: local max (or local sum) of simple-cell responses.

• Larger receptive field
• Less sensitive to deformation
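A sketch of both stages: simple cells as filter projections at every position, complex cells as a local max of squared responses. The pooling radius is an assumption, and the usage line reuses the hypothetical `bank` from the Gabor sketch above:

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import maximum_filter

def simple_cells(image, filters):
    """SUM: projections <I, B_{x,s,alpha}> at every position, one map per filter."""
    return [convolve2d(image, f, mode='same') for f in filters]

def complex_cells(simple_maps, pool=7):
    """MAX: local maximum of squared simple-cell responses over a neighborhood."""
    return [maximum_filter(s**2, size=pool) for s in simple_maps]

# Usage on a random test image, with the Gabor bank defined earlier.
image = np.random.randn(64, 64)
s_maps = simple_cells(image, bank)
c_maps = complex_cells(s_maps)
```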

Page 11:

Independent Component Analysis (Bell and Sejnowski, 1996)

$$I = c_1 B_1 + \cdots + c_N B_N = BC, \qquad c_i \sim p(c) \text{ independently}, \; i = 1, \ldots, N, \qquad N = \dim(I)$$

$$C = B^{-1} I = A I$$

For training images $I_m$:

$$I_m = c_{m,1} B_1 + \cdots + c_{m,N} B_N = B C_m, \qquad C_m = B^{-1} I_m = A I_m$$

$p(c)$: Laplacian / Cauchy
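A hedged sketch of learning such a basis with scikit-learn's FastICA; the random patches stand in for whitened natural-image patches, and all sizes are assumptions:

```python
import numpy as np
from sklearn.decomposition import FastICA

patches = np.random.randn(5000, 144)         # stand-in for 12x12 image patches
ica = FastICA(n_components=144, max_iter=500)
sources = ica.fit_transform(patches)         # rows of C = A I, one per patch
basis = ica.mixing_                          # columns approximate the B_i
```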

Page 12:

[Figure. Source: Hyvarinen, 2000]

Page 13:

Sparse coding (Olshausen and Field, 1996)

$$I = c_1 B_1 + \cdots + c_N B_N, \qquad c_i \sim p(c) \text{ independently}, \; i = 1, \ldots, N, \qquad N > \dim(I)$$

For training images: $I_m = c_{m,1} B_1 + \cdots + c_{m,N} B_N$.

$p(c)$: Laplacian / Cauchy / mixture of Gaussians

Page 14:

Sparse coding / variable selection

Learning: a dictionary of representational elements (regressors),

$$I_m = c_{m,1} B_1 + \cdots + c_{m,N} B_N, \qquad c_i \sim p(c) \text{ independently}, \qquad N > \dim(I)$$

Inference: sparsification, non-linear
• lasso / basis pursuit / matching pursuit
• mode and uncertainty of $p(C \mid I)$
• explaining-away, lateral inhibition
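A minimal matching-pursuit sketch of the inference step: greedily select the element with the largest response, then subtract its contribution from the residual, which is the explaining-away / lateral-inhibition effect. The dictionary here is a generic unit-norm matrix, an assumption; in the V1 model its columns would be Gabor wavelets:

```python
import numpy as np

def matching_pursuit(I, B, n_iter=20):
    """I: image as a vector; B: dictionary with unit-norm columns."""
    residual = I.copy()
    coeffs = np.zeros(B.shape[1])
    for _ in range(n_iter):
        resp = B.T @ residual              # responses of all elements
        i = np.argmax(np.abs(resp))        # arg-max selection
        coeffs[i] += resp[i]
        residual -= resp[i] * B[:, i]      # explaining away / inhibition
    return coeffs, residual
```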

Page 15:

[Figure. Source: Olshausen and Field, 1996]

Page 16:

Restricted Boltzmann Machine (Hinton, Osindero and Teh, 2006)

$$p(H, V) = \frac{1}{Z(W)} \exp\left\{\sum_{i,j} W_{ij} h_i v_j\right\}$$

$h_i$, $i = 1, \ldots, N$: hidden, binary; $V$: visible.

$P(V \mid H)$ and $P(H \mid V)$ are factorized: no explaining-away.
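A sketch of one contrastive-divergence (CD-1) update for this model, exploiting the factorized conditionals; binary units, biases omitted as on the slide, and the learning rate and shapes are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(V, W, lr=0.01):
    """V: batch of visible rows; W: weight matrix with W[i, j] = W_ij."""
    # Up: p(h_i = 1 | v) is factorized -- no explaining away.
    ph = sigmoid(V @ W.T)
    H = (np.random.rand(*ph.shape) < ph).astype(float)
    # Down: p(v_j = 1 | h) is also factorized.
    pv = sigmoid(H @ W)
    # Up again on the reconstruction.
    ph2 = sigmoid(pv @ W.T)
    # Gradient approximation: data statistics minus model statistics.
    W += lr * (ph.T @ V - ph2.T @ pv) / V.shape[0]
    return W
```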

Page 17:

Energy-based model (Teh, Welling, Osindero and Hinton, 2003)

$$p(I) = \frac{1}{Z(\Lambda)} \exp\left\{\sum_i \lambda_i(\langle I, B_i\rangle)\right\}$$

Features, no explaining-away.

Markov random field / Gibbs distribution (Zhu, Wu, and Mumford, 1997; Wu, Liu, and Zhu, 2000):

$$p(I) = \frac{1}{Z(\lambda)} \exp\left\{\sum_{x,s,\alpha} \lambda_{s,\alpha}(\langle I, B_{x,s,\alpha}\rangle)\right\}$$

Maximum entropy with marginals; exponential family with sufficient statistics.
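A sketch of the unnormalized log-density of such a model: filter responses are the sufficient statistics, with a per-filter potential function applied pointwise. The concrete potentials below are hypothetical; in the real algorithm the lambdas are learned:

```python
import numpy as np
from scipy.signal import convolve2d

def log_p_unnormalized(image, filters, potentials):
    """log p(I) up to log Z: sum over filters and positions of lambda(<I, B>)."""
    total = 0.0
    for f, lam in zip(filters, potentials):
        resp = convolve2d(image, f, mode='same')   # all <I, B_{x,s,alpha}>
        total += np.sum(lam(resp))                 # potential applied pointwise
    return total

# Hypothetical fixed potentials, one per filter, for illustration only.
potentials = [lambda r: 0.1 * np.abs(r) for _ in range(8)]
```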

Page 18:

[Figure. Source: Zhu, Wu, and Mumford, 1997; Wu, Liu, and Zhu, 2000]

Page 19:

Visual cortex: layered hierarchical architecture

[Figure: the visual pathway. Source: Scientific American, 1999]

Bottom-up / top-down

What is beyond V1? A hierarchical model?

Page 20:

Hierarchical ICA / energy-based model?

• Larger features
• Must introduce nonlinearities
• Purely bottom-up

Page 21:

Hierarchical RBM (Hinton, Osindero and Teh, 2006)

$P(V, H) = P(H)\,P(V \mid H)$; the prior $P(H)$ is in turn modeled by the next RBM $P(V', H)$, with $H$ playing the visible role one layer up.

[Diagram: layers $I$, $H$, $V$, $V'$]

Unfolding, untying, re-learning
Discriminative correction by back-propagation
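A greedy layer-wise stacking sketch consistent with this idea, reusing the hypothetical cd1_step from the RBM sketch above; sizes, epochs, and the binarization of the hidden activations are assumptions:

```python
import numpy as np

def train_rbm(V, n_hidden, epochs=50):
    """Train one RBM layer with CD-1 (cd1_step defined in the RBM sketch)."""
    W = 0.01 * np.random.randn(n_hidden, V.shape[1])
    for _ in range(epochs):
        W = cd1_step(V, W)
    return W

V = (np.random.rand(1000, 100) < 0.5).astype(float)   # stand-in binary data
W1 = train_rbm(V, 64)
H1 = 1.0 / (1.0 + np.exp(-(V @ W1.T)))                # first hidden layer
W2 = train_rbm((H1 > 0.5).astype(float), 32)          # next RBM models P(H)
```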

Page 22:

Hierarchical sparse coding

Bottom layer: attributed sparse coding elements $B_{x,s,\alpha}$ (a transformation group with a topological neighborhood system),

$$I = \sum_{i=1}^{n} c_i B_{x_i, s_i, \alpha_i} + U.$$

Layer above: further coding of the attributes of the selected sparse coding elements.

Page 23:

Active basis model (Wu, Si, Gong, Zhu, 10; Zhu, Guo, Wang, Xu, 05)

n-stroke template; n = 40 to 60, box = 100×100

Page 24:

Active basis model (Wu, Si, Gong, Zhu, 10; Zhu, et al., 05; Yuille, Hallinan, Cohen, 92)

n-stroke template; n = 40 to 60, box = 100×100

Page 25:

Simplicity

• Simplest AND-OR graph (Pearl, 84; Zhu, Mumford, 06): AND for composition of basis elements, OR for their perturbations or variations
• Simplest shape model: average + residual
• Simplest modification of the Olshausen-Field model: further sparse coding of the attributes of the sparse coding elements

Page 26:

Bottom layer: sketch against texture

Only need to pool a marginal $q(c)$ as the null hypothesis:
• natural images: the explicit $q(I)$ of Zhu, Mumford, 97
• this image: the explicit $q(I)$ of Zhu, Wu, Mumford, 97

Maximum entropy (Della Pietra, Della Pietra, Lafferty, 97; Zhu, Wu, Mumford, 97; Jin, S. Geman, 06; Wu, Guo, Zhu, 08); special case: density substitution (Friedman, 87; Jin, S. Geman, 06):

$$p(C, U) = p(C)\,p(U \mid C) = p(C)\,q(U \mid C) = p(C)\,q(U, C)/q(C)$$

Page 27:

Shared sketch algorithm: maximum likelihood learning

Prototype: shared matching pursuit (closed-form computation)

Step 1: two max operations to explain the images by maximum likelihood; no early decision on edge detection
Step 2: arg-max for inferring the hidden variables
Step 3: the arg-max explains away, and thus inhibits (matching pursuit; Mallat, Zhang, 93)

Finding n strokes to sketch M images simultaneously; n = 60, M = 9
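A heavily simplified sketch of the shared idea: each stroke is selected by pooling responses over all images, then each image explains it away on its own residual. The perturbation set of each element is collapsed to the element itself here, an assumption that drops the "active" deformation:

```python
import numpy as np

def shared_matching_pursuit(images, B, n_strokes=60):
    """images: list of M image vectors; B: dictionary with unit-norm columns."""
    residuals = [I.copy() for I in images]
    template = []
    for _ in range(n_strokes):
        # Pooled score: sum over images of each element's response magnitude.
        score = sum(np.abs(B.T @ r) for r in residuals)
        i = np.argmax(score)               # one shared element for all images
        template.append(i)
        for r in residuals:                # per-image explaining away
            c = B[:, i] @ r
            r -= c * B[:, i]
    return template
```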

Page 28:

Cortex-like sum-max maps: maximum likelihood inference

Bottom-up sum-max scoring (no early edge decision); top-down arg-max sketching.

SUM1 layer: the simple V1 cells of Olshausen, Field, 96
MAX1 layer: the complex V1 cells of Riesenhuber, Poggio, 99

1. Reinterpreting MAX1: the OR-node of an AND-OR graph; MAX stands in for ARG-MAX in the max-product algorithm.
2. Stick to the Olshausen-Field sparse top-down model: the AND-node of the AND-OR graph. Active basis, SUM2 layer: "neurons" memorize shapes by sparse connections to the MAX1 layer.

Hierarchical, recursive AND-OR / SUM-MAX. Architecture: more top-down than bottom-up. Neurons: more representational than operational (OR-neurons / AND-neurons).

Scan over multiple resolutions.

Page 29:

Bottom-up detection, top-down sketching

SUM1 → MAX1 → SUM2 → arg MAX1

Sparse selective connections as a result of learning; explaining-away in learning but not in inference.

Bottom-up scoring and top-down sketching.
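A sketch of the SUM1 → MAX1 → SUM2 scoring pass; the template encoding as (filter, row, col) triples and the pooling size are assumptions:

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import maximum_filter

def sum_max_score(image, filters, template, pool=5):
    """SUM2 score of one template: a sparse sum over MAX1 responses.

    template: list of (filter_index, row, col) -- the sparse connections
    the SUM2 'neuron' has memorized (a hypothetical encoding).
    """
    sum1 = [np.abs(convolve2d(image, f, mode='same')) for f in filters]  # SUM1
    max1 = [maximum_filter(s, size=pool) for s in sum1]                  # MAX1
    return sum(max1[k][i, j] for k, i, j in template)                    # SUM2
```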

Page 30:

Page 31:

Scan over multiple resolutions and orientations (rotating template)

Page 33:

Adjusting the Active Basis Model by L2 Regularized Logistic Regression (by Ruixun Zhang)

• Exponential family model with $q(I)$ negatives: logistic regression for p(class | image), a partial likelihood
• Generative learning, without negative examples: selects the basis elements and infers the hidden variables
• Discriminative adjustment with hugely reduced dimensionality: corrects the conditional independence assumption

L2 regularized logistic regression: re-estimated lambda's.

Conditional on (1) the selected basis elements and (2) the inferred hidden variables; both (1) and (2) come from generative learning.
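A hedged sketch of the adjustment step: keep the generatively selected elements fixed, treat their (arg-max) responses as features, and re-estimate the weights with scikit-learn's L2-penalized logistic regression. The feature matrices below are stand-ins:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X_pos = np.random.rand(100, 60)    # responses of n = 60 selected elements (positives)
X_neg = np.random.rand(200, 60)    # same features on negative examples
X = np.vstack([X_pos, X_neg])
y = np.r_[np.ones(100), np.zeros(200)]

clf = LogisticRegression(penalty='l2', C=1.0)  # C = inverse regularization strength
clf.fit(X, y)
lambdas = clf.coef_.ravel()        # re-estimated lambda's
```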

Page 34:

Active basis templates:
• arg-max inference and explaining away; no reweighting
• residual images neutralize existing elements; same set of training examples

Adaboost templates:
• no arg-max inference or explaining-away inhibition
• reweighted examples neutralize existing classifiers; changing set of examples

# of negatives: 10556, 7510, 4552, 1493, 12217
(double # elements / same # elements)

Page 35:

Mixture model of active basis templates, fitted by EM / maximum likelihood with random initialization

MNIST, 500 in total

Page 36:

Learning active basis models from non-aligned images: EM-type maximum likelihood learning, initialized by single-image learning

Page 37:

Learning active basis models from non-aligned images

Page 38:

Learning active basis models from non-aligned images

Page 39:

Page 40:

Hierarchical active basis, by Zhangzhang Si et al.

• AND-OR graph: Pearl, 84; Zhu, Mumford, 06
• Compositionality and reusability: Geman, Potter, Chi, 02; L. Zhu, Lin, Huang, Chen, Yuille, 08
• Part-based method: everyone et al.
• Latent SVM: Felzenszwalb, McAllester, Ramanan, 08
• Constellation model: Weber, Welling, Perona, 00

[Figure legend: low log-likelihood vs. high log-likelihood]

Page 41:

Simplicity

• Simplest and purest recursive two-layer AND-OR graph
• Simplest generalization of the active basis model

Page 42:

AND-OR graph and SUM-MAX maps: maximum likelihood inference

Cortex-like, related to Riesenhuber, Poggio, 99:
• Bottom-up sum-max scoring
• Top-down arg-max sketching

Page 43:

Hierarchical active basis by Zhangzhang Si et al.

Page 44:

Page 45:

Page 46:

Page 47:

Shape script by composing active basis shape motifs.
Representing elementary geometric shapes (shape motifs) by active bases (Si, Wu, 10).
Geometry = a sketch that can be parametrized.

Page 48:

Summary

Attributed sparse coding:

$$I = \sum_{i=1}^{n} c_i B_{x_i, s_i, \alpha_i} + U$$

Shape motif $k$: a template of element attributes $(x_i, \alpha_i)$, $i = 1, \ldots, n$.

Bottom layer: Olshausen-Field (foreground) + Zhu-Wu-Mumford (background)
• Maximum entropy tilting (Della Pietra, Della Pietra, Lafferty, 97)
• white noise, texture (high entropy); sketch (low and mid entropy)
• (reverses the central-limit-theorem effect of information scaling)

Build up layers:
(1) AND-OR, SUM-MAX (top-down arg-MAX)
(2) Perpetual sparse coding: further coding of the attributes of the current sparse coding elements
    (a) residuals of attributes: continuous OR-nodes
    (b) mixture model: discrete OR-nodes