Information Geometry and Neural Networks
DESCRIPTION
Information Geometry and Neural Networks. Shun-ichi Amari, RIKEN Brain Science Institute. Orthogonal decomposition of rates and (higher-order) correlations. Synchronous firing and higher correlations. Algebraic singularities caused by multiple stimuli.
TRANSCRIPT
![Page 1: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/1.jpg)
Information Geometry and Neural Networks
Shun-ichi Amari, RIKEN Brain Science Institute
Orthogonal decomposition of rates and (higher-order) correlations
Synchronous firing and higher correlations
Algebraic singularities caused by multiple stimuli
Dynamics of learning in multilayer perceptrons
![Page 2: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/2.jpg)
Information Geometry
Related fields: Systems Theory, Information Theory, Statistics, Neural Networks, Combinatorics, Physics, Information Sciences, Math., AI
Riemannian Manifold
Dual Affine Connections
Manifold of Probability Distributions
![Page 3: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/3.jpg)
Information Geometry?
$S = \{p(x; \mu, \sigma)\}$, $p(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$
$S = \{p(x; \theta)\}$, $\theta = (\mu, \sigma)$
Riemannian metric
Dual affine connections
![Page 4: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/4.jpg)
Manifold of Probability Distributions
$x = 1, 2, 3$: $\{p(x)\}$
$p = (p_1, p_2, p_3)$, $p_1 + p_2 + p_3 = 1$
[Figure: simplex with axes $p_1$, $p_2$, $p_3$, a point $p$, and a submodel $M = \{p(x; \theta)\}$]
![Page 5: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/5.jpg)
Two Structures
Riemannian metric and affine connection
$ds^2 = \sum g_{ij}\, d\theta_i\, d\theta_j$
$D[p : q] = E_p\left[\log \frac{p(x)}{q(x)}\right]$
$\frac{1}{2} ds^2 = D[p(x, \theta) : p(x, \theta + d\theta)]$
Fisher information
$g_{ij} = E[\partial_i \log p \; \partial_j \log p]$
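The relation between the KL divergence of neighbouring distributions and the Fisher metric can be checked numerically. The sketch below is only an illustration on a Bernoulli family, which is not one of the models in the slides; the function names and the values of `theta` and `d_theta` are chosen here for the example.

```python
import math

def kl_bernoulli(p, q):
    """KL divergence D[p:q] between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def fisher_bernoulli(theta):
    """Fisher information of Bernoulli(theta): E[(d/dtheta log p)^2] = 1/(theta(1-theta))."""
    return 1.0 / (theta * (1.0 - theta))

# Illustrative values (assumptions, not taken from the slides).
theta, d_theta = 0.3, 1e-3

kl = kl_bernoulli(theta, theta + d_theta)                    # D[p_theta : p_{theta+dtheta}]
half_ds2 = 0.5 * fisher_bernoulli(theta) * d_theta ** 2      # (1/2) g dtheta^2

print(kl, half_ds2)  # the two numbers agree to leading order in d_theta
```

The agreement improves as `d_theta` shrinks, since the two quantities differ only at third order in the displacement.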
![Page 6: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/6.jpg)
Riemannian Structure
$ds^2 = \sum g_{ij}(\theta)\, d\theta^i d\theta^j = d\theta^T G(\theta)\, d\theta$
$G(\theta) = (g_{ij}(\theta))$
Euclidean: $G = E$ (identity matrix)
![Page 7: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/7.jpg)
Affine Connection
covariant derivative: $\nabla_X Y$
geodesic: $\nabla_{\dot X} \dot X = 0$, $X = X(t)$
$s = \int_c \sqrt{g_{ij}\, d\theta^i d\theta^j}$: minimal distance
straight line
![Page 8: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/8.jpg)
$S = \{p(x_1, x_2)\}$, $x_1, x_2 = 0, 1$
$M = \{q(x_1)\, q(x_2)\}$
Independent Distributions
![Page 9: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/9.jpg)
Neural Firing
$x_1, x_2, x_3, \dots, x_n$
$p(\mathbf{x}) = p(x_1, x_2, \dots, x_n)$
$\eta_i = E[x_i]$: firing rate
$v_{ij} = \mathrm{Cov}[x_i, x_j]$: covariance
higher-order correlations
orthogonal decomposition
![Page 10: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/10.jpg)
Information Geometry of Higher-Order Correlations: orthogonal decomposition
$S = \{p(x; \theta)\}$
Riemannian metric
dual affine connections
Pythagoras theorem
Dual geodesics
![Page 11: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/11.jpg)
Correlations of Neural Firing
$p(x_1, x_2)$: $\{p_{00}, p_{10}, p_{01}, p_{11}\}$
$\eta_1 = p_{1\cdot}$, $\eta_2 = p_{\cdot 1}$
$\theta = \log \frac{p_{11}\, p_{00}}{p_{10}\, p_{01}}$
$\{(\eta_1, \eta_2), \theta\}$: orthogonal coordinates
$(\eta_1, \eta_2)$: firing rates; $\theta$: correlations
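As a small illustration of these coordinates, the following sketch computes the two firing rates and the log odds-ratio interaction from the four joint probabilities. The function name and the example numbers are assumptions made for this example; the indexing `p_ab = Pr{x1 = a, x2 = b}` is also an assumed convention.

```python
import math

def orthogonal_coordinates(p00, p01, p10, p11):
    """Return (eta1, eta2, theta) for two binary neurons.

    Assumed convention: p_ab = Pr{x1 = a, x2 = b}.
    """
    eta1 = p10 + p11                               # firing rate E[x1]
    eta2 = p01 + p11                               # firing rate E[x2]
    theta = math.log(p11 * p00 / (p10 * p01))      # log odds ratio (correlation coordinate)
    return eta1, eta2, theta

# Independent neurons with rates a and b (illustrative values).
a, b = 0.2, 0.7
independent = ((1 - a) * (1 - b), (1 - a) * b, a * (1 - b), a * b)
eta1, eta2, theta = orthogonal_coordinates(*independent)
print(eta1, eta2, theta)  # rates a and b, interaction theta close to 0

# A correlated example: the neurons tend to fire together.
_, _, theta_corr = orthogonal_coordinates(0.4, 0.1, 0.1, 0.4)
print(theta_corr)  # positive
```

Changing the rates `a` and `b` of the independent example leaves `theta` at zero, which is the sense in which the correlation coordinate is separated from the firing rates.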
![Page 12: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/12.jpg)
[Figure: binary spike trains of neurons $x_1$, $x_2$, $x_3$]
firing rates; correlation or covariance?
$\{p_{00}, p_{01}, p_{10}, p_{11}\}$
$\{\eta_1, \eta_2; \theta_{12}\}$
![Page 13: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/13.jpg)
$S = \{p(x_1, x_2)\}$, $x_1, x_2 = 0, 1$
$M = \{q(x_1)\, q(x_2)\}$
Independent Distributions
![Page 14: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/14.jpg)
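Pythagoras Theorem
$D[p : r] = D[p : q] + D[q : r]$
$p, q$: same marginals $\eta_1, \eta_2$
$r, q$: same correlations $\theta$
$D[p : r] = \sum_x p(x) \log \frac{p(x)}{r(x)}$
[Figure: right triangle $p$, $q$, $r$; $q$ and $r$ lie among the independent distributions, and the side from $p$ to $q$ carries the correlations]
estimation, correlation testing
invariant under firing rates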
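The decomposition can be verified numerically in the simplest case, where both `q` and `r` are independent distributions (so they share zero correlation) and `q` has the same marginals as `p`. This is a minimal sketch under that assumption; the probabilities and helper names are illustrative only.

```python
import math
from itertools import product

def kl(p, r):
    """KL divergence D[p:r] = sum_x p(x) log(p(x)/r(x))."""
    return sum(p[x] * math.log(p[x] / r[x]) for x in p)

def independent(m1, m2):
    """Product distribution with Pr{x1 = 1} = m1 and Pr{x2 = 1} = m2."""
    return {(x1, x2): (m1 if x1 else 1 - m1) * (m2 if x2 else 1 - m2)
            for x1, x2 in product((0, 1), repeat=2)}

# A correlated joint distribution p(x1, x2) (illustrative numbers).
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

eta1 = p[(1, 0)] + p[(1, 1)]      # marginal firing rate of x1
eta2 = p[(0, 1)] + p[(1, 1)]      # marginal firing rate of x2

q = independent(eta1, eta2)       # same marginals as p, no correlation
r = independent(0.25, 0.6)        # another independent distribution

lhs = kl(p, r)
rhs = kl(p, q) + kl(q, r)
print(lhs, rhs)  # equal up to rounding
```

The term `kl(p, q)` measures the correlation in `p` alone, while `kl(q, r)` depends only on the firing rates.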
![Page 15: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/15.jpg)
[Figure: binary spike trains of neurons $x_1$, $x_2$, $x_3$]
No pairwise correlations, triplewise correlation
$p(x_1, x_2, x_3) \neq p(x_1)\, p(x_2)\, p(x_3)$
$p(x_1, x_2) = p(x_1)\, p(x_2)$
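A standard example with exactly this property, used here only as an illustration and not taken from the slides, is two independent fair binary variables together with their exclusive-or. The sketch below checks that every pair is independent while the triple is not.

```python
from itertools import product

# Assumed toy model: x1, x2 fair and independent, x3 = x1 XOR x2.
p = {(x1, x2, x1 ^ x2): 0.25 for x1, x2 in product((0, 1), repeat=2)}

def marginal(dist, idx):
    """Marginal distribution over the coordinates listed in idx."""
    out = {}
    for x, prob in dist.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + prob
    return out

singles = [marginal(p, (i,)) for i in range(3)]

def pairwise_independent(dist):
    """True if every pair (x_i, x_j) factorizes into its marginals."""
    for i, j in ((0, 1), (0, 2), (1, 2)):
        pair = marginal(dist, (i, j))
        for a, b in product((0, 1), repeat=2):
            if abs(pair.get((a, b), 0.0) - singles[i][(a,)] * singles[j][(b,)]) > 1e-12:
                return False
    return True

def fully_independent(dist):
    """True if the joint distribution is the product of its three marginals."""
    return all(
        abs(dist.get(x, 0.0)
            - singles[0][(x[0],)] * singles[1][(x[1],)] * singles[2][(x[2],)]) < 1e-12
        for x in product((0, 1), repeat=3)
    )

print(pairwise_independent(p), fully_independent(p))
```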
![Page 16: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/16.jpg)
Pythagoras Decomposition of KL Divergence
$p(x)$
$p_{\text{pairwise corr}}(x)$: only pairwise
$p_{\text{ind}}(x)$: independent
![Page 17: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/17.jpg)
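Higher-Order Correlations
$\mathbf{x} = (x_1, x_2, \dots, x_n)$
$p(\mathbf{x}) = \exp\left\{\sum \theta_i x_i + \sum \theta_{ij} x_i x_j + \sum \theta_{ijk} x_i x_j x_k + \cdots - \psi\right\}$
$\eta_i = E[x_i]$
$\eta_{ij} = E[x_i x_j]$
$\theta = (\theta_i, \theta_{ij}, \theta_{ijk}, \dots)$
$\eta = (\eta_i, \eta_{ij}, \eta_{ijk}, \dots)$
[Figure: submanifolds $M_0$, $M_1$]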
![Page 18: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/18.jpg)
Synfiring and Higher-Order Correlations
Amari, Nakahara, Wu, Sakai
![Page 19: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/19.jpg)
Neurons
$x_1, x_2, \dots, x_n$
$x_i = 1(u_i)$ (unit step)
$u_i$: Gaussian, $E[u_i u_j] = \alpha$
Population and Synfire
![Page 20: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/20.jpg)
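Population and Synfire
$u_i = \sum_j w_{ij} s_j - h$, $x_i = 1(u_i)$
$u_i = \sqrt{1-\alpha}\, \varepsilon_i + \sqrt{\alpha}\, \varepsilon - h$
$\varepsilon_i, \varepsilon \sim N(0, 1)$
$E[u_i u_j] = \alpha$
$E[u_i^2] = 1$
[Figure: common input $s$ to neurons $x_1, \dots, x_n$]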
![Page 21: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/21.jpg)
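$p_i = \mathrm{Prob}\{i \text{ neurons fire at the same time}\}$
$p_i = {}_nC_i\, E_\varepsilon\left[F^i (1 - F)^{n-i}\right]$
$F = \Pr\{x_i = 1 \mid \varepsilon\} = \Pr\{u_i > 0 \mid \varepsilon\} = \Pr\left\{\varepsilon_i > \frac{h - \sqrt{\alpha}\, \varepsilon}{\sqrt{1-\alpha}}\right\}$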
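The distribution of the number of simultaneously firing neurons can be evaluated numerically. The sketch below assumes a common-input Gaussian pool, $u_i = \sqrt{1-\alpha}\,\varepsilon_i + \sqrt{\alpha}\,\varepsilon - h$ with $x_i = 1$ when $u_i > 0$; this parametrization, the function names, and the values of `n`, `alpha` and `h` are assumptions made for the example. Conditioned on the common term the neurons are independent, so the count is a binomial mixed over the common term.

```python
import math

def firing_prob_given_common(eps, alpha, h):
    """F(eps) = Pr{u_i > 0 | eps} under the assumed model
    u_i = sqrt(1-alpha) eps_i + sqrt(alpha) eps - h, eps_i ~ N(0, 1)."""
    t = (math.sqrt(alpha) * eps - h) / math.sqrt(1.0 - alpha)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))   # standard normal CDF at t

def count_distribution(n, alpha, h, grid=2001, span=8.0):
    """p_i = C(n, i) E_eps[F^i (1-F)^(n-i)] with eps ~ N(0, 1),
    where the expectation is approximated by the trapezoidal rule."""
    step = 2.0 * span / (grid - 1)
    binom = [math.comb(n, i) for i in range(n + 1)]
    probs = [0.0] * (n + 1)
    for k in range(grid):
        eps = -span + k * step
        weight = step * math.exp(-0.5 * eps * eps) / math.sqrt(2.0 * math.pi)
        if k in (0, grid - 1):
            weight *= 0.5                                # trapezoid end points
        f = firing_prob_given_common(eps, alpha, h)
        for i in range(n + 1):
            probs[i] += weight * binom[i] * f ** i * (1.0 - f) ** (n - i)
    return probs

# Illustrative parameters: weak versus strong common input.
weak = count_distribution(n=50, alpha=0.1, h=0.0)
strong = count_distribution(n=50, alpha=0.8, h=0.0)
print(weak[25], weak[0])       # weak coupling: mass concentrated near half the pool
print(strong[25], strong[0])   # strong coupling: mass piles up at "all silent" / "all firing"
```

With strong common input the count distribution becomes bimodal, which is one way to see synchronous firing emerge in this assumed model.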
![Page 22: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/22.jpg)
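$r = \frac{i}{n}$, $P_r = \Pr\{nr \text{ neurons fire}\}$
$q(r, \alpha) \propto e^{nH(r)}\, E_\varepsilon\left[e^{nz}\right]$
$z = r \log F + (1 - r) \log(1 - F)$
$F = F(\varepsilon)$: Gaussian tail probability, an integral of $\frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt$ with a limit depending on $\varepsilon$ and $h$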
![Page 23: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/23.jpg)
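$q(r, \alpha) = c \exp\left[-\frac{1}{2\alpha}\left\{(1 - 2\alpha)\, w^2 + 2h\sqrt{1-\alpha}\; w + h^2\right\}\right]$, $w = \Phi^{-1}(r)$ ($\Phi$: standard normal distribution function)
$p(\mathbf{x}) = \exp\left\{\sum \theta_i x_i + \sum \theta_{ij} x_i x_j + \sum \theta_{ijk} x_i x_j x_k + \cdots\right\}$
$\theta_{i_1 \dots i_k} = O(1/n^{k-1})$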
![Page 24: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/24.jpg)
Synfiring
$p(\mathbf{x}) = p(x_1, \dots, x_n)$
$r = \frac{1}{n} \sum x_i$, distribution $q(r)$
[Figure: $q(r)$ plotted against $r$]
![Page 25: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/25.jpg)
Bifurcation
[Figure: $P_r$ plotted against $r$]
$x_i$ independent: single delta peak
pairwise correlated
higher-order correlation!
![Page 26: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/26.jpg)
Field Theory of Population Coding
Shun-ichi Amari, RIKEN Brain Science Institute
Collaborators: Si Wu, Hiro Nakahara
![Page 27: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/27.jpg)
Population Coding and Neural Field
stimulus $x^*$, response $r(z)$ given $x^*$
$r(z) = f(z - x^*) + \varepsilon(z)$
$f(z) = \exp\left\{-\frac{z^2}{2a^2}\right\}$
![Page 28: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/28.jpg)
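Population Encoding
$r(z) = f(z - x) + \varepsilon(z)$
decoding: $r(z) \to \hat{x}$
[Figure: tuning curve $f(z - x)$ centred at $x$ and noisy response $r(z)$, both plotted against $z$]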
![Page 29: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/29.jpg)
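Noise
$E[\varepsilon(z)] = 0$
$E[\varepsilon(z)\, \varepsilon(z')] = \sigma^2 h(z, z')$
$h(z, z') = (1 - \beta)\, \delta(z - z') + \beta \exp\left\{-\frac{(z - z')^2}{2b^2}\right\}$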
![Page 30: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/30.jpg)
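Probability Model
$Q(r(z) \mid x) = c \exp\left\{-\frac{1}{2\sigma^2}\, [r(z) - f(z - x)]\, h^{-1}\, [r(z) - f(z - x)]\right\}$
$r(z)\, h^{-1}\, r(z) = \iint r(z)\, h^{-1}(z, z')\, r(z')\, dz\, dz'$
$\int h^{-1}(z, z')\, h(z', z'')\, dz' = \delta(z - z'')$
$r(z) = f(z - x) + \varepsilon(z)$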
![Page 31: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/31.jpg)
Fisher information
$I(x^*) = E\left[\left(\frac{d}{dx^*} \log Q(r \mid x^*)\right)^2\right]$
Cramér-Rao
$E\left[(\hat{x} - x^*)^2\right] \ge \frac{1}{I(x^*)}$
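For intuition, the Fisher information and the Cramér-Rao bound can be computed directly in the simplest setting. The sketch below assumes a discrete population of neurons on a regular grid with independent Gaussian noise of variance `sigma**2` and a Gaussian tuning curve of width `a`; the grid, the parameter values and the function names are assumptions for this example, not values from the slides.

```python
import math

def tuning(z, a):
    """Gaussian tuning curve f(z) = exp(-z^2 / (2 a^2))."""
    return math.exp(-z * z / (2.0 * a * a))

def tuning_derivative(z, a):
    """Derivative f'(z) of the Gaussian tuning curve."""
    return -(z / (a * a)) * tuning(z, a)

def fisher_information(x, centers, a, sigma):
    """I(x) = sum_i f'(z_i - x)^2 / sigma^2 for r_i = f(z_i - x) + eps_i,
    with eps_i independent N(0, sigma^2) (uncorrelated-noise assumption)."""
    return sum(tuning_derivative(z - x, a) ** 2 for z in centers) / sigma ** 2

density = 20.0                                         # neurons per unit length (illustrative)
centers = [k / density for k in range(-400, 401)]      # preferred stimuli on a grid
a, sigma = 1.0, 0.5

info = fisher_information(0.0, centers, a, sigma)
cramer_rao_bound = 1.0 / info                          # lower bound on E[(x_hat - x)^2]

# Continuum approximation: density * integral of f'(z)^2 dz / sigma^2
continuum = density * math.sqrt(math.pi) / (2.0 * a * sigma ** 2)
print(info, continuum, cramer_rao_bound)
```

In this uncorrelated case the information grows in proportion to the neuron density and shrinks as the tuning width `a` grows.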
![Page 32: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/32.jpg)
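Fourier Analysis
$F(\omega) = \frac{1}{\sqrt{2\pi}} \int f(z)\, e^{-i\omega z}\, dz$
$H(\omega) = \frac{1}{\sqrt{2\pi}} \int h(z)\, e^{-i\omega z}\, dz$, with $h(z, z') = h(z - z')$
$I \propto \frac{n}{\sigma^2} \int \frac{\omega^2\, |F(\omega)|^2}{H(\omega)}\, d\omega$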
![Page 33: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/33.jpg)
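Fisher Information
[Equation: $I$ as an integral over $\omega$ of $\omega^2 e^{-a^2\omega^2}$ divided by the noise spectrum, with tuning width $a$, correlation range $b$ and neuron density $n$; the constants are not legible]
1) No correlation (correlation strength $= 0$)
2) Uniform correlations
[Equations: $I$ for cases 1) and 2); only the factors $n$, $a$ and $b$ are legible]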
![Page 34: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/34.jpg)
3) Limited range correlations
4) Wide range correlations
5) Special case
[Equations: $I$ for cases 3) to 5) in terms of $n$, $a$, $b$, $c$; the expressions are not legible]
![Page 35: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/35.jpg)
Dynamics of Neural Fields
$\tau \frac{\partial u(z, t)}{\partial t} = -u(z, t) + \int w(z, z')\, \varphi[u(z', t)]\, dz' + c\, r(z)$
Shaping, Detecting, Decoding
![Page 36: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/36.jpg)
How the Brain Solves Singularity in Population Coding
S. Amari and H. Nakahara
RIKEN Brain Science Institute
![Page 37: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/37.jpg)
[Figure: two stimuli $x_1$, $x_2$ on the field axis $z$]
![Page 38: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/38.jpg)
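Neural Activity
$r(z) = (1 - v)\, \varphi(z - x_1) + v\, \varphi(z - x_2) + \varepsilon(z)$
$Q(r(z); v, x_1, x_2) = c \exp\left\{-\frac{1}{2\sigma^2}\, (r - f)\, h^{-1}\, (r - f)\right\}$
$I_{ij} = E[\partial_i \log Q \; \partial_j \log Q]$
$I = (I_{ij})$: Fisher information matrix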
![Page 39: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/39.jpg)
Parameter Space
[Figure: parameter space with axes $v$, $x_1$, $x_2$]
![Page 40: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/40.jpg)
$u = x_2 - x_1$: difference
$w = (1 - v)\, x_1 + v\, x_2$: center of gravity
$\xi = (w, u, v)$
Fisher information $I$ degenerates as $u \to 0$
Cramér-Rao paradigm: error
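The degeneration can be observed numerically. The sketch below assumes a unit-width Gaussian tuning function, white noise of unit variance and a continuum of neurons, so that the Fisher matrix in the parameters $(x_1, x_2, v)$ is the Gram matrix of the derivatives of the mean response; all of these choices, the integration range and the function names are assumptions for illustration. The change of variables to $(w, u, v)$ has unit Jacobian determinant, so the determinant is the same in either parametrization.

```python
import math

def phi(z):
    """Assumed tuning function: unit-width Gaussian bump."""
    return math.exp(-0.5 * z * z)

def dphi(z):
    """Derivative of the tuning function."""
    return -z * phi(z)

def fisher_matrix(x1, x2, v, lo=-12.0, hi=12.0, steps=4800):
    """I_ij = integral of (d_i f)(d_j f) dz for f(z) = (1-v) phi(z-x1) + v phi(z-x2),
    parameters ordered (x1, x2, v); white unit-variance noise is assumed."""
    dz = (hi - lo) / steps
    mat = [[0.0] * 3 for _ in range(3)]
    for k in range(steps + 1):
        z = lo + k * dz
        grad = (-(1.0 - v) * dphi(z - x1),       # df/dx1
                -v * dphi(z - x2),               # df/dx2
                phi(z - x2) - phi(z - x1))       # df/dv
        weight = dz * (0.5 if k in (0, steps) else 1.0)   # trapezoidal rule
        for i in range(3):
            for j in range(3):
                mat[i][j] += weight * grad[i] * grad[j]
    return mat

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

v = 0.3
det_far = det3(fisher_matrix(-0.5, 0.5, v))       # u = 1
det_near = det3(fisher_matrix(-0.05, 0.05, v))    # u = 0.1
print(det_far, det_near)   # the determinant collapses as the two stimuli merge
```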
![Page 41: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/41.jpg)
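$f(z; \xi) \approx \varphi(z - w)\left\{1 + \eta_2 H_2(z - w) + \eta_3 H_3(z - w)\right\}$ ($H_k$: Hermite polynomials)
$\eta = \left(w,\; \tfrac{1}{2}\, v(1 - v)\, u^2,\; \tfrac{1}{6}\, v(1 - v)(1 - 2v)\, u^3\right)$
$I = \mathrm{diag}(g_1, g_2, g_3)$ in the coordinates $\eta$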
![Page 42: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/42.jpg)
$I = J^T\, \tilde{I}\, J$
$J$: Jacobian, singular
![Page 43: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/43.jpg)
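$\Delta w \sim O(1)$
$\Delta u \sim O\left(\frac{1}{u^2}\right)$
$\Delta v \sim O\left(\frac{1}{u^3}\right)$
$\Delta x_i \sim O\left(\frac{1}{u^2}\right)$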
![Page 44: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/44.jpg)
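synfiring resolves singularity
phase 1: $f_1(z) = (1 - v)\, \varphi(z - x_1) + v\, \varphi(z - x_2)$
phase 2: $f_2(z)$: a mixture of $\varphi(z - x_1)$ and $\varphi(z - x_2)$ with different weights
$I$: regular as $u \to 0$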
![Page 45: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/45.jpg)
[Figure: two stimuli $x_1$, $x_2$ on the field axis $z$]
![Page 46: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/46.jpg)
synfiring mechanism
common multiplicative noise
[Figure: field positions $z_1$, $z_2$]
![Page 47: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/47.jpg)
S. Amari and H. Nagaoka, Methods of Information Geometry, AMS & Oxford Univ. Press, 2000
![Page 48: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/48.jpg)
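Mathematical Neurons
$y = \varphi\left(\sum w_i x_i - h\right) = \varphi(w \cdot x - h)$
[Figure: neuron mapping $x \to y$; activation $\varphi(u)$ plotted against $u$]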
![Page 49: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/49.jpg)
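Multilayer Perceptrons
$y = \sum v_i\, \varphi(w_i \cdot x) + n$
$p(y \mid x; \theta) = c \exp\left\{-\frac{1}{2}\, (y - f(x, \theta))^2\right\}$
$f(x, \theta) = \sum v_i\, \varphi(w_i \cdot x)$
$x = (x_1, x_2, \dots, x_n)$
$\theta = (w_1, \dots, w_m; v_1, \dots, v_m)$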
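The model above can be written down in a few lines. This is a minimal sketch that assumes `tanh` as the activation and unit-variance Gaussian output noise; the function names and the example weights are hypothetical and only illustrate the structure of the parameter vector.

```python
import math

def mlp_output(x, weights, v, phi=math.tanh):
    """f(x, theta) = sum_i v_i * phi(w_i . x); tanh is an assumed choice of activation."""
    return sum(
        v_i * phi(sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)))
        for w_i, v_i in zip(weights, v)
    )

def log_likelihood(y, x, weights, v):
    """log p(y | x; theta) for y = f(x, theta) + n with n ~ N(0, 1) (assumed noise level)."""
    err = y - mlp_output(x, weights, v)
    return -0.5 * err * err - 0.5 * math.log(2.0 * math.pi)

# Two hidden units, two inputs (illustrative numbers).
weights = [[0.0, 0.0], [1.0, -0.5]]   # w_1, w_2
v = [3.0, 2.0]                        # v_1, v_2
x = [1.0, 2.0]

y_hat = mlp_output(x, weights, v)     # both pre-activations are 0 here, so the output is 0
print(y_hat, log_likelihood(0.5, x, weights, v))
```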
![Page 50: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/50.jpg)
Multilayer Perceptron
$y = f(x, \theta) = \sum v_i\, \varphi(w_i \cdot x)$
$\theta = (w_1, \dots, w_m; v_1, \dots, v_m)$
[Figure: the neuromanifold embedded in the space of functions $f(x)$]
![Page 51: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/51.jpg)
Neuromanifold
• Metrical structure
• Topological structure
![Page 52: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/52.jpg)
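Riemannian manifold
$ds^2 = |d\theta|^2 = \sum g_{ij}\, d\theta_i\, d\theta_j = d\theta^T G\, d\theta$
$g_{ij}(\theta) = E\left[\frac{\partial \log p(y \mid x; \theta)}{\partial \theta_i}\, \frac{\partial \log p(y \mid x; \theta)}{\partial \theta_j}\right]$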
![Page 53: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/53.jpg)
Geometry of singular model
$y = v\, \varphi(w \cdot x) + n$
singular set: $v\, |w| = 0$
[Figure: parameter space with axes $w$ and $v$]
![Page 54: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/54.jpg)
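Gaussian mixture
$p(x; v, w_1, w_2) = (1 - v)\, \varphi(x - w_1) + v\, \varphi(x - w_2)$
$\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2} x^2\right\}$
singular: $w_1 = w_2$, $v(1 - v) = 0$
[Figure: parameter space with axes $w_1$, $w_2$, $v$]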
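The singular set is where different parameter values give the same distribution. The short sketch below illustrates this for a two-component unit-variance Gaussian mixture; the evaluation grid and the parameter values are arbitrary choices for the example.

```python
import math

def gauss(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def mixture(x, v, w1, w2):
    """p(x; v, w1, w2) = (1 - v) gauss(x - w1) + v gauss(x - w2)."""
    return (1.0 - v) * gauss(x - w1) + v * gauss(x - w2)

xs = [k * 0.5 for k in range(-8, 9)]   # evaluation grid (illustrative)

# When w1 == w2 the mixing weight v has no effect on the density.
same_center_gap = max(abs(mixture(x, 0.2, 1.0, 1.0) - mixture(x, 0.9, 1.0, 1.0)) for x in xs)

# When v == 0 the second centre w2 has no effect on the density.
zero_weight_gap = max(abs(mixture(x, 0.0, 1.0, -3.0) - mixture(x, 0.0, 1.0, 5.0)) for x in xs)

# Away from the singular set, changing v does change the density.
regular_gap = max(abs(mixture(x, 0.2, -1.0, 1.0) - mixture(x, 0.9, -1.0, 1.0)) for x in xs)

print(same_center_gap, zero_weight_gap, regular_gap)
```

Because a whole line of parameters maps to one density on the singular set, the Fisher information matrix cannot have full rank there.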
![Page 55: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/55.jpg)
Topological Singularities
S
M
![Page 56: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/56.jpg)
singularities
![Page 57: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/57.jpg)
Singularity of MLP---example
![Page 58: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/58.jpg)
Backpropagation: gradient learning
examples: $(y_1, x_1), \dots, (y_t, x_t)$: training set
$E(y, x; \theta) = \frac{1}{2}\, |y - f(x, \theta)|^2 = -\log p(y, x; \theta)$
$\Delta\theta_t = -\eta_t\, \frac{\partial E}{\partial \theta}$
$f(x, \theta) = \sum v_i\, \varphi(w_i \cdot x)$
![Page 59: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/59.jpg)
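Information Geometry of MLP
Natural Gradient Learning: S. Amari; H.Y. Park
$\Delta\theta_t = -\eta_t\, G^{-1}\, \frac{\partial E}{\partial \theta}$
$G^{-1}_{t+1} = (1 + \varepsilon_t)\, G^{-1}_t - \varepsilon_t\, G^{-1}_t\, \nabla f\, (\nabla f)^T\, G^{-1}_t$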
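A minimal sketch of natural gradient learning with an adaptively estimated inverse metric follows. It reads the recursion on this slide as $G^{-1}_{t+1} = (1+\varepsilon_t)G^{-1}_t - \varepsilon_t G^{-1}_t \nabla f (\nabla f)^T G^{-1}_t$, which is an interpretation of the slide rather than a confirmed transcription; the helper names and the numbers are hypothetical. The recursion is a first-order approximation in $\varepsilon_t$ to inverting the running average $(1-\varepsilon_t)G_t + \varepsilon_t \nabla f (\nabla f)^T$.

```python
def mat_vec(m, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]

def update_inverse_metric(g_inv, grad_f, eps):
    """One step of the assumed recursion
    G^{-1} <- (1 + eps) G^{-1} - eps (G^{-1} grad_f)(G^{-1} grad_f)^T,
    which uses the symmetry of G^{-1}."""
    u = mat_vec(g_inv, grad_f)
    n = len(u)
    return [[(1.0 + eps) * g_inv[i][j] - eps * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def natural_gradient_step(theta, grad_loss, g_inv, eta):
    """theta <- theta - eta * G^{-1} * dE/dtheta."""
    direction = mat_vec(g_inv, grad_loss)
    return [t - eta * d for t, d in zip(theta, direction)]

# Illustrative numbers: start from the identity metric.
g_inv = [[1.0, 0.0], [0.0, 1.0]]
grad_f = [1.0, 2.0]          # gradient of the network output w.r.t. theta
new_g_inv = update_inverse_metric(g_inv, grad_f, eps=1e-3)
print(new_g_inv)

# Exact inverse of (1 - eps) G + eps grad_f grad_f^T for comparison (2x2 case).
a, b, d = 0.999 + 0.001 * 1.0, 0.001 * 2.0, 0.999 + 0.001 * 4.0
det = a * d - b * b
exact = [[d / det, -b / det], [-b / det, a / det]]
print(exact)

theta_next = natural_gradient_step([1.0, 1.0], [2.0, 4.0], [[0.5, 0.0], [0.0, 0.25]], eta=1.0)
print(theta_next)
```

The recursion avoids a matrix inversion at every step, which is the practical reason for tracking the inverse metric directly.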
![Page 60: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/60.jpg)
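$y = v_1\, \varphi(w_1 \cdot x) + v_2\, \varphi(w_2 \cdot x) + n$
$w = w_1 + w_2$
$v = v_1 + v_2$
$u = w_2 - w_1$
$z = v_2 - v_1$
[Figure: network $x \to y$ with hidden weights $w_1$, $w_2$ and output weights $v_1$, $v_2$]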
![Page 61: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/61.jpg)
![Page 62: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/62.jpg)
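2 hidden-units: $y = v_1\, \varphi(w_1 \cdot x) + v_2\, \varphi(w_2 \cdot x) + n$
$u = w_2 - w_1$
[Equations: the coordinates $w$ and $z$, defined from $w_1$, $w_2$, $v_1$, $v_2$ with normalization by $v$; the exact expressions are not legible]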
![Page 63: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/63.jpg)
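Dynamics of Learning
$\frac{d\theta}{dt} = -\eta\, \frac{\partial l}{\partial \theta}$, $\frac{d\theta}{dt} = -\eta\, G^{-1}\, \frac{\partial l}{\partial \theta}$
$\frac{du}{dt} = f(u, z)$, $\frac{dz}{dt} = k(u, z)$
$\frac{du}{dz} = \frac{f(u, z)}{k(u, z)}$
[Equation: trajectories as a relation between $u^2$, $z^2$ and $\log z$ with integration constant $c$; the coefficients are not legible]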
![Page 64: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/64.jpg)
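The teacher is on singularity
[Equations: $\frac{du}{dt}$ and $\frac{dz}{dt}$ as polynomials in $u$ and $z$ with a common factor $\frac{1}{4} A$; their ratio $\frac{dz}{du}$; and the resulting trajectories, a relation between $u^2$, $z^2$ and $\log z$ with integration constant $c$. The coefficients are not legible]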
![Page 65: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/65.jpg)
![Page 66: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/66.jpg)
![Page 67: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/67.jpg)
![Page 68: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/68.jpg)
![Page 69: Information Geometry and Neural Netowrks](https://reader035.vdocuments.mx/reader035/viewer/2022062304/568145be550346895db2c920/html5/thumbnails/69.jpg)