The free-energy principle: a rough guide to the brain? K Friston
Computational Modeling of Intelligence, 11.03.04 (Fri). Summarized by Joon Shik Kim.
![Page 1: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/1.jpg)
The free-energy principle: a rough guide to the brain?
K Friston
Computational Modeling of Intelligence, 11.03.04 (Fri)
Summarized by Joon Shik Kim
![Page 2: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/2.jpg)
Sufficient Statistics
• Quantities which are sufficient to parameterise a probability density (e.g., mean and covariance of a Gaussian density).
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
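As a minimal sketch of the idea, assuming a univariate Gaussian, the two numbers returned below (mean and variance) are all that is needed to evaluate the density anywhere. The function names and the sample values are hypothetical and chosen only for illustration.

```python
import math

def sufficient_statistics(samples):
    """Return the mean and maximum-likelihood variance of the samples."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance

def gaussian_density(x, mean, variance):
    """Evaluate the univariate Gaussian density at x."""
    norm = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return norm * math.exp(-((x - mean) ** 2) / (2.0 * variance))

samples = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical data
mu, var = sufficient_statistics(samples)
print(mu, var, gaussian_density(mu, mu, var))
```

Once `mu` and `var` are known, the raw samples are no longer needed to describe the fitted density.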
![Page 3: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/3.jpg)
Surprise
• Surprise, or self-information, is the negative log-probability of an outcome. An improbable outcome is therefore surprising.
$$-\ln p(y \mid \alpha)$$
$y$: sensory input
$\alpha$: action
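A small illustration of the definition, assuming the probability of the outcome is already known: surprise is just the negative natural logarithm of that probability, measured in nats. The function name and the example probabilities are hypothetical.

```python
import math

def surprise(probability):
    """Self-information (in nats) of an outcome with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log(probability)

likely = surprise(0.9)     # a probable outcome
unlikely = surprise(0.01)  # an improbable outcome
print(likely, unlikely)
```

The improbable outcome yields the larger value, which is the sense in which it is "surprising".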
![Page 4: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/4.jpg)
Kullback-Leibler Divergence
• Information divergence, information gain, cross or relative entropy is a non-commutative measure of the difference between two probability distributions.
$$D_{KL}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx$$
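The non-commutativity can be checked numerically. The sketch below assumes discrete distributions, so the integral becomes a sum; the two example distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """Discrete KL divergence D(P || Q) in nats.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)
reverse = kl_divergence(q, p)
print(forward, reverse)  # the two directions give different values
```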
![Page 5: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/5.jpg)
Conditional Density
• The conditional or posterior density is the probability distribution of causes or model parameters, given some data; i.e., a probabilistic mapping from observed data to causes.
$$P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}$$
$$P(E) = \sum_i p(E \mid H_i)\, p(H_i)$$
$E$: evidence
$H$: hypothesis
$p(H)$: prior
$p(H \mid E)$: posterior
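A worked example of Bayes' rule over a finite set of hypotheses may help here. The priors and likelihoods are hypothetical numbers; the function returns both the posterior and the evidence term obtained by summing over hypotheses.

```python
def posterior(priors, likelihoods):
    """Return (posterior over hypotheses, evidence P(E)) via Bayes' rule."""
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(E) = sum_i p(E | H_i) p(H_i)
    return [j / evidence for j in joint], evidence

priors = [0.5, 0.5]       # p(H_1), p(H_2)
likelihoods = [0.8, 0.2]  # p(E | H_1), p(E | H_2)
post, evidence = posterior(priors, likelihoods)
print(post, evidence)
```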
![Page 6: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/6.jpg)
Generative Model
• A generative or forward model is a probabilistic mapping from causes to observed consequences (data). It is usually specified in terms of the likelihood of getting some data given their causes (parameters of a model) and priors on the parameters.
$$p(w \mid D) \propto p(D \mid w)\, p(w)$$
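To make the forward and inverse directions concrete, here is a sketch that assumes a very simple generative model: a coin whose bias `w` is the hidden cause. The model, the grid of candidate values, and the flat prior are all illustrative assumptions, not part of the original slides.

```python
import random

def generate(w, n, rng):
    """Forward model: sample n coin flips given the cause w = P(heads)."""
    return [1 if rng.random() < w else 0 for _ in range(n)]

def likelihood(data, w):
    """p(D | w) for independent coin flips."""
    heads = sum(data)
    tails = len(data) - heads
    return (w ** heads) * ((1.0 - w) ** tails)

def grid_posterior(data, grid, prior):
    """p(w | D) on a grid, proportional to likelihood times prior."""
    unnormalised = [likelihood(data, w) * p for w, p in zip(grid, prior)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

rng = random.Random(0)
grid = [i / 20 for i in range(1, 20)]  # candidate values 0.05 .. 0.95
prior = [1.0 / len(grid)] * len(grid)  # flat prior over the grid
data = generate(0.7, 50, rng)          # causes -> consequences
post = grid_posterior(data, grid, prior)  # consequences -> causes
print(grid[post.index(max(post))])
```

The call to `generate` runs the model forwards, while `grid_posterior` inverts it by combining the likelihood with the prior.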
![Page 7: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/7.jpg)
Prior
• The probability distribution or density on the causes of data that encodes beliefs about those causes prior to observing the data.
$p(H)$ or $p(w)$
![Page 8: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/8.jpg)
Empirical Priors
• Priors that are induced by hierarchical models; they provide constraints on the recognition density in the usual way but depend on the data.
![Page 9: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/9.jpg)
Bayesian Surprise
• A measure of salience based on the divergence between the recognition and prior densities. It measures the information in the data that can be recognised.
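A minimal sketch, assuming discrete hypotheses and using the exact posterior as the recognition density: Bayesian surprise is then the KL divergence from the prior to the posterior. The numbers are hypothetical; the first observation is rare under every hypothesis yet leaves beliefs unchanged.

```python
import math

def bayes_update(prior, likelihoods):
    """Posterior over discrete hypotheses."""
    joint = [lik * pri for lik, pri in zip(likelihoods, prior)]
    total = sum(joint)
    return [j / total for j in joint]

def bayesian_surprise(prior, likelihoods):
    """KL divergence between the posterior and the prior, in nats."""
    post = bayes_update(prior, likelihoods)
    return sum(q * math.log(q / p) for q, p in zip(post, prior) if q > 0.0)

prior = [0.5, 0.5]
rare_but_uninformative = [0.01, 0.01]  # improbable under both hypotheses
informative = [0.9, 0.1]               # favours the first hypothesis
print(bayesian_surprise(prior, rare_but_uninformative))
print(bayesian_surprise(prior, informative))
```

The rare observation produces no Bayesian surprise because it does not shift the beliefs, whereas the informative one does.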
![Page 10: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/10.jpg)
Entropy
• The average surprise of outcomes sampled from a probability distribution or density. A density with low entropy means, on average, the outcome is relatively predictable.
$$S = -\int p(x) \ln p(x) \, dx$$
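The link between entropy and predictability can be seen with a short discrete example, where the integral becomes a sum. Both distributions below are hypothetical.

```python
import math

def entropy(p):
    """Average surprise (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

uniform = [0.25, 0.25, 0.25, 0.25]  # every outcome equally likely
peaked = [0.97, 0.01, 0.01, 0.01]   # one outcome dominates
print(entropy(uniform), entropy(peaked))
```

The peaked distribution has the lower entropy: its outcome is, on average, easier to predict.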
![Page 11: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/11.jpg)
Ergodic
• A process is ergodic if its long-term time-average converges to its ensemble average. Ergodic processes that evolve for a long time forget their initial states.
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k x) = \frac{1}{\mu(X)} \int f \, d\mu$$
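As an illustration, the sketch below assumes a standard textbook example of an ergodic map, rotation of the unit interval by an irrational step, which is not taken from the slides. With f(x) = x and the uniform measure on [0, 1), the ensemble average is 0.5, and the time average approaches it from any starting point.

```python
import math

def time_average(f, x0, step, n):
    """Average f along the orbit x, T(x), T(T(x)), ... with T(x) = (x + step) mod 1."""
    x = x0
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = (x + step) % 1.0
    return total / n

step = math.sqrt(2.0) % 1.0  # irrational rotation step (assumed example)
identity = lambda x: x       # ensemble average of f(x) = x on [0, 1) is 0.5
from_a = time_average(identity, 0.1, step, 100_000)
from_b = time_average(identity, 0.8, step, 100_000)
print(from_a, from_b)
```

Both starting points give nearly the same time average, which illustrates how a long-running ergodic process forgets its initial state.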
![Page 12: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/12.jpg)
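Free-energy
• An information theory measure that bounds (is greater than) the surprise on sampling some data, given a generative model.
$$F(y, \mu \mid m) = E - TS$$
$$= -\langle \ln p(y, \vartheta \mid m) \rangle_q + \langle \ln q(\vartheta; \mu) \rangle_q$$
$$F = D\big(q(\vartheta; \mu) \,\|\, p(\vartheta \mid y)\big) - \ln p(y \mid m)$$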
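The bound can be verified numerically on a toy problem. This sketch assumes a hidden cause with only two values and a hypothetical joint probability for one fixed observation. It computes free energy as expected energy minus entropy under a recognition density `q`, then checks that it equals the divergence from the true posterior plus the surprise.

```python
import math

def free_energy(q, joint):
    """F = -<ln p(y, cause)>_q + <ln q(cause)>_q for a discrete cause."""
    energy = -sum(qi * math.log(ji) for qi, ji in zip(q, joint) if qi > 0.0)
    negative_entropy = sum(qi * math.log(qi) for qi in q if qi > 0.0)
    return energy + negative_entropy

def kl_divergence(q, p):
    """Discrete KL divergence D(q || p) in nats."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)

joint = [0.3, 0.1]                      # hypothetical p(y, cause) for the observed y
evidence = sum(joint)                   # p(y)
true_posterior = [j / evidence for j in joint]
surprise = -math.log(evidence)

q = [0.5, 0.5]                          # an arbitrary recognition density
F = free_energy(q, joint)
print(F, kl_divergence(q, true_posterior) + surprise, surprise)
```

Because the divergence term is never negative, `F` is at least as large as the surprise, and the two coincide when `q` equals the true posterior.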
![Page 13: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/13.jpg)
Generalised Coordinates
• Generalised coordinates of motion cover the value of a variable, its motion, acceleration, jerk and higher orders of motion. A point in generalised coordinates corresponds to a path or trajectory over time.
$$\tilde{u} = (u, u', u'', \ldots)$$
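One way to see why a single point encodes a trajectory is a Taylor expansion: given the value and its temporal derivatives at one instant, nearby values along the path can be reconstructed. The sketch below is an illustrative assumption about how to read the definition, and the function name is hypothetical.

```python
import math

def trajectory_value(generalised, dt):
    """Approximate u(t + dt) from (u, u', u'', ...) at time t via a Taylor series."""
    return sum(d * dt ** k / math.factorial(k) for k, d in enumerate(generalised))

# Position 1, velocity 2, acceleration -4, all higher orders zero.
point = [1.0, 2.0, -4.0]
print(trajectory_value(point, 0.5))

# For u(t) = exp(t) every derivative at t = 0 equals 1.
exp_point = [1.0] * 8
print(trajectory_value(exp_point, 0.1), math.exp(0.1))
```

The more orders of motion the point carries, the longer the stretch of trajectory it describes accurately.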
![Page 14: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/14.jpg)
Gradient Descent
• An optimization scheme that finds a minimum of a function by changing its arguments in proportion to the negative of the gradient of the function at the current value.
$$w(t+1) = w(t) - \eta \frac{\partial E}{\partial w}$$
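A minimal sketch of the update rule, assuming a one-dimensional quadratic objective E(w) = (w - 3)^2 whose gradient is known in closed form. The objective, the step size `eta`, and the number of steps are hypothetical choices.

```python
def gradient_descent(grad, w0, eta, steps):
    """Repeatedly move w against the gradient, scaled by the step size eta."""
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)
    return w

def grad_E(w):
    """Gradient of the assumed objective E(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

w_min = gradient_descent(grad_E, w0=0.0, eta=0.1, steps=100)
print(w_min)
```

With this step size the distance to the minimum shrinks by a constant factor at every iteration, so `w_min` ends up very close to 3.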
![Page 15: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/15.jpg)
Helmholtz Machine
• Devices or schemes that use a generative model to furnish a recognition density. They learn hidden structure in data by optimising the parameters of generative models.
![Page 16: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/16.jpg)
Stochastic
• The successive states of stochastic processes are governed by random effects.
![Page 17: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/17.jpg)
Free Energy
![Page 18: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/18.jpg)
Dynamic Model of World and Recognition
![Page 19: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/19.jpg)
Neuronal Architecture
![Page 20: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/20.jpg)
What is the computational role of neuromodulation?
• Previous treatments suggest that modulatory neurotransmitters have distinct roles; for example, ‘dopamine signals the error in reward prediction, serotonin controls the time scale of reward prediction, noradrenalin controls the randomness in action selection, and acetylcholine controls the speed of memory update’. This contrasts with a single role in encoding precision above. Can the apparently diverse functions of these neurotransmitters be understood in terms of one role (encoding precision) in different parts of the brain?
![Page 21: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/21.jpg)
Can we entertain ambiguous percepts?
• Although not an integral part of the free-energy principle, we claim the brain uses unimodal recognition densities to represent one thing at a time. Although there is compelling evidence for bimodal ‘priors’ in sensorimotor learning, people usually assume the ‘recognition’ density collapses to a single percept when sensory information becomes available. The implicit challenge here is to find any electrophysiological or psychological evidence for multimodal recognition densities.
![Page 22: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/22.jpg)
Does avoiding surprise suppress salient information?
• No; a careful analysis of visual search and attention suggests that: ‘only data observations which substantially affect the observer’s beliefs yield (Bayesian) surprise, irrespectively of how rare or informative in Shannon’s sense these observations are.’ This is consistent with active sampling of things we recognize (to reduce free-energy). However, it remains an interesting challenge to formally relate Bayesian surprise to the free-energy bound on (Shannon) surprise. A key issue here is whether saliency can be shown to depend on top-down perceptual expectations.
![Page 23: The free-energy principle : a rough guide to the brain ? K Friston](https://reader036.vdocuments.mx/reader036/viewer/2022081520/5681661d550346895dd96e6d/html5/thumbnails/23.jpg)
Which optimisation schemes does the brain use?
• We have assumed that the brain uses a deterministic gradient descent on free-energy to optimise action and perception. However, it might also use stochastic searches, sampling the sensorium randomly for a percept with low free-energy. Indeed, there is compelling evidence that our eye movements implement an optimal stochastic strategy. This raises interesting questions about the role of stochastic searches, from visual search to foraging, in both perception and action.
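To contrast with a deterministic gradient scheme, here is a rough sketch of a stochastic search: candidates are drawn at random and the one with the lowest objective value is kept. The quadratic objective stands in for free-energy purely as an assumption, and nothing here is claimed about how the brain samples.

```python
import random

def random_search(objective, low, high, samples, rng):
    """Draw candidates uniformly from [low, high] and keep the lowest-scoring one."""
    best_x = rng.uniform(low, high)
    best_value = objective(best_x)
    for _ in range(samples - 1):
        x = rng.uniform(low, high)
        value = objective(x)
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value

def objective(x):
    """Hypothetical stand-in for free-energy, minimised at x = 2."""
    return (x - 2.0) ** 2

rng = random.Random(0)
best_x, best_value = random_search(objective, -5.0, 5.0, 2000, rng)
print(best_x, best_value)
```

Unlike gradient descent, this scheme needs no derivative of the objective, at the cost of many more evaluations.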