robotic models of active perception
TRANSCRIPT
Robotic Models of Active Perception
Dimitri Ognibene, PhD, Laboratory for Morphological Computation and Learning
(www.thrish.org)
Substituting for humans in dangerous jobs is one of the main goals of robotics.
The actions in these pictures are already possible for today's robots.
However.....
Perceiving in these environments is very complex:
• Unstructured
• Changing
• Many different objects of different scales and shapes
• Occlusions
• Other agents to perceive and coordinate with
Currently only humans are able to cope with such a level of perceptual complexity...
And humans perceive actively...
Active Perception
Ognibene & Demiris 2013
• Robotics
• Neuroscience
• Automatic Diagnosis
• Smart Devices & Environments
• Data mining
Foveal Vision (What does it mean to perceive actively?)
Try to grasp an apple with foveal vision... Seeing becomes like sampling and remembering.
Active Perception (AP) Issues*
• Where to look?
• What to remember?
• When to stop looking and start acting?
  – Enough information?
  – Enough time?
  – Is the acquired information still valid?
*See also The Frame Problem
Where to look? Use only image statistics?
Itti & Baldi 2010
The main limits of basic saliency models are:
• No task information
• They do not consider the limited field of view
Where to look? Information on Demand
Yarbus 1967
Where to look? Context and task information used to drive perception to the target
Vogel & de Freitas 2008
Unknown Task or Goal
• Task/Goal depending on other agents' presence/goals
• Multiple affordances required for the task
Ognibene & Demiris IJCAI 2013
Active Perception and Mirror Neurons
• Encode the action goal
• Abstract the trajectory
• Need perceptions
Can the Motor Control System predict others' actions?
Human Robot Interaction as a Distributed Dynamic Event
Ognibene & Demiris 2013
Predictive Action Recognition
Field of view
Ognibene & Demiris 2013
Effective Perception-Environment Coupling is necessary for timely Recognition and Survival
Field of view
Different hypotheses of the target position: equally probable, not yet seen
Ognibene & Demiris IJCAI 2013
See also "Perceptions as hypotheses: saccades as experiments", Friston et al. 2012
Perceive to reduce uncertainty
Field of view
Hand movement changes the distribution
Ognibene & Demiris IJCAI 2013
Perceive to reduce uncertainty
Field of view
Saccade to target hypothesis
Ognibene & Demiris IJCAI 2013
Perceive to reduce uncertainty
Field of view
No target at the observed position
Ognibene & Demiris IJCAI 2013
Perceive to reduce uncertainty
Field of view
Update the distribution
Ognibene & Demiris 2013
Perceive to reduce uncertainty
2. Active Event Recognition

In this section the AER problem is defined and a solution based on a mixture of Kalman filters (KF) using Information Gain (AERIG) is described.

Problem definition. The graphical model in figure 2 displays the formulation of the problem. The discrete hidden stochastic variable V represents the class of the event which is taking place, characterised by a different dynamic of the environment that the agent must predict and recognise. The environment is composed of a fixed set of elements E = {e_1, e_2, ..., e_N} and thus its state X^t at time t is composed of the states X_i^t of the different elements. For each value of V the evolution of X^t is determined by a different dynamic system with different independence conditions between the elements. At each time step the agent receives for each element i an observation o_i^t which depends on the current configuration of the sensors \theta^t. The states and observations are continuous variables.

At every time step the goal of the system is to select the configuration \theta^t that will minimise the expected uncertainty over V (quantified by the entropy H):

\theta^t = \arg\min_{\theta^t} \int_O p(o^t \mid o^{0 \ldots t-1}, \theta^t)\, H(V \mid o^{0 \ldots t}, \theta^{0 \ldots t})\, do^t \quad (1)

Proposed solution. For the recognition of the event and for the selection of the sensor configuration it is necessary to compute the posterior P(v \mid o^t; \theta^t). Given a prior distribution P(v, x^t_{1:N}) = P(x^t_{1:N} \mid v) P(v) and the independence of the observed event from the sensor configuration, P(v \mid \theta) = P(v), the update expression of the posterior P(v \mid o^{t+1}, \theta^{t+1}) can be derived through Bayes' rule:

P(v \mid o^{t+1}, \theta^{t+1}) = \frac{P(o^{t+1} \mid v, \theta^{t+1})\, P(v)}{P(o^{t+1} \mid \theta^{t+1})} \quad (2)

The computation of eq. (1) and eq. (2) in the general case can pose severe computational complexities. The proposed solution is based on the assumption that, once v is fixed, the dynamics are linear and the probability distributions are normal. This enables the use of a mixture of KFs with a distinct KF for each value of v. Denoting with \bar{o}^{t+1}_{v,\theta^{t+1}} the mean expected observation and with S^{t+1}_{v,\theta^{t+1}} its covariance matrix, both of which are conditioned on v and \theta and computed during the KF update, the following can be derived:

\theta^{t+1} = \arg\min_{\theta^{t+1}} \sum_v P(v) \Big( \tfrac{1}{2} \ln |S^{t+1}_{v,\theta^{t+1}}| + \int_O \mathcal{N}(o;\, \bar{o}^{t+1}_{v,\theta^{t+1}},\, S^{t+1}_{v,\theta^{t+1}}) \ln P(o \mid \theta^{t+1})\, do \Big) \quad (3)

where |S| is the determinant of a matrix S. The first-order Taylor expansion of P(o \mid \theta) at the point \bar{o}^{t+1}_v results in:

\theta^{t+1} \approx \arg\min_{\theta} \sum_v P(v) \Big[ \tfrac{1}{2} \ln |S^{t+1}_{v,\theta^{t+1}}| + \ln \sum_{v'} P(v')\, \mathcal{N}(\bar{o}_{v,\theta};\, \bar{o}^{t+1}_{v',\theta^{t+1}},\, S^{t+1}_{v',\theta^{t+1}}) \Big] \quad (4)
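The posterior update of eq. (2) with one Kalman filter per event class can be sketched as follows. This is a minimal 1-D illustration: the dynamics coefficients, noise levels and the two hypothetical event classes ("approach" vs "retreat") are assumptions, not parameters from the paper.

```python
import numpy as np

# One 1-D Kalman filter per event class v (hypothetical linear-Gaussian models).
class KF1D:
    def __init__(self, a, q, r, x0=1.0, p0=0.1):
        self.a, self.q, self.r = a, q, r      # dynamics, process var, obs var
        self.x, self.p = x0, p0               # state mean and variance

    def step(self, o):
        """Predict, return the predictive likelihood of o, then correct."""
        x_pred = self.a * self.x
        p_pred = self.a * self.p * self.a + self.q
        s = p_pred + self.r                    # innovation variance
        lik = np.exp(-0.5 * (o - x_pred) ** 2 / s) / np.sqrt(2 * np.pi * s)
        k = p_pred / s                         # Kalman gain
        self.x = x_pred + k * (o - x_pred)
        self.p = (1 - k) * p_pred
        return lik

def update_event_posterior(prior, filters, o):
    """Bayes rule of eq. (2): P(v | o, theta) ∝ P(o | v, theta) P(v)."""
    post = prior * np.array([kf.step(o) for kf in filters])
    return post / post.sum()

# Two hypothetical event classes: "approach" (state grows) vs "retreat".
filters = [KF1D(a=1.5, q=0.01, r=0.05), KF1D(a=0.7, q=0.01, r=0.05)]
p_v = np.array([0.5, 0.5])
for o in [1.5, 2.25, 3.375]:                  # observations consistent with a=1.5
    p_v = update_event_posterior(p_v, filters, o)
print(p_v)                                    # posterior concentrates on "approach"
```

As the observations keep matching the predictions of one model, its predictive likelihood dominates and the posterior over event classes sharpens, which is exactly what makes the entropy term in eq. (1) shrink.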
Info Gain Perception Control for Intention Anticipation
Minimizing event uncertainty (conditional entropy H(V | ...))
Ognibene & Demiris IJCAI 2013
Info Gain Using Kalman Filters
Expected entropy for hypothesis v
Difference of predictions between the models
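A toy instance of the eq. (4) criterion: each candidate sensor configuration θ is scored by the expected-entropy surrogate, so gaze is sent where the per-class predicted observations disagree most. The two gaze targets ("hand", "goal") and their predicted (mean, variance) pairs are invented for illustration.

```python
import numpy as np

def gauss(x, mu, var):
    """1-D Gaussian density N(x; mu, var)."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def score(theta_preds, p_v):
    """Eq. (4) surrogate for one sensor configuration.
    theta_preds: list of (mean, var) predicted observations, one per class v."""
    total = 0.0
    for v, (mu_v, s_v) in enumerate(theta_preds):
        # Mixture density of all models' predictions, evaluated at model v's mean.
        mix = sum(p_v[w] * gauss(mu_v, mu_w, s_w)
                  for w, (mu_w, s_w) in enumerate(theta_preds))
        total += p_v[v] * (0.5 * np.log(s_v) + np.log(mix))
    return total

p_v = np.array([0.5, 0.5])
# Hypothetical gaze targets: at "hand" the two event models predict the same
# observation; at "goal" they disagree, so looking there is more informative.
configs = {
    "hand": [(0.0, 0.1), (0.0, 0.1)],   # identical predictions
    "goal": [(0.0, 0.1), (2.0, 0.1)],   # models disagree by 2.0
}
best = min(configs, key=lambda th: score(configs[th], p_v))
print(best)
```

The "goal" configuration wins because the log-mixture term is smaller when the models' predicted observations are far apart: observing there is expected to discriminate between the hypotheses.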
Gaze target during event observation
[Figure: ratio of saccades on each element over time steps, for the conditions performer best, target best, target not best, performer not best]
Ognibene & Demiris IJCAI 2013
Modelling the temporal coupling of perception with external events
Results
Ognibene & Demiris IJCAI 2013
Multiple Complex Simultaneous Activities
Hierarchical Action Representation to Represent Temporal Structure
Probabilistic Grammars, Dynamic Bayes Networks
Lee, Ognibene, Chang, Kim, Demiris (Submitted)
STARE: Spatio-Temporal Attention Relocation for Multiple Structured Activities Detection
Active Perception (AP) Issues
• Where to look?
• What to remember?
• When to stop looking and start acting?
  – Enough information?
  – Enough time?
  – Is the acquired information still valid?
Active Perception Issues
• Why has evolution selected attention and the reduction of the perceptive space for many species?
• Why does a massively parallel system, like the brain, need to use a serial mechanism like attention?
Is AP just useful to cope with hidden information?
Active Perception Issues
• How are decision making and planning affected by AP? How is computation affected by AP?
• Is AP in the brain reflected by a peculiar kind of "active processing"?
• How is learning affected by AP?
• How are representations affected by AP?
• How can the brain self-organise to support AP?
• How would a dysfunction of AP manifest?
Perception control is strongly dependent on the task.
Learning a new task may require learning a new perception-control policy.
Active Perception and Learning
Ognibene & Baldassarre, IEEE TAMD, 2014
Foveal Vision and Saliency Maps May Speed Up the Learning of "Ecological Tasks"
Ognibene & Baldassarre, IEEE TAMD, 2014
Subjective and efficient representations
Ognibene & Baldassarre, IEEE TAMD, 2014
The agent has a fovea and can see colours only at the centre of its field of view. The agent is rewarded if it touches the red block. The red block is always on the left of the green blocks. Green blocks are very easy to find. Blue blocks are randomly positioned distractors. What will be the right action to take, and the right representation to learn, for the blue object?
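The task structure described above can be sketched as a toy world generator. The grid size, number of distractors and function names are assumptions for illustration, not the exact setup of Ognibene & Baldassarre 2014; only the stated constraints (red immediately left of a green block, blue blocks random) are taken from the text.

```python
import random

# Hypothetical reconstruction of the task layout: the rewarded red block sits
# immediately to the left of a green block; blue blocks are random distractors.
def sample_world(width=5, height=5, n_blue=3, rng=random):
    green = (rng.randrange(1, width), rng.randrange(height))  # column >= 1
    red = (green[0] - 1, green[1])                            # left of green
    free = [(x, y) for x in range(width) for y in range(height)
            if (x, y) not in (green, red)]
    blues = rng.sample(free, n_blue)
    return {"red": red, "green": green, "blue": blues}

world = sample_world()
print(world)
```

The invariant that matters for the result described in the next slides is that the red block is perfectly predictable from the green one, while the blue blocks carry no positional information at all.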
Subjective and efficient representations
Ognibene & Baldassarre, IEEE TAMD, 2014
What will be the right action to take, and the right representation to learn, for the blue object? While a random action was expected, due to the random position of the blue block, the agent learns a well organised representation: it moves from the blue block up, down in the same column, or right. The policy learnt by the agent for the green and red blocks biased the agent's perception of the blue object, making it a landmark to find the red object and making the agent's behaviour effective even without memory.
Subjective and efficient representations
Ognibene & Baldassarre, IEEE TAMD, 2014
The policy learnt by the agent for the blue and red blocks biased the agent's perception of the blue object while making its behaviour effective. The agent usually starts from the green object and moves to elements in the adjacent column on its left, expecting to find the red object. This leads it to ignore the blue blocks that are not in the columns to the left of the green blocks (those inside the orange circle). The next picture shows the resulting perceived structure of the world.
Subjective and efficient representations
Ognibene & Baldassarre, IEEE TAMD, 2014
Perceived World biased by Active Perception: the policy learnt by the agent for the green and red blocks biased the agent's perception of the blue object, making it a landmark to find the red object and making the agent's behaviour effective even without memory.
Sequence of observations and their frequency (grey) after learning
Subjective and efficient representations
Ognibene & Baldassarre, IEEE TAMD, 2014
Representations Evolution
[Figure: evolution of the G, R, B representation maps]
Ognibene et al., SAB 2008
Representations are not formed in a uniform way. The system shows a sequential formation of different areas of activity. This may be due to the selective aspect of active perception, which enables perception, and change, only on a subset of stimuli.
Representations Evolution
[Figure: evolution of the G, R, B representation maps]
Ognibene et al., SAB 2008
As representations are not formed in a uniform way, the same holds for the behaviours acquired by the agent. The sequential formation of different areas of activity may not only be reflected in the sequentially acquired behaviours, but may also be caused by the increasing capability of the agent as it acquires other behaviours, giving rise to "scaffolding" supported by AP.
Active Perception Issues
• How are decision making and planning affected by AP? How is computation affected by AP?
• Is AP in the brain reflected by a peculiar kind of "active processing"?
• How are representations affected by AP?
• How is learning affected by AP?
• How can the brain self-organise to support AP?
• How would a dysfunction of AP manifest?
Intention-Aware Resource Allocation in 3D Tracking for Precision Manipulation
Initial Improvements
Introduction of constraints for spatio-temporal consistency, and optimisation to exploit GPUs and multicore CPUs
.... but STILL TOO SLOW
Intention-Aware Resource Allocation in 3D Tracking for Precision Manipulation
Humans are capable of fast adaptive reactions to unforeseen events...
which require fast (maybe imprecise) perception
Intention-Aware Resource Allocation in 3D Tracking for Precision Manipulation
[Figure: DARWIN Attention architecture. Camera and depth images feed a 2D Object Detector, a Mask Builder (using other object masks and occlusion information) and a Tracker with external motion info. A 3D Pose Estimator and an appearance-based fast tracker output, for each object representation, a 3D posture, confidence, class ID and rendered image to the DARWIN Cognitive Architecture; intention predictions drive context-sensitive resource allocation.]
A complex visual perception system running on parallel hardware, with direct and indirect dependencies between the components.
Active Perception and Computation to reduce uncertainty
• Intrinsic scene saliency: maximise the expected overall predictability (e.g. a moving object will also make salient the nearby objects that may occlude or deviate it)
• Agent intention -> raises saliency by changing predictions
1. Humans apply certain strategies to detect hard abnormalities in soft tissues
2. Optimally chosen speed and load of tactile probing will lead to improved tumour detection and better clinical outcomes
3. Embodied perception of the environment should be considered to define optimal probing behaviour
Jelizaveta Konstan:nova
Laboratory for Morphological Computa$on and Learning
(Thrish.org KCL)
Nantachai Sornkarn
Thrishantha Nanayakkara (PI)
Embodied Percep:on and Tac:le Explora:on
Embodied Percep:on
[1] J. Konstantinova, M. Li, M. Gautam, P. Dasgupta, K. Althoefer and T. Nanayakkara, "Behavioral Characteristics of Manual Palpation to Localize Hard Nodules in Soft Tissues", in press, IEEE Transactions on Biomedical Engineering, 2014.
[2] N. Sornkarn, T. Nanayakkara and M. Howard, "Internal Impedance Control Helps Information Gain in Embodied Perception", IEEE International Conference on Robotics and Automation (ICRA), 2014.
Human-Robot Haptic Guidance
Anuradha Ranasinghe
Thrishantha Nanayakkara (PI)
The guiding agent can be modelled as a 3rd-order predictive model using a simple linear auto-regressive (ARX) model. The human follower can be modelled as a 2nd-order reactive control policy.
The guider can modulate the pulling force in response to the confidence level of the follower.
The confidence of the follower correlates with the model's virtual damping and can be actively measured.
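The 3rd-order ARX structure mentioned above can be sketched as a least-squares identification problem. Everything here is synthetic and illustrative (coefficients, input signal, noise-free simulation); it only shows how such a predictive model would be fitted, not the haptic data of the study.

```python
import numpy as np

# Identify a hypothetical 3rd-order ARX model y_t = a1*y_{t-1} + a2*y_{t-2}
# + a3*y_{t-3} + b*u_t from simulated data, via ordinary least squares.
rng = np.random.default_rng(0)
true_a = np.array([0.5, 0.3, 0.1])            # autoregressive coefficients
b = 0.2                                       # exogenous-input coefficient
u = rng.normal(size=200)                      # e.g. a follower-related signal
y = np.zeros(200)
for t in range(3, 200):
    y[t] = true_a @ y[t-3:t][::-1] + b * u[t]

# Regressors [y_{t-1}, y_{t-2}, y_{t-3}, u_t] -> target y_t.
X = np.column_stack([y[2:-1], y[1:-2], y[:-3], u[3:]])
coef, *_ = np.linalg.lstsq(X, y[3:], rcond=None)
print(coef)                                   # ≈ [0.5, 0.3, 0.1, 0.2]
```

With real haptic data the same regression would be run on measured force and motion signals, and the residual variance would indicate how well the chosen model order captures the guider's behaviour.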
Active Perception Issues
• How are decision making and planning affected by AP? How is computation affected by AP?
• Is AP in the brain reflected by a peculiar kind of "active processing"?
• How are representations affected by AP?
• How is learning affected by AP?
• How can the brain self-organise to support AP?
• How would a dysfunction of AP manifest?
Predictive Coding
Mumford 1992; Rao and Ballard 1999; Friston 2005; Spratling 2008; Hinton 2007; Clark 2013
A Hierarchical Bayesian Predictive (Generative) Model: predictions flow backward and prediction errors forward (fast reaction). The accumulation of sensory evidence reduces the Prediction Error (or Surprisal) and realises both perceptual inference and learning in a unified framework. Attention can be understood as inferring the level of [un]certainty (c.f. the Kalman gain).
Figure from Feldman & Friston 2010
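A minimal sketch of the prediction-error dynamics just described, in the Rao & Ballard spirit: the latent estimate is driven by the backward-propagated error until the input is explained. The sizes, the fixed generative weights and the step size are arbitrary illustrations, and learning of the weights is omitted.

```python
import numpy as np

# One predictive-coding layer: predictions W @ r flow "down", the prediction
# error e flows "up" and drives the latent estimate r.
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 3))        # generative weights (kept fixed here)
r_true = np.array([1.0, -0.5, 0.3])
x = W @ r_true                     # noiseless sensory input

r = np.zeros(3)                    # initial latent estimate
errors = []
for _ in range(800):
    e = x - W @ r                  # prediction error
    r += 0.02 * W.T @ e            # inference step: reduce the error
    errors.append(np.linalg.norm(e))

print(errors[0], errors[-1])       # the error shrinks as inference proceeds
```

The same error signal that drives inference here is what, in the full scheme, also drives learning of W, which is why perceptual inference and learning fall under one framework.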
Active Inference
Friston 2003, 2010; BBS review by Clark 2013
Active Inference is a generalisation of Predictive Coding to Action, completing the Sensorimotor Loop.
Actions reduce prediction error by realising predictions: e.g. a predicted proprioceptive state results in a prediction error which produces a reaction (e.g., reflexes).
Innate priors and interaction with the environment determine behaviour: no need for normative quantities like reward.
Variational Free Energy allows one to consider, in a tractable (approximate) analytical form, predictions and prediction errors under uncertainty.
Action, Perception, Learning and Planning are unified under the same computational principle.
Active Inference and Active Perception

P(\tilde{u} \mid \tilde{o}, \gamma) = \sigma(\gamma \cdot Q(\pi))

Q_\tau(\pi) = \underbrace{E_{Q(o_\tau \mid \pi)}[\ln P(o_\tau \mid m)]}_{\text{Extrinsic value}} + \underbrace{E_{Q(o_\tau \mid \pi)}\big[ D[\, Q(s_\tau \mid o_\tau, \pi) \,\|\, Q(s_\tau \mid \pi) \,] \big]}_{\text{Epistemic value}}

Friston, Rigoli, Ognibene et al. (submitted)
The agent's priors on behaviour π now contain an epistemic/explorative part: the agent will tend to execute actions that reduce its uncertainty about the states of the world (c.f. maximising information gain). The epistemic value corresponds to the Bayesian Surprise. Empirically, people tend to direct their gaze towards salient visual features with high Bayesian surprise (Itti & Baldi 2009).
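The epistemic term above, the expected KL divergence between the posterior and prior over states, can be computed directly in a discrete toy example. The two actions and their observation likelihoods below are invented: one action yields observations that discriminate the hidden state, the other yields uninformative ones.

```python
import numpy as np

def epistemic_value(prior_s, lik):
    """Expected information gain E_Q(o)[ KL(Q(s|o) || Q(s)) ] for one action.
    lik[o, s] = P(o | s) under that action."""
    p_o = lik @ prior_s                   # predictive distribution over o
    value = 0.0
    for o, po in enumerate(p_o):
        post = lik[o] * prior_s / po      # Bayes: Q(s | o)
        value += po * np.sum(post * np.log(post / prior_s))
    return value

prior = np.array([0.5, 0.5])
# Hypothetical saccades: one to an informative location, one whose
# observations say nothing about the hidden state.
look   = np.array([[0.9, 0.1],
                   [0.1, 0.9]])           # rows: o; columns: s
ignore = np.array([[0.5, 0.5],
                   [0.5, 0.5]])
print(epistemic_value(prior, look), epistemic_value(prior, ignore))
```

An agent scoring policies by this quantity prefers the informative saccade, which is the Bayesian-surprise-seeking behaviour described above; the uninformative action has exactly zero epistemic value.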
Minimising Prediction Error in a trivial way may lead an agent to get stuck in non-adaptive states, precluding Explorative Behaviour.
Collaborators
Karl Friston (UCL)
Hector Geffner (UPF)
Thrish Nanayakkara (KCL)
Kris De Meyer (KCL)
Giovanni Pezzulo (CNR)
Giuseppe Giglia (Uni Pa)
Yiannis Demiris (Imperial)
Gianluca Baldassarre (CNR)
Vito Trianni (CNR)
Kyuhwa Lee (EPFL)