social settings
TRANSCRIPT
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 1/43
Paul Fitzpatrick lbr-vision
±± Face to FaceFace to Face ±±
Robot vision in social settings
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 2/43
Paul Fitzpatrick lbr-vision
R obot vision in social settings
Humans (and robots) recover infor mation aboutobjects from the light they reflect
Human head and eye movements give clues toattention and motivation
In a social context, people constantly read eachother¶s actions for these clues
Anthropomorphic robots can partak e in thisimplicit communication, giving smooth andintuitive inter action
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 3/43
Paul Fitzpatrick lbr-vision
Humanoid robotics at MIT
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 4/43
Paul Fitzpatrick lbr-vision
Eye tilt
Left eye panRight eye pan
Camera withwide field of
view
Camera with
narrow field of
view
Kism
et
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 5/43
Paul Fitzpatrick lbr-vision
Kism
et
Built by Cynthia Breazeal to explore
expressive social exchange between humans
and robots
± Facial and vocal expression
± Vision-mediated inter action (collabor ation with
Brian Scassellati, Paul Fitzpatrick )± Auditory-mediated inter action (collabor ation
with Lijin Aryananda)
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 6/43
Paul Fitzpatrick lbr-vision
Vision-m
ediated interaction
Visual attention
± Driven by need for high-resolution view of a particular
object, for example, to find eyes on a f ace± Mar k s the object around which behavior is organized
± Manipulating attention is a powerf ul way to influence
behavior
Pattern of eye/head movement± Gives insight into level of engagement
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 7/43
Paul Fitzpatrick lbr-vision
Expressing visual attention
Attention can be deduced from behavior
Or can be expressed more directly
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 8/43
Paul Fitzpatrick lbr-vision
Building an attention syste
m
Current inputPre-attentive filters
Saliency map Tracker
Eye movementBehavior system
Modulator y influence
Primar y information flow
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 9/43
Paul Fitzpatrick lbr-vision
Skin tone Color Motion Habituation
Weighted
by behavioral
relevance
Pre-attentive filters
Initiating visual attention
Current input Saliency map
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 10/43
Paul Fitzpatrick lbr-vision
Example filter ± skin tone
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 11/43
Paul Fitzpatrick lbr-vision
Image pixel p(r,g,b) is NOT considered sk in tone if:
r < 1.1vg (red component f ails to dominate green sufficiently)
r < 0.9 v b (red component is excessively dominated by blue)
r > 2.0 v max(g,b) (red component completely dominates)
r < 20 (red component too low to give good estimate of r atios)
r > 250 (too satur ated to give good estimate of r atios)
Lots of things that are not sk in pass these tests
But lots and lots of things that are not sk in f ail
Skin tone filter ± details
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 12/43
Paul Fitzpatrick lbr-vision
high sk in gain,
low color saliency gain
Look ing time ±
86% f ace, 14% block
Look ing time ±
28% f ace, 72% block
low sk in gain,
high color saliency gain
Modulating vis
ual attention
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 13/43
Paul Fitzpatrick lbr-vision
Manipulating vis
ual attention
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 14/43
Paul Fitzpatrick lbr-vision
slipped« «recovered
Maintaining visual attention
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 15/43
Paul Fitzpatrick lbr-vision
Persistence of attention
Want attention to be responsive to changing
environment
Want attention to be persistent enough to
per mit coherent behavior
Tr ade-off between persistence and
responsiveness needs to be dynamic
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 16/43
Paul Fitzpatrick lbr-vision
Influences on attention
Current inputPre-attentive filters
Saliency map Tracker
Eye movementBehavior system
Modulator y influence
Primar y information flow
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 17/43
Paul Fitzpatrick lbr-vision
Vergence
angle
Left eye
Right eye
Ballistic saccade
to new target
Smooth pursuit andvergence co-operate
to track object
Eyem
ovem
ent
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 18/43
Paul Fitzpatrick lbr-vision
Eye/neckm
otor control
Neck movements combine :-
± Attention-driven orientation shifts
± Affect-driven postur al shifts
± Fixed action patterns
Eye movements combine :-
± Attention-driven orientation shifts
± Turn-tak ing cues
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 19/43
Paul Fitzpatrick lbr-vision
Social a
mplification of
m
otor acts Active vision involves choosing a robot¶s pose to
f acilitate visual perception.
Focus has been on immediate physicalconsequences of pose.
For anthropomorphic head, active vision str ategiescan be ³read´ by a human, assigned an intent
whichma
y then be com
pleted beyond the robot¶simmediate physical capabilities.
Robot¶s pose has communicative value, to whichhuman responds.
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 20/43
Comfortable
interaction distance
Too close ±
withdrawal
response
Too far ±
calling
behavior
Person draws
closer
Person
backs off
Comfortable interaction
speed
Too fast ± irritation
responseToo fast,
Too close ±
threat response
Beyond
sensor
range
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 21/43
Paul Fitzpatrick lbr-vision
Video: Withdrawal response
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 22/43
Paul Fitzpatrick lbr-vision
Eye tilt
Left eye panRight eye pan
Camera withwide field of
view
Camera with
narrow field of
view
Kism
et¶s cam
eras
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 23/43
Paul Fitzpatrick lbr-vision
Simplest camera configuration
Single camer a± Multiple camer a systems require caref ul calibr ation for cross-
camer a correspondence
Wide field of view± Don¶t k now where to look beforehand
Moving infrequently relative to the r ate of visualprocessing
± Ego-motion complicates visual processing
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 24/43
Skin
Detector
Color
Detector
Motion
Detector
Face
Detector
Wide
Frame Grabber
MotionControl
Daemon
Right
Frame Grabber
Left
Frame Grabber
Right Foveal
Camera
Wide
CameraLeft Foveal
Camera
Eye-Neck
Motors
W W W W
Attention
Wide
Tracker
Foveal
Disparity
Smooth Pursuit
& Vergence
w/ neck comp.
VOR
Saccade
w/ neck
comp.
Fixed Action
Pattern
Affective
Postural Shiftsw/ gaze comp.
Arbitor
Eye-Head-Neck Control
DisparityBallistic movement Locus of attentionBehaviors
Motivations
5, 5, 5. ..
U, d U d U2
s s s
U, d U d U2
p p p
U,
d U d U
2
f f f U,
d U d U
2
v v v
Wide
Camera 2
Tracked
target
Salient target
Eye
finder
Left
Frame Grabber
Distance
to target
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 25/43
Paul Fitzpatrick lbr-vision
Missing components
High acuity vision ± for example, to find
eyes within a f ace
± Need camer as that sample a narrow field of
view at high resolution
Binocular view, for stereoscopic vision
± Need paired camer as± May need wide or narrow fields of view,
depending on application
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 26/43
Paul Fitzpatrick lbr-vision
Missing: high acuity vision
Typical visual task s require both high acuity and a
wide field of view
High acuity is needed for recognition task s and for controlling precise visually guided motor
movements
A wide field of view is needed for search task s,
for tr ack ing multiple objects, compensating for
involuntary ego-motion, etc.
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 27/43
Paul Fitzpatrick lbr-vision
Biological solution
A common tr ade-off found in biological systems
is to sample part of the visual field at a high
enough resolution to support the first set of task s,and to sample the rest of the field at an adequate
level to support the second set.
This is seen in animals with foveate vision, such
as humans, where the density of photoreceptors ishighest at the center and f alls off dr amatically
towards the periphery.
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 28/43
Paul Fitzpatrick lbr-vision
Simulated example
Compare size of eyes and ears intr ansfor med image ± eyes are closer tocenter, and so are better represented
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 29/43
Foveated vision
(From C. Graham, ³Vision and Visual Perception´)
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 30/43
Paul Fitzpatrick lbr-vision
Mechanical approximations
Imaging surf ace with varying sensor density
(Sandini et al)
Distorting lens projecting onto conventionalimaging surf ace (K uniyoshi et al)
Multi-camer a arr angements (Scassellati et al)
Cam
er as with
zoom
control directly tr ade-off acuity with field of view (but can¶t have both)
Or do something completely different!
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 31/43
Paul Fitzpatrick lbr-vision
Multi-camera arrangement
Wide view camer a gives context
used to select region at which to
point narrow view camer a
If target is close and moving,
must respond quick ly and accur ately,
or won¶t gain any infor mation at all
W ide field of view,
low acuity
Narrow field of view,
high acuity
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 32/43
Paul Fitzpatrick lbr-vision
Mixing fields of view
Small distance between camer as with wide and narrow
field of views, simplifying mapping between the two
Centr al location of wide camer a allows head to be oriented
accur ately independently of the distance to an object
Allows coarse open-loop control of eye direction from
wide camer a ± improves gaze stability
Eye tilt
Left eye panRight eye pan
Fixed with respect
to head
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 33/43
Paul Fitzpatrick lbr-vision
Wide
View
camera
Narrow
view
camera
Objectof interest
Field of view
Rotate
camera
New field
of view
Tip-toeing around 3D
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 34/43
Paul Fitzpatrick lbr-vision
Using 3D
Eye tilt
Left eye panRight eye pan
Fixed with respect
to head
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 35/43
Skin
Detector
Color
Detector
Motion
Detector
Face
Detector
Wide
Frame Grabber
MotionControl
Daemon
Right
Frame Grabber
Left
Frame Grabber
Right Foveal
Camera
Wide
CameraLeft Foveal
Camera
Eye-Neck
Motors
W W W W
Attention
Wide
Tracker
Foveal
Disparity
Smooth Pursuit
& Vergence
w/ neck comp.
VOR
Saccade
w/ neck
comp.
Fixed Action
Pattern
Affective
Postural Shiftsw/ gaze comp.
Arbitor
Eye-Head-Neck Control
DisparityBallistic movement Locus of attentionBehaviors
Motivations
5, 5, 5. ..
U, d U d U2
s s s
U, d U d U2
p p p
U, d U d U2
f f f U, d U d U
2
v v v
Wide
Camera 2
Tracked
target
Salient target
Eye
finder
Left
Frame Grabber
Distance
to target
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 36/43
Paul Fitzpatrick lbr-vision
Kismet¶s little secret
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 37/43
NTspeech synthesis
affect recognition
LinuxSpeech
recognition
Face
Control
Emotion
Percept
& Motor
Drives &
Behavior
L
Tracker
Attent.system
Dist.
totarget
Motionfilter
Eyefinder
Motor ctrl
audiospeechcomms
Skinfilter
Color filter
QNX
CORBA
sockets,
CORBA
CORBA
dual-port
RAM
Cameras
Eye, neck, jaw motorsEar, eyebrow, eyelid,
lip motors
Microphone
Speakers
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 38/43
Paul Fitzpatrick lbr-vision
R obots looking at humans
Responsiveness to the human f ace is vital
for a robot to partak e in natur al social
exchange
Need to locate and tr ack f acial features, and
recover their semantic content
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 39/43
Paul Fitzpatrick lbr-vision
Hair Forehead
Eye/brow
Cheeks/nose
Mouth/jaw
TOP
BOTTOM
LEFT
Hair
Skin Eye Bridge
Hair
SkinEye RIGHT
Modeling the face
Match oriented regions on f ace againstvertical model to isolate eye/browregion
Match eye/brow region againsthorizontal model to find eyes, bridge
Each model scans one spatialdimension, so can for mulate as HMM,allowing f ast optimization of match
V ertical face model
Horizontal
eye/brow model
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 40/43
Paul Fitzpatrick lbr-vision
R obots looking at robots
It is usef ul to link the robot¶s representation of its own f ace
with that of humans.
Bonu
s: Allows robot-robot inter action vi
ahuma
n protocol.
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 41/43
Paul Fitzpatrick lbr-vision
Conclusion
Vision community wor k ing on improving
machine perception of human
But equally important to consider human
perception of machine
Robot¶s point of view must be clear to
human, so they can communicateeffectively ± and quick ly!
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 42/43
Paul Fitzpatrick lbr-vision
Video: Turn taking
8/7/2019 Social Settings
http://slidepdf.com/reader/full/social-settings 43/43
Paul Fitzpatrick lbr-vision
Video: Affective intent