Laboratory for Perceptual Robotics – Department of Computer Science
Hierarchical Mechanisms for Robot Programming
Shiraj Sen, Stephen Hart, Rod Grupen
Laboratory for Perceptual Robotics, University of Massachusetts Amherst
May 30, 2008
NEMS '08

Outline
Hierarchical mechanisms for robot programming
• Representation
  • Action: potential functions, value functions
  • State representation
• Programming
  • User defined
  • Reinforcement learning: intrinsic, extrinsic

Hierarchical Actions
[Figure: three feedback loops, each drawn with Σ, G, and H blocks. Labels: force and velocity references; feedback signals; ϕ, potential fields; Φ, value functions (greedy traversal avoids local minima); programs; closed-loop primitive actions.]

Primitive Action Programming Interface
Sensory Error
• Visual (u_ref)
• Tactile (f_ref)
• Configuration variables (θ_ref)
• Operational space (x_ref)
Potential Functions (ϕ)
• Spring potential fields (ϕ_h)
• Collision-free motion fields (ϕ_c)
• Kinematic conditioning fields (ϕ_cond)
Motor Variables
• Subsets of: configuration variables; operational space variables
Primitive actions: each action a is assembled from a sensory error, a potential function, and a set of motor variables.
Nullspace projection: combines primitive actions a_1 and a_2.
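As an illustration of nullspace projection, the following minimal Python sketch composes two joint-velocity commands so that the subordinate command cannot disturb the objective of the superior one. It assumes the superior objective is one-dimensional, so its Jacobian is a single row vector, and it assumes a_1 takes priority over a_2. The function names, the priority order, and the example numbers are hypothetical and are not taken from the slides.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def project_into_nullspace(jacobian_row, command):
    """Remove the part of `command` that would change the superior objective.

    Computes (I - J^T J / (J J^T)) * command for a single-row Jacobian J.
    """
    norm_sq = dot(jacobian_row, jacobian_row)
    if norm_sq == 0.0:
        # A zero Jacobian constrains nothing, so the command passes unchanged.
        return list(command)
    scale = dot(jacobian_row, command) / norm_sq
    return [c - scale * j for c, j in zip(command, jacobian_row)]


def compose(superior_cmd, superior_jacobian_row, subordinate_cmd):
    """Add the subordinate command after filtering it through the nullspace.

    The priority order (a_1 superior, a_2 subordinate) is an assumption.
    """
    filtered = project_into_nullspace(superior_jacobian_row, subordinate_cmd)
    return [a + b for a, b in zip(superior_cmd, filtered)]


# Hypothetical 3-joint example: the superior objective depends only on joint 0.
j1 = [1.0, 0.0, 0.0]
a1 = [0.5, 0.0, 0.0]
a2 = [0.2, -0.1, 0.3]
combined = compose(a1, j1, a2)
```

In this example the subordinate command keeps its motion on joints 1 and 2, while its contribution along joint 0 is removed, so the superior objective's rate of change is the same with or without a_2.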

State Representation
Discrete abstraction of action dynamics: 4-level logic in the control predicate p_i
• X: unknown
• -: no reference
• 0: descending gradient
• 1: convergence
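The four-level predicate can be sketched as a small classification function. This is a hedged illustration: the slides name the four levels but not how they are computed, so the inputs (`potential_rate`, `has_reference`, `state_known`) and the threshold `epsilon` are assumptions introduced here.

```python
UNKNOWN = "X"
NO_REFERENCE = "-"
DESCENDING = "0"
CONVERGED = "1"


def control_predicate(potential_rate, has_reference=True, state_known=True,
                      epsilon=1e-3):
    """Map a controller's dynamics onto the four-level predicate.

    potential_rate: time derivative of the controller's potential (assumed input).
    epsilon: assumed convergence threshold on |potential_rate|.
    """
    if not state_known:
        return UNKNOWN
    if not has_reference:
        return NO_REFERENCE
    if abs(potential_rate) > epsilon:
        # The potential is still changing: the controller is descending its gradient.
        return DESCENDING
    return CONVERGED
```

For example, `control_predicate(-0.5)` yields "0" while the controller is still descending, and `control_predicate(0.0)` yields "1" once the potential has stopped changing.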

Hierarchical Programming
A program is defined as an MDP over a vector of controller predicates: S = (p_1, …, p_N)
Absorbing states in the value function capture “convergence” of programs.
Learn value functions using reinforcement learning.
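A minimal sketch of this formulation, assuming tabular Q-learning as the reinforcement learning method (the slides say only "reinforcement learning"). The state space is every vector of N predicate values, and an absorbing state contributes no future value. The learning rate, discount factor, example transition, and choice of absorbing state are all hypothetical.

```python
from itertools import product

PREDICATE_VALUES = ("X", "-", "0", "1")


def state_space(num_controllers):
    """Enumerate every predicate vector S = (p_1, ..., p_N)."""
    return list(product(PREDICATE_VALUES, repeat=num_controllers))


def q_update(q, state, action, reward, next_state, actions,
             absorbing=frozenset(), alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning backup and return the new estimate."""
    if next_state in absorbing:
        # Absorbing states end the program, so no future value is backed up.
        future = 0.0
    else:
        future = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * future - old)
    return q[(state, action)]


states = state_space(2)  # 4**2 = 16 predicate vectors for two controllers
q_table = {}
new_value = q_update(q_table, ("X", "X"), "a_track", 1.0, ("X", "1"),
                     actions=("a_saccade", "a_track"),
                     absorbing={("X", "1")})
```

With an empty table, a reward of 1.0, and an absorbing successor, the single backup moves the estimate from 0 to alpha times the reward, which is 0.1 here.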

Intrinsic Reward
Goal: build deep control knowledge
Reward controllable interaction with the world
• controllers with direct feedback from the external world
Catalog: Stack, Insert, Grasp, Touch, Track
[Figure: a convergence event between two predicate columns (X, -, 1, 0).]
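One way to read the convergence event as a reward signal is sketched below. It assumes the reward is issued when a controller's predicate changes to "1" and that only controllers with feedback from the external world qualify. The reward magnitude of 1.0 and the `externally_referenced` flag are hypothetical.

```python
def intrinsic_reward(previous, current, externally_referenced):
    """Return 1.0 for a convergence event on an externally referenced controller.

    previous, current: predicate values from {"X", "-", "0", "1"}.
    externally_referenced: True if the controller's feedback comes from the
    external world (assumed flag).
    """
    converged_now = previous != "1" and current == "1"
    return 1.0 if (externally_referenced and converged_now) else 0.0
```

Under these assumptions, a transition from "0" to "1" on a visual tracking controller is rewarded, while remaining at "1" or converging on a purely internal reference is not.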

Experimental Demonstration
Motor units
• Two 7-DOF Barrett WAMs
• Two 4-DOF Barrett Hands
• 2-DOF pan/tilt stereo head
Sensory feedback
• Visual: hue, saturation, intensity, texture
• Tactile: 6-axis fingertip F/T sensors
• Proprioceptive
[Figure: Dexter]

STAGE 1: SaccadeTrack - 25 Learning Episodes
State: S_st = (p_saccade, p_track)
[Figure: state-transition diagram over the predicate states, with actions a_saccade and a_track; the rewarding action is marked. Labels shown: Track-saturation.]

STAGE 2: ReachGrab - 25 Learning Episodes
State: S_rg = (p_st, p_reach, p_grab)
[Figure: diagram with the rewarding action marked. Labels shown: Touch, Track-saturation.]

STAGE 3: VisualInspect - 25 Learning Episodes
State: S_vi = (p_rg, p_cond, p_track(blue))
[Figure: diagram with the rewarding action marked. Labels shown: Touch, Track-saturation, Track-blue.]

STAGE 4: Grasp – User Defined Reward
State: S_grasp = (p_rg, p_moment, p_force)
[Figure: state-transition diagram over the predicate states, with actions ReachGrab, a_moment, and a_force; the rewarding action is marked. Labels shown: Touch, Track-saturation, Grasp, Track-blue.]

STAGE 5: PickAndPlace – User Defined Reward
State: S_pnp = (p_g, p_transport, p_moment)
[Figure: state-transition diagram over the predicate states, with actions Grasp, a_transport, and a_moment; the rewarding action is marked.]

Conclusions
Mechanisms for creating hierarchical programs
• recursive formulation of potential functions and value functions
• control-theoretic representation for action, state, and intrinsic reward
Experimental demonstration of programming manipulation skills using staged learning episodes.
Intrinsic reward pushes out new behavior and models the affordances of objects.

Thank You