Local Optimal Feedback Control: Principles and Applications
Djordje Mitrovic
SSRM Guest Lecture, 22/02/2010


Outline
• Why optimal motor control?
• Little recap on optimal control
• Approximative optimal control
• Iterative Linear Quadratic Gaussian (ILQG)
  – Theory and examples
• If time: adaptive optimal control (ILQG-LD)
• Demo: ILQG on a real system
Examples of optimality principles in motor control
[Figures: industrial robot and target-reaching examples]
Pick the solution that is optimal/minimal w.r.t. some cost function:
  o Minimum jerk (Flash & Hogan, 1985)
  o Minimum torque change (Uno et al., 1989)
  o Minimum endpoint variance (Harris & Wolpert, 1998)
  o Minimum energy (Todorov, 2002, …)
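As a concrete instance of such a principle, the minimum-jerk reach of Flash & Hogan has a closed-form solution; a minimal sketch (1-D reach, illustrative values):

```python
def min_jerk(x0, xf, T, t):
    """Minimum-jerk position at time t for a point-to-point reach from x0
    to xf in T seconds (closed-form solution of Flash & Hogan, 1985)."""
    s = t / T  # normalised time in [0, 1]
    return x0 + (xf - x0) * (10 * s**3 - 15 * s**4 + 6 * s**5)

# The trajectory starts at x0, passes the midpoint at t = T/2, ends at xf.
print(min_jerk(0.0, 1.0, 1.0, 0.0))  # 0.0
print(min_jerk(0.0, 1.0, 1.0, 0.5))  # 0.5
print(min_jerk(0.0, 1.0, 1.0, 1.0))  # 1.0
```

The polynomial is the unique trajectory minimising the integrated squared jerk under position/velocity/acceleration boundary conditions.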
$\mathbf{x} = [\mathbf{q}; \dot{\mathbf{q}}]$
Mini-Recap: Optimality Principles (2/3)
Example: cost function for “energy-optimal” arm reaching:
$v^{\pi}(\mathbf{x}, t) = E\left[\, h(\mathbf{x}(T)) + \int_{t}^{T} l\big(\tau, \mathbf{x}(\tau), \boldsymbol{\pi}(\tau, \mathbf{x}(\tau))\big)\, d\tau \right]$
Question: Can you think of examples for F, g and G in robot/human systems?
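On a discretised, noise-free trajectory the expected cost reduces to a sum of running costs plus a final cost; a minimal sketch (the function names and the "energy-optimal" choices below are illustrative):

```python
def total_cost(xs, us, dt, running_cost, final_cost):
    """Discretised cost v = h(x_T) + sum of l(t, x, u) * dt
    evaluated on a single (noise-free) trajectory."""
    J = sum(running_cost(t * dt, x, u)
            for t, (x, u) in enumerate(zip(xs, us))) * dt
    return J + final_cost(xs[-1])

# "Energy-optimal" reaching: l = |u|^2, h = squared distance to the target.
xs = [0.0, 0.5, 1.0]          # states (k+1 entries)
us = [1.0, 1.0]               # controls (k entries)
J = total_cost(xs, us, dt=0.1,
               running_cost=lambda t, x, u: u**2,
               final_cost=lambda x: (x - 1.0)**2)
print(J)  # 0.2: two steps of u^2 * dt, zero terminal error
```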
Mini-Recap: Optimality Principles (3/3)
OFC problem: find the control law that minimises the expected cost v.
Linear dynamics & Quadratic cost functions:
Nonlinear dynamics and non-quadratic cost function
$\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}$, $\mathbf{u} = -L\mathbf{x}$: a global optimal feedback control law exists.
Apply approximative methods:
  o ILQG (Li & Todorov, 2004)
  o ILQR
  o DDP (Jacobson & Mayne, 1970)
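For the linear-quadratic case these methods build on, the global law can be computed exactly by a backward Riccati recursion; a minimal sketch (the double-integrator example and weights are illustrative):

```python
import numpy as np

def lqr_backward(A, B, Q, R, Qf, N):
    """Finite-horizon discrete LQR: backward Riccati recursion yielding
    time-varying gains L_i such that u_i = -L_i x_i is globally optimal
    for linear dynamics x' = A x + B u and quadratic costs."""
    S = Qf
    gains = []
    for _ in range(N):
        L = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)
        S = Q + A.T @ S @ (A - B @ L)
        gains.append(L)
    return gains[::-1]  # reorder so gains[i] is the gain for step i

# Double integrator (position/velocity), drive the state to the origin.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q, R, Qf = np.eye(2) * 0.1, np.eye(1), np.eye(2) * 100.0
gains = lqr_backward(A, B, Q, R, Qf, N=100)
x = np.array([[1.0], [0.0]])
for L in gains:
    x = A @ x - B @ (L @ x)
print(float(np.linalg.norm(x)) < 0.1)  # True: state driven near the origin
```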
Approximative (local) Optimal Feedback Control methods
Basic idea: if the dynamics are nonlinear and the costs non-quadratic, one can still apply the LQ solution approximately around a nominal trajectory and use the local solutions to iteratively improve the nominal one.
[Figure: nominal trajectory refined over iterations 1, 2, 3, …, N]
1. Create initial control sequence, apply it to the (nonlinear) dynamics, then obtain a corresponding state sequence.
2. Construct a linear approximation to the dynamics and a quadratic approximation to the cost; we get an LQG optimal control problem with respect to the state and control deviations.
3. Solve the LQG problem, obtain optimal control deviation sequence, and add it to the given control sequence. Go to step 1, or exit if converged.
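The three steps can be sketched as a minimal iLQR-style loop on a scalar nonlinear system (noise-free, so the LQG step reduces to an LQR backward pass; the toy dynamics, costs and absence of line search/regularisation are simplifying assumptions):

```python
import numpy as np

dt, N, c, w, target = 0.05, 40, 0.01, 10.0, 1.0
f = lambda x, u: x + dt * (np.sin(x) + u)          # nonlinear dynamics

def rollout(us, x0=0.0):
    xs = [x0]
    for u in us:
        xs.append(f(xs[-1], u))
    return np.array(xs)

def traj_cost(xs, us):
    return c * np.sum(us ** 2) + w * (xs[-1] - target) ** 2

us = np.zeros(N)                                    # step 1: initial controls
xs = rollout(us)
for it in range(15):
    # step 2: linearise the dynamics around the nominal (finite differences)
    eps = 1e-6
    A = np.array([(f(x + eps, u) - f(x, u)) / eps for x, u in zip(xs[:-1], us)])
    B = np.array([(f(x, u + eps) - f(x, u)) / eps for x, u in zip(xs[:-1], us)])
    # step 3: backward pass of the local LQ problem -> affine law (k, K)
    Vx, Vxx = 2 * w * (xs[-1] - target), 2 * w
    k, K = np.zeros(N), np.zeros(N)
    for i in reversed(range(N)):
        Qx, Qu = A[i] * Vx, 2 * c * us[i] + B[i] * Vx
        Qxx, Quu, Qux = A[i]**2 * Vxx, 2 * c + B[i]**2 * Vxx, B[i] * Vxx * A[i]
        k[i], K[i] = -Qu / Quu, -Qux / Quu
        Vx = Qx - Qux * Qu / Quu
        Vxx = Qxx - Qux**2 / Quu
    # forward pass: apply the deviations, keep the improved trajectory
    xn, un = [xs[0]], np.empty(N)
    for i in range(N):
        un[i] = us[i] + k[i] + K[i] * (xn[i] - xs[i])
        xn.append(f(xn[i], un[i]))
    xn = np.array(xn)
    if traj_cost(xn, un) < traj_cost(xs, us):
        xs, us = xn, un

print(abs(xs[-1] - target) < 0.05)  # True: the reach converges to the target
```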
ILQG detailed 1/6
This is an LQG problem
Question: What would be good initial trajectories for a robot arm and why?
State & control deviations: $\delta\mathbf{x}_i = \mathbf{x}_i - \bar{\mathbf{x}}_i$, $\delta\mathbf{u}_i = \mathbf{u}_i - \bar{\mathbf{u}}_i$
ILQG detailed 2/6
2. …
Question: What are the potential “dangers” with these partial derivative terms of cost, dynamics and noise?
Linearisation terms
The cost-to-go is in quadratic form.
ILQG detailed 4/6
3. … compute an affine control law of the form $\delta\mathbf{u}_i = \mathbf{l}_i + L_i\,\delta\mathbf{x}_i$:
  l: open-loop component
  L: feedback control gain
This is the local optimal control law.
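Executing the affine law combines the nominal control, the open-loop correction and feedback on the deviation from the nominal state; a minimal sketch (the numbers are illustrative):

```python
def ilqg_policy(i, x, x_bar, u_bar, l, L):
    """Affine ILQG control law u_i = u_bar_i + l_i + L_i (x_i - x_bar_i)
    around the nominal trajectory (x_bar, u_bar)."""
    return u_bar[i] + l[i] + L[i] * (x - x_bar[i])

# On the nominal trajectory the deviation is zero, so only the open-loop
# component l acts; off the nominal, the gain L feeds back the deviation.
print(ilqg_policy(0, 1.0, [1.0], [0.5], [0.1], [-2.0]))  # 0.6
print(ilqg_policy(0, 1.2, [1.0], [0.5], [0.1], [-2.0]))  # 0.2
```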
Forward pass: for i = 1 to n-1
Backward pass: for i = n-1 to 1
1. Create initial control sequence; apply it to the (nonlinear) dynamics & obtain a corresponding state sequence & cost.
2. Construct linear approximation to the dynamics and quadratic approximation to the cost;
3. Setup local LQG optimal control problem w.r.t. the state and control deviations. Find control law l and L.
4. Obtain the optimal control deviation sequence, and add it to the initial control sequence.
5. Cost converged? If yes, stop; otherwise go to step 1.
What we get out of ILQG:
  – Optimal control sequence: $\bar{\mathbf{u}}_{1..k-1}$
  – Corresponding optimal state sequence: $\bar{\mathbf{x}}_{1..k}$
  – Locally optimal feedback control law: $L_{1..k-1}$
ILQG: Application example 1
• Planar human arm model with 2 joints and 6 muscles.
redundancy in the dynamics!
• Application on Barrett WAM with 4 kinematic degrees of freedom.
redundancy in the kinematics!
Open loop optimal control: Min. Jerk Optimal Feedback Control: ILQG
[Videos 1 & 2: open-loop minimum jerk vs. ILQG feedback control]
• Feedback gains follow the so-called minimum intervention principle (Todorov, 2004): "The system only corrects an error if it is beneficial to the task at hand."
$\bar{\mathbf{u}}_{1..799}$, $\bar{\mathbf{x}}_{1..800}$, $L_{1..799}$
Notation:
Standard equation of motion for a robot arm: $\boldsymbol{\tau} = M(\mathbf{q})\ddot{\mathbf{q}} + C(\mathbf{q}, \dot{\mathbf{q}}) + g(\mathbf{q})$.
Solving for $\ddot{\mathbf{q}}$ gives the forward dynamics function f(x, u).
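Rearranged into state-space form, the equation of motion yields $\dot{\mathbf{x}} = f(\mathbf{x}, \mathbf{u})$; a minimal sketch for a single pendulum link rather than the slides' two-joint arm (masses, lengths and damping are illustrative):

```python
import numpy as np

def f(x, u, m=1.0, l=0.5, b=0.1, g=9.81):
    """Forward dynamics of a single pendulum link: solve the equation of
    motion M qdd + C + G = tau for the acceleration, with x = [q, qdot]."""
    q, qd = x
    M = m * l ** 2                                   # inertia
    qdd = (u - b * qd - m * g * l * np.sin(q)) / M   # solve for acceleration
    return np.array([qd, qdd])

# Hanging at rest with zero torque: no motion.
print(f(np.array([0.0, 0.0]), 0.0))  # [0. 0.]
```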
• Smoothing the discontinuities
• Real-time computation is difficult to achieve due to the linearisation steps within ILQG.
• Potential solutions:
  – Choose larger simulation time steps dt
  – Sub-sampling approach
  – Try to compute analytic derivatives
Question: What is the problem with this approach?
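To see why the linearisation stage is the bottleneck, consider its finite-difference form: it needs dim(x) + dim(u) extra dynamics evaluations at every time step of every iteration. A minimal sketch (the test system is illustrative):

```python
import numpy as np

def linearise(f, x, u, eps=1e-6):
    """Finite-difference Jacobians A = df/dx, B = df/du at (x, u).
    Costs len(x) + len(u) extra dynamics evaluations per time step,
    which is what makes the linearisation stage of ILQG expensive."""
    fx = f(x, u)
    A = np.column_stack([(f(x + eps * e, u) - fx) / eps for e in np.eye(len(x))])
    B = np.column_stack([(f(x, u + eps * e) - fx) / eps for e in np.eye(len(u))])
    return A, B

# Linear test system: recover A and B up to finite-difference error.
A_true = np.array([[0.0, 1.0], [-2.0, -0.5]])
B_true = np.array([[0.0], [1.0]])
A, B = linearise(lambda x, u: A_true @ x + B_true @ u,
                 np.array([0.3, -0.1]), np.array([0.2]))
print(np.allclose(A, A_true, atol=1e-4), np.allclose(B, B_true, atol=1e-4))
# True True
```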
Adaptive Optimal Control
• What if the dynamics change over time, for example due to an added tool?
We can use machine learning techniques to learn the dynamics online directly from sensorimotor feedback of the plant.
Dynamics Learning with LWPR
$[\mathbf{q}, \dot{\mathbf{q}}, \mathbf{u}]$
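LWPR learns the mapping from $[\mathbf{q}, \dot{\mathbf{q}}, \mathbf{u}]$ to acceleration online. As a stand-in for LWPR, the idea can be sketched with a linear model updated by stochastic gradient descent on the prediction error (the plant coefficients and learning rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(4)                                           # model weights
true_dyn = lambda q, qd, u: 2.0 * u - 0.5 * qd - 3.0 * q  # "unknown" plant

for step in range(5000):
    q, qd, u = rng.uniform(-1, 1, size=3)  # visited state/control sample
    phi = np.array([q, qd, u, 1.0])        # features [q, qd, u, bias]
    acc = true_dyn(q, qd, u)               # observed acceleration (feedback)
    w += 0.05 * (acc - w @ phi) * phi      # online update from the error

print(np.round(w[:3], 2))  # close to the true coefficients [-3, -0.5, 2]
```

Unlike this global linear stand-in, LWPR composes many local linear models, so it can track nonlinear dynamics and adapt when the plant changes (e.g. an added tool).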
Velocity-dependent Force Field
Cost Function:
Because it works with adaptive dynamics, the approach can predict the "ideal observer" adaptation behaviour under complex force fields.
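A velocity-dependent (curl) force field of the kind used in such adaptation studies applies a force perpendicular to the hand velocity; a minimal sketch (the matrix entries are illustrative, not the slides' values):

```python
import numpy as np

# Curl field: the perturbing force is perpendicular to the hand velocity,
# so it does no work along the motion but bends the trajectory sideways.
B = np.array([[0.0, 13.0], [-13.0, 0.0]])   # N*s/m, illustrative magnitude

def field_force(hand_vel):
    return B @ hand_vel

v = np.array([0.3, 0.0])                     # hand moving in +x
F = field_force(v)
print(F, float(F @ v))  # force along -y, zero power along the motion: 0.0
```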
Demo: ILQG on antagonistic arm
• Build system
• Identify dynamics
• Compute ILQG solution for a task
• Transfer solution to control the real robot
Demo: Define Dynamics
2. Calibrate position sensor.
3. Calibrate acceleration sensor.
4. Collect training data and fit parameters.
Demo: Compute ILQG solution for a task
• Dynamics (extended state)
• Setup general cost function
Energy (Spring-level)
Energy (Motor-level)
Demo: Run control law on Robot
• Here we run ILQG in receding-horizon mode, a.k.a. model predictive control:
  1. Compute ILQG for a short time horizon.
  2. Apply the control law for one time step on the real robot.
  3. Read out the resulting state of the arm; go to step 1.
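The receding-horizon loop above can be sketched as follows; `plan` stands in for an ILQG solve and `plant_step` for the real robot interface (both hypothetical here, with a toy scalar plant):

```python
def run_mpc(x0, n_steps, horizon, plan, plant_step):
    """Receding-horizon control: re-plan over a short horizon, apply only
    the first control, read the new state, repeat."""
    x, applied = x0, []
    for _ in range(n_steps):
        us = plan(x, horizon)      # 1. (ILQG) solution for a short horizon
        x = plant_step(x, us[0])   # 2. apply one step on the plant
        applied.append(us[0])      # 3. the loop re-plans from the new state
    return x, applied

# Toy check with a scalar plant x' = x + 0.1 u and a greedy "planner".
plant = lambda x, u: x + 0.1 * u
planner = lambda x, h: [-x] * h    # hypothetical stand-in for ILQG
x, us = run_mpc(5.0, 30, horizon=10, plan=planner, plant_step=plant)
print(abs(x) < 1.0)  # True: the receding-horizon loop drives x toward 0
```

Re-planning from the measured state is what lets the controller compensate for model error and disturbances on the real robot.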
Energy (Spring-level)
Energy (Motor-level)
Summary
• Optimal motor control
• Local optimal feedback control for nonlinear and high-dimensional (redundant) systems
• ILQG as one specific method
• Adaptive optimal control – ILQG-LD
• Real-world application – be aware of the limitations for real-world implementations
Reading & further information
• Detailed notes on the course homepages
• ILQG code online
• Emanuel Todorov's homepage is a good resource for optimal control (see his book chapter)
• Drop me an email for questions or remarks