online optimal impedance planning for legged...

Online Optimal Impedance Planning for Legged Robots

Franco Angelini1,2,3, Guiyang Xin4, Wouter J. Wolfslag4, Carlo Tiseo4,Michael Mistry4, Manolo Garabini1,3, Antonio Bicchi1,2,3 and Sethu Vijayakumar4

Abstract— Real world applications require robots to operatein unstructured environments. This kind of scenarios may leadto unexpected environmental contacts or undesired interactions,which may harm people or impair the robot. Adjusting thebehavior of the system through impedance control techniquesis an effective solution to these problems. However, selectingan adequate impedance is not a straightforward process.Normally, robot users manually tune the controller gains withtrial and error methods. This approach is generally slowand requires practice. Moreover, complex tasks may requiredifferent impedance during different phases of the task. Thispaper introduces an optimization algorithm for online planningof the Cartesian robot impedance to adapt to changes inthe task, robot configuration, expected disturbances, externalenvironment and desired performance, without employing anydirect force measurements. We provide an analytical solutionleveraging the mass-spring-damper behavior that is conferredto the robot body by the Cartesian impedance controller. Stabil-ity during gains variation is also guaranteed. The effectivenessof the method is experimentally validated on the quadrupedalrobot ANYmal. The variable impedance helps the robot totackle challenging scenarios like walking on rough terrain andcolliding with an obstacle.

I. INTRODUCTION

The development of more flexible and adaptable con-trol architectures and technologies has initiated a shift ofparadigm in robotics. This new trend aspires to move robotsfrom traditional industrial cages to unstructured environ-ments. For example, there has been an increasing interest inthe deployment of robots in hazardous environments, suchas mines, disaster area, nuclear sites, space and offshoreplatforms. Despite the great improvements in controllingrobots in challenging dynamic condition, they are still notsufficiently reliable and adaptable to be employed in suchlocations. Especially, the intrinsic unpredictability of suchenvironments makes the exteroceptive information not reli-able. Thus, moving around in such conditions may severelyaffect the robot: there could be unexpected interactions ordisturbances, which may destabilize or damage the robot. An

This work was funded by the European Commission Horizon 2020 WorkProgramme: THING ICT-2017-1 780883 and MEMMO ID: 780684, as wellas EPSRCs RAI Hubs for Extreme and Challenging Environments: ORCAEPR026173/1 and NCNR EPR02572X/1.

1Centro di Ricerca “Enrico Piaggio”, Universita di Pisa, Largo LucioLazzarino 1, 56126 Pisa, Italy

2Soft Robotics for Human Cooperation and Rehabilitation, FondazioneIstituto Italiano di Tecnologia, via Morego, 30, 16163 Genova, Italy

3Dipartimento di Ingegneria dell’Informazione, Universita di Pisa, LargoLucio Lazzarino 1, 56126 Pisa, Italy

4Institute for Perception, Action, and Behaviour, School of Informatics,The University of Edinburgh, Informatics Forum, 10 Crichton Street,Edinburgh, EH8 9AB, United [email protected]

Fig. 1. Quadrupedal robot ANYmal walking on rough terrain withonline impedance planning.

adaptive behavior may help to mitigate or even avoid thesedangerous scenarios. Impedance control [1] tackles theseissues by directly setting the robot stiffness and damping.In other words, this technique allows to specify the forceproduced in response to a motion imposed by the environ-ment, leading to the desired robust and adaptive behavior.

Applying impedance control to legged robots is a commonpractice in literature [2], [3], [4], [5], [6]. However, thereis currently no established basis upon which to choose theimpedance gains for a specific system or scenario. Tuningthese parameters is generally tedious and time- consuming.An experienced robot programmer may be able to find somefine-tuned fixed gains that achieve the desired behavior, butdifferent tasks and different robot configurations may requiredifferent parameters. Thus, it is of paramount importance toreplace the typical trial and error impedance modulation withan automatic one.

Literature proposes several approaches to solve this prob-lem. Learning techniques [7], [8], [9], [10], [11] are an effec-tive method to achieve this goal. The price of these strategiesis that they require demonstrations from humans, rich datasets, or, generally, a learning process that may require severaliterations to converge. Thus, despite their efficacy, thesemethods can not be employed in every scenario.

In [12] the authors formulate a robust optimization prob-lem to compute the optimal control action and compliancevalue for a robot-environment interaction scenario underuncertainties. This techinique is validated on a handover task,however, it is an offline approach.

(a) Step force disturbance in the low impedance case. (b) Wall interaction in the low impedance case.

(c) Step force disturbance in the high impedance case. (d) Wall interaction in the high impedance case.

Fig. 2. Interactions in low and high impedance cases. (a,c) step force disturbance with module equal to 170N. (b,d) interaction witha wall (brown box)) due to a roll movement. The impedance of the robot is set as low in (a) and (b), and it is set as high in (c) andin (d). In the case of low impedance the robot falls in case of an external force disturbance, but it is more robust to contacts with theenvironment. In the case of high impedance the robot is able to reject force disturbances, but the contact with the wall destabilizes it.This is highlighted by the dashed red circle: the robot loses support on two legs.

Other strategies focus on an online adaptation of theimpedance based on sensory feedbacks (mostly force sen-sors). In [13] the authors employ adaptive dynamic pro-gramming to find the impedance gains that minimize thetracking error and interaction forces with an unknown envi-ronment. The results are validated with simulations. In [5] animpedance tuning method based on time-varying Lyapunovstability margins is proposed and experimentally verified.The goal is to guarantee the balancing of a humanoid robotduring walking. In [14] is presented an impedance controllerthat automatically regulates the impedance of the robotbased on expected interaction forces and local sensory data(vision, tracking error and interaction forces). The authorsexperimentally validate it on a manipulation task.

All these works rely on force measurements to modifythe impedance of the robot accordingly, which may notbe available on the robot. In this paper we propose amethod for online Cartesian impedance planning of a robotwithout utilizing direct force measurements. The idea is toleverage the mass-spring-damper model resulting from theemployment of a task-space impedance controller. Insteadof changing the impedance based on force measurements,we define a constrained optimization problem in which thecost index aims at highest compliance (corresponding to themost interaction-tolerant robot behavior), and the constraintsguarantee the desired task performance. Such constraints takeinto account: desired maximum tracking error, robot config-uration, robot stability, expected disturbances and accuracywith which the environment is perceived. To the best ofauthors’ knowledge, this is the first time a legged robotwithout external force measurements is able to modulate itsimpedance online.

In this work we show that variable gains lead to better per-formance and robustness to unexpected interactions. We willrely on the impedance controller proposed in [3], developinga module that will plan the gains of this controller online.In particular, we will focus on the impedance modulation ofthe torso of the robot. Note that in this case the employmentof any force sensor at the point of contact is particularly

challenging since the interaction force could be applied atany point of the robot body. The proposed method will beexperimentally validated on the quadrupedal robot ANYmal[6] (Fig. 1).

II. PROBLEM DEFINITION

Legged robots are developed to solve tasks in unstructuredenvironment, thus their impedance modulation is a trade-offbetween performance in achieving the task and robustnessto unexpected environmental interactions. Low impedanceimproves the compliance to undesired contacts, while highimpedance improves tracking performance and disturbancerejection. Fig. 2 depicts two examples of this trade-off.The case of study is the response of the robot ANYmalin presence of a step force disturbance (Fig. 2(a,c)) or aninteraction with the environment (Fig. 2(b,d)). Both casesare analyzed with low (Fig. 2(a,b)) and high impedance(Fig. 2(c,d)). Comparing Fig. 2(a) and Fig. 2(c) shows thathigh impedance allows to reject force disturbances. On theother hand, high impedance (Fig. 2(d)) causes large contactforces that destabilize the robot, while low impedance allowssmaller contact forces (Fig. 2(b)), thus more robustness.

The aim of this paper is thus to plan the impedanceof the robot online, without relying on direct force sensormeasurements. Applying a task-space impedance controltechnique to a generic robotic system leads to a resultingbehavior described by the model [15], [3]

Λc(q) ¨x+D ˙x+Kx =−Fext , (1)

where x ∈ Rn, ˙x ∈ Rn and ¨x ∈ Rn are the task-space posi-tioning error vector w.r.t. the desired position vector x ∈ Rn

and its derivatives, n is the dimension of the task (usuallyn = 3 or n = 6), q ∈ Rnq are the generalized coordinates,Λc ∈ Rn×n is the constraint-consistent operational spaceinertia matrix [3], and D ∈ Rn×n and K ∈ Rn×n are thedamping and the stiffness matrices (i.e. the gain matricesof the impedance controller), respectively. Finally, Fext ∈Rn

is the vector of the external force disturbances. Note thatthe constraint-consistent operational space inertia matrix Λc

is usually configuration dependent. This is a more generalcase compared to studying a fixed inertia matrix M. Suchgenerality is necessary, because impedance control requiresa force sensor to modulate the inertia matrix.

Fig. 3 shows an overview of the control architecture.The impedance controller computes the input to have thedesired impedance behavior. This behavior is chosen by theimpedance planner, i.e. an optimization problem that we aregoing to formulate. The cost function is designed to reducethe contact forces resulting from a displacement imposed bythe environment, i.e. to reduce the system impedance. Theconstraints are designed to guarantee the desired performancein terms of tracking error and stability. The performance isset by boundaries that are chosen online by an other module.This module merges the information given by the robotstate, the environment and the user inputs. During the robotmotion, the configuration, the surrounding environment andthe user inputs (e.g. task) will change, leading to differentcontroller gains.

To formulate the optimization problem we leverage thesimplified model (1) resulting from the application of animpedance control technique. This solution allows to com-pute and modify online the impedance parameters. The goalof this paper is to improve the robustness of the robot, thuswe will define a cost function J(D,K) that minimizes theimpedance. Decreasing the impedance will lower accuracy,thus we will rely on the constraints to guarantee the desiredminimum performance. In particular, we can relate trackingperformance and stability (in the case of a legged robot) tothe position error, bounding it accordingly. We will assumethat the disturbances are due to non-zero initial conditions,i.e. x0 6= 0 and ˙x0 6= 0. Furthermore, we will assume thateach element x0,i and ˙x0,i of the initial condition vectors isbounded. Each element di, j and ki, j of the matrices D andK will also have its own bound due to the actuation systemand robot design. Hence, the problem formulation is

arg minD,K

J(D,K)

s.t. ldi, j ≤ di, j ≤ udi, j ∀ i, j = 1 . . .n

lki, j ≤ ki, j ≤ uki, j ∀ i, j = 1 . . .n

max(x0, ˙x0)

|xi(t)| ≤ bi ∀ i = 1 . . .n, ∀ t ∈ [0,∞)

s.t. Λc ¨x+D ˙x+Kx = 0lx0,i ≤ x0,i ≤ ux0,i ∀ i = 1 . . .n

l ˙x0,i≤ ˙x0,i ≤ u ˙x0,i

∀ i = 1 . . .n ,

(2)

where a boundary bi is defined (and changed online) for eachof the n axes of the robot. These values define the desiredperformance of the robot and are linked to the external envi-ronment, stability (i.e. support polygon) or desired trackingperformance (i.e. maximum tracking error).

Eq. (2) is a minimization problem with a maximizationconstraint. Since the initial conditions are uncertain, this is arobust optimization problem, similar to [12]. In Sec. III wewill show how to transform this robust optimization problemin a deterministic optimization problem.

Fig. 3. Scheme of the method. The robot is controlled withan impedance controller, whose gains are chosen online by theproposed impedance planner. Robot configuration, environment anduser inputs determine the performance boundaries that need to beguaranteed.

During the execution of the task, the configuration of therobot (thus Λc(q)) will change, together with the bounds bi,leading to different optimal gain matrices D and K.

It is worth noting also that in (2) Fext = 0, i.e. there areno force disturbance, hence we are assuming that all thedisturbances are due to non-null initial conditions. The initialvelocity error ˙x0,i can be interpreted as an external impulsivedisturbance Fext,i = Aδ (t) acting along the axis xi, where Ais the amplitude of the impulse and δ (t) is the Dirac Delta.

III. METHOD

In this section we describe the proposed impedance modu-lation method. Although this method could be applied to anyimpedance controller, we will rely on the one proposed in[3], in particular, we will focus on the impedance modulationof the center of mass (COM) of the quadrupedal robotANYmal. Under this circumstance, (1) refers to the task-space dynamics of the COM, thus x is the position errorof the COM w.r.t. the desired position given by the motionplanner. Note also that, in this case, n = 6.

In the following we will transform the robust optimizationproblem (2) in a deterministic optimization one. Furthermore,we will report a sufficient condition to guarantee the stabilityof the system with time varying gains.

A. System Decoupling and Transient Response

The constraint-consistent operational space inertia matrixΛc depends on the current configuration q. Modulatingalso the inertia matrix would require a torque sensor that,generally, is not installed on the robot. Here we assume thatΛc can be considered constant between two gain updates, i.e.that we can replace Λc with a constant matrix M ∈ Rn×n.Furthermore, since we are focusing on the impedance of thetorso, we have that Λc is diagonally dominant, thus, we canapproximate M as a diagonal matrix. Since D and K are thegain matrices of the impedance controller, we can set them asdiagonal too, decoupling (1) into n SISO sub-systems and (2)into n distinct optimization problems. Note that Λc can notbe approximated as diagonal in every system. Later, we willbriefly discuss how to extend the proposed method to full Λcmatrices, leaving a deeper investigation to future works.

The resulting SISO model of the system under analysis is

mi ¨xi +di ˙xi + kixi =−Fext,i , (3)

where mi ∈ R is the i-th diagonal element of the constantinertia matrix M, di ∈ R is the i-th diagonal element of thedamping matrix D, and ki ∈ R is the i-th diagonal elementof the stiffness matrix K. xi ∈ R, ˙xi ∈ R and ¨xi ∈ R are thei-th elements of x, ˙x and ¨x, i.e. the positioning error vectorand its derivatives. Fext,i ∈ R is i-th element of the externalforce disturbance vector Fext, i.e. the disturbance acting onthe axis xi. Hereinafter, if it is not strictly necessary, we willomit the subscript i for the sake of readability.

The system (3) will present one of the three well-knownbehaviors: under damped, critically damped or over damped.Since the optimizer will iteratively change the optimal valuefor d and k, the system could switch between these threecases. We opt to restrict our analysis to the critically dampedcase because it is the fastest to converge without oscillations,and, in legged robot stance phases, overshoots are usuallyavoided [2], [16], [5]. Under this choice, the impedancecontroller gains are related, reducing the dimensionality ofthe parameter space of (2). We choose to optimize over thedamping term d because it simplifies the resulting expression.Consequently, the stiffness k is

k =d2

4m. (4)

Remark 1: Note that the proposed method could be ap-plied also to set under-damped or over-damped gains, whichare not reported here for the sake of space. Their formula-tions differ only due to slight variations in the constraints.

B. Cost Function

Our goal is to achieve a behavior robust to environmentalinteractions. This can be obtained by decreasing the robotimpedance, thus the values of the gains d and k. Since we arefocusing on the critically damped case, the gains are relatedby (4). Hence, we can adopt the quadratic cost function

J(d) = d2 . (5)

The idea is to have a robot as soft as possible, while relyingon the constraints to guarantee stability and desired minimumperformance.

C. Constraints

To guarantee the desired performance and the robot stabil-ity, the idea is to compute the worst possible position errorand to tune the impedance gains accordingly. The evolutionof the position error is defined by (3). Given the boundaryconditions x0 and ˙x0, the analytical solution of (3) is

x(t) =(

x0 +

(˙x0 +

x0d2m

)t)

e−d t2m . (6)

Substituting (6) into (2) allows to remove the dynamicsfrom the optimization problem. Also considering the simpli-fications discussed in Sec. III-A and the cost function (5),

the optimization SISO sub-problem is

arg mind

d2

s.t. ld ≤ d ≤ ud

max(x0, ˙x0)

∣∣∣∣(x0 +

(˙x0 +

x0d2m

)t)

e−d t2m

∣∣∣∣≤ b ∀ t ∈ [0,∞)

s.t. lx0 ≤ x0 ≤ ux0

l ˙x0≤ ˙x0 ≤ u ˙x0

,(7)

where b is the strictest bound for the axis under analysis.Our goal is to respect the boundaries b (stability and

performance) at every time step in presence of unknowninitial conditions, hence we have to focus on the worstcase, i.e. the case where the signs of x0 and ˙x0 are equal.Thus, without loss of generality, we assume that x0 ≥ 0 and˙x0 ≥ 0. This allows us to remove the absolute value fromthe maximization constraint. We also add the time t as adecision variable in the maximization constraint of (7). Inother words, instead of considering every position error x(t),we will consider only its peak value over time. Thus, defining

x0,max , max(|lx0 |, ux0) , ˙x0,max , max(|l ˙x0|, u ˙x0

) , (8)

the maximization constraint of (7) can be rewritten as

max(x0, ˙x0,t)

(x0 +

(˙x0 +

x0d2m

)t)

e−d t2m

s.t. 0≤ x0 ≤ x0,max

0≤ ˙x0 ≤ ˙x0,max

t ≥ 0 .

(9)

At this point, we transform the robust optimization prob-lem (7) into a deterministic optimization problem. UsingKKT conditions [17] we prove that the value

x0 = x0,max˙x0 = ˙x0,max

t =4m2 ˙x0,max

d(2m ˙x0,max +dx0,max),

(10)

is the argument of the maximum of (9).Proof: The KKT conditions applied to (9) are

(1+ d t

2m

)e−

d t2m +µ1−µ2 = 0

te−d t2m +µ3−µ4 = 0((

1− d t2m

)˙x0− d2 t

4m2 x0

)e−

d t2m +µ5 = 0

−x0 ≤ 0x0− x0,max ≤ 0− ˙x0 ≤ 0˙x0− ˙x0,max ≤ 0−t ≤ 0µi ≥ 0 ∀i = 1 . . .5µ1 · (−x0) = 0µ2 · (x0− x0,max) = 0µ3 · (− ˙x0) = 0µ4 · ( ˙x0− ˙x0,max) = 0µ5 · (−t) = 0 .

(11)

Given (11) we have two stationary points, i.e. (x0, ˙x0, t) =(x0,max, 0, 0) and (10). Recalling that, for the examinedproblem, d > 0 and m > 0 and given (6), direct computationshows that (10) is the maximum of the function underanalysis.

Using (10) allows us to replace the robust optimizationproblem with a deterministic one. The maximization con-straint in (7) is replaced by

max(x0, ˙x0,t)

(x(t)) =2m ˙x0,max +dx0,max

de

(−2m ˙x0,max

2m ˙x0,max+dx0,max

)≤ b .

(12)It is worth noting that (12) is a nonlinear constraint.

Applying a conservative approach is possible to find an upperbound of (12) that is a linear constraint, i.e.

max(x0, ˙x0,t)

(x(t))≤ x0,max +2m · ˙x0,max

e ·d. (13)

Proof: Given the linearity of x0 and ˙x0 in (6), we have

x(t) = x0η1(d, t)+ ˙x0η2(d, t) , (14)

where

η1(d, t),(

1+d

2mt)

e−d t2m , η2(d, t), te−

d t2m . (15)

Maximizing (14), and recalling that the maximum of thesum of two terms is lesser or equal to the sum of themaximum of the two terms, we obtain

max(x0, ˙x0,t)

(x(t)) = max(x0, ˙x0,t)

(x0 ·η1(d, t)+ ˙x0 ·η2(d, t))≤

≤max(x0,t)

(x0 ·η1(d, t))+max( ˙x0,t)

( ˙x0 ·η2(d, t)) .(16)

Assuming x0, ˙x0, η1 and η2 as positive we have

max(x0,t)

(x0 ·η1(d, t)) = max(x0)

(x0) ·max(t)

(η1(d, t))

max( ˙x0,t)

( ˙x0 ·η2(d, t)) = max( ˙x0)

( ˙x0) ·max(t)

(η2(d, t)) .(17)

From (8) we have that max(x0)(x0) = x0,max andmax( ˙x0,t)(

˙x0) = ˙x0,max. Then, we can analytically compute themaximum of η1 and η2 differentiating over t and equatingthe derivative to 0, obtaining

max(t)

(η1(d, t)) = 1 , max(t)

(η2(d, t)) =2me ·d

. (18)

This leads to the thesis.

D. Closed Form Solution

Eq. (13) can be substituted for (12) in (7), leading to thedeterministic optimization problem

arg mind

d2

s.t. d ≥2m · ˙x0,max

(b− x0,max) · eld ≤ d ≤ ud ,

(19)

that has a closed form solution

d = min(

max(

ld,2m · ˙x0,max

(b− x0,max) · e

), ud

). (20)

Eq. (20) and (4) allow the robot programmer to directlymap the task requirements in the impedance parameters ofthe robot. ld is fixed by the actuators, m is given by theconfiguration of the robot, and x0,max and ˙x0,max are linkedto the expected disturbances, in particular, ˙x0,max is linked tothe expected impulsive force disturbance.

Finally, b is related to the stability margin of the robot andthe performance/task requirements. For example, b could beset as the distance between the edges of the support polygonand the planned COM position or as the distance between thehip and the foot of the robot to avoid limb singularities. Theuser can also manually set the desired tracking performanceby imposing a b value. This is useful, for example, if thetask changes from walking to manipulation. The boundary bfor each axes x will be set accordingly as the strictest of theset. During the execution of the task, m and b will change,leading to different gains values.

Remark 2: The structure of (20) ensures the existenceof a valid solution. However, dynamic model inaccuracies,hardware limitations and overly stringent task requirementsmight make it impossible to satisfy the desired performancein practice.

E. Stability Analysis

To prove that the gain variation does not lead to instability,we will rely on Theorem 1 presented in [18]. This theoremstates that a sufficient condition for the asymptotic stabilityof the equilibrium at the origin of the system

x(t)+d(t)m(t)

x(t)+k(t)m(t)

x(t) = 0 , (21)

is that d(t)/m(t) is continuous and bounded over time,k(t)/m(t) is positive, differentiable and bounded, and that

12

d(k(t)/m(t))dt

(k(t)m(t)

)−1

+d(t)m(t)

> 0 . (22)

Let us consider system (3). In this case all the conditionsof Theorem 1 in [18] are satisfied except for (22). Given (4),condition (22) becomes

d(t)>−d2(t)m(t)

+d(t)m(t)

m(t). (23)

Eq. (23) imposes a maximum changing rate for the damp-ing coefficient. Fulfilling condition (23) assures the stabilityof the equilibrium at the origin (i.e. zero tracking error)during the gain variation.

Eq. (23) is always fulfilled during a damping increment,but it may not be during a damping reduction. To fulfill thiscondition we approximate d as a difference quotient, thus(23) becomes

d(t +Ts)> d(t)− d2(t)Ts

m(t)+

d(t)m(t)Ts

m(t), (24)

where Ts is the sampling time. The term d(t + Ts) is thesolution (20) of the optimization problem. If (24) is satisfied,we can set the next damping value as (20), otherwise, wewill limit it to the maximum variation imposed by (24).

Note that we can not compute exactly m(t) online. Wecould approximate it using the planned motion, but since theterm d(t)m(t)Ts/m(t) is very small (the inertia matrix variesslowly in the system under analysis), we can substitute itwith a constant value tuned on the worst case scenario.

IV. SIMULATION RESULTS

In this section we test this method with two simulations.In each simulation we used Ts = 0.0025s x0,max = [0.0218m,0.0160m, 0.0057m, 0.0192rad, 0.0097rad, 0.0301rad]T and˙x0,max = [0.1729m/s, 0.1240m/s, 0.0635m/s, 0.4331rad/s,0.2396rad/s, 0.1633rad/s]T . These values are the maxi-mum position and velocity error occurred while walk-ing on an uneven terrain with the simulated robotusing the fixed impedance gains employed in [3].The impedance limits are set as lk = [300·1T

3 N/m,200·1T

3 Nm/rad]T , uk = [1800·1T3 N/m, 400·1T

3 Nm/rad]T , ld =[230·1T

3 Ns/m, 19Nms/rad, 36Nms/rad, 38Nms/rad]T andud = [450·1T

3 Ns/m, 34Nms/rad, 85Nms/rad, 69Nms/rad]T ,where 1T

3 , [1, 1, 1].In the first simulation we test the variation of the

impedance gains due to a change in configuration, thus achange in Λc. We simulate the robot approaching a tableand crawling under it. For this simulation the boundariesimposed by the user as task requirements are b = [0.04·1T

3 m,π/6 · 1T

3 rad]T . The results are reported in Fig. 4. Differentbackground colors define the different task phases: walking,transition and crawling. For the sake of space and readabilitywe report only the results for the translational axes, analo-gous result were obtained for the rotational ones. Fig 4(a)shows the change in the inertia matrix diagonal elementsdue to the change of configuration. This is considered bythe optimization problem that increases the damping in the xaxis (Fig. 4(b)). Thus, the tracking error continues to fulfillthe margin b imposed by the user (Fig. 4(c)). Note that alsothe translational stiffness changes as described by (4), butwe do not report its behavior here for the sake of space.

In the second simulation we test the variation of theimpedance gains due to a change in the task requirements.We simulate that the user increases the required trackingperformance from b = [0.06m, 0.055m, 0.05m, π/6 ·1T

3 rad]T

to b = [0.025m, 0.04m, 0.04m, 0.01·1T3 rad]T and then de-

creases them again. Fig. 5(a) shows that the imposed trackingperformance continues to be satisfied, while in Fig 5(b) and5(c) the variation of the impedance gains is reported. Thisshows how the user can easily change task without anytedious tuning of the gains. An increment in the gains doesnot cause instability as discussed in Sec. III-E. This meansthat the abrupt change in the gains does not compromise thestability, while the change at t = 37s must be slowed down topreserve it. The inserts in Fig. 5(b) highlight this difference.Note that there is an analogous behavior for the rotationalimpedance gains.

V. EXPERIMENTAL RESULTS

In this section we test the proposed method on the robotANYmal [6]. This is a quadrupedal platform with point feet.

This system weighs ∼ 30kg, is ∼ 0.5m tall, and each linkof its legs is ∼ 0.3m long. The robot presents a total of 12identical actuators (ANYdrive [6]), three per legs. These jointunits are compact series elastic actuators, i.e. actuators wherea mechanically compliant element is embedded between thegearbox output and the joint output shaft.

We used Ts = 0.0025s x0,max = [0.0337m, 0.0360m,0.0193m, 0.0231rad, 0.0279rad, 0.0211rad]T and ˙x0,max =[0.2157m/s, 0.1812m/s, 0.1264m/s, 0.2712rad/s, 0.2447rad/s,0.1409rad/s]T . These values are the maximum position andvelocity error occurred while walking on an uneven ter-rain with the real robot using the fixed gains employedin [3]. The impedance limits are set as lk = [300·1T

3 N/m,200·1T

3 Nm/rad]T , uk = [1800·1T3 N/m, 400·1T

3 Nm/rad]T , ld =[230·1T

3 Ns/m, 19Nms/rad, 36Nms/rad, 38Nms/rad]T andud = [450·1T

3 Ns/m, 34Nms/rad, 85Nms/rad, 69Nms/rad]T .Please, for what concern the experiments, refer also to thevideo attached to this paper.

A. Identification Procedure

Since the system model used in the impedance controlleris not accurate enough, there is a mismatch between thesimulated dynamics and the real one. In particular, we havethat the damping behavior of the system along the trans-lational axes is different from the modeled one. In order toreduce this issue, we used a least square approach to identifythe intrinsic damping and the mass of the system along thetranslational axes. We set the damping gain of the impedancecontroller to zero, and we applied several displacements tothe robot using different stiffness gains. In particular, weapplied the displacements x0 = {10, 20, 35, 50}mm for thex axis, x0 = {10, 20, 35, 50}mm for the y axis and x0 ={50, 100, 150, 170}mm for the z axis. Each displacementwas applied three times for each tested stiffness value. Intotal we tested three stiffness values: k = 800N/m, k =1000N/m and k = 1200N/m.The result of this identificationprocess is that dx = 50.4885Ns/m, dy = 64.4859Ns/m anddz = 75.6533Ns/m, while mx, my and mz are approximatelyequal to the diagonal elements of the matrix Λc in thedefault configuration. The identified damping values werethen imposed as an offset to the optimal values (20).

B. Obstacle Interaction Experiment

In this experiment the robot touches an unperceivedobstacle (placed at ∼ 0.6m from the ground) whilewalking. We repeated the experiment with low1 (K =diag([500·1T

3 N/m, 200·1T3 Nm/rad]), D = diag([90·1T

3 Ns/m,10·1T

3 Nms/rad])), high (K = diag([1500·1T3 N/m,

300·1T3 Nm/rad]), D = diag([200·1T

3 Ns/m, 80·1T3 Nms/rad]))

and variable impedance (b = [0.06m, 0.055m, 0.05m,π/6 · 1T

3 rad]). Fig. 6 shows a photo-sequence for the highand variable impedance cases. The low impedance case isomitted due to similarity to the variable gains scenario. Inthe case of high impedance the interaction force is too largecausing a failure on the robot. Conversely, in the case of

1Note that these values are the same employed in [3].

(a) Diagonal term of the inertia matrix. (b) Translational damping gains. (c) Translational tracking error.

Fig. 4. Simulation results of the crawling task. (a) The change of configuration causes a change in the inertia matrix of the robot. Thisis considered by the optimization problem that increases the damping gain for the x axis (b). (c) The tracking error does not exceed theperformance margins imposed by the user. Analogous results were obtained for the stiffness gains, not reported here for the sake of space.

(a) Translational tracking error. (b) Translational damping gains. (c) Translational stiffness gains.

Fig. 5. Simulation of a change in the task requirements. The user imposes a new desired maximum tracking error (a) and the methodautomatically set the new impedance gains (b,c). The inserts in (b) highlight the difference between increasing and decreasing the gains.The latter may cause instability so the variation must be slowed down as discussed in Sec. III-E. The fluctuation of the stiffness gainsin (c) are caused by the variation of the inertia matrix, via (4). Analogous results were obtained for the rotational impedance gains, notreported here for the sake of space.

Fig. 6. Photo-sequence of the robot touching an unperceivedobstacle. In the case of high impedance the robot can not cope withthe motion displacement caused by the environment. In the case ofvariable impedance, the robot is able to handle that disturbance.

variable impedance the robot is able to resist to the motiondisplacement caused by the environment. This shows thebenefit of minimizing the impedance, which is achieved

thanks to the choice of the cost function (5).

C. Rough Terrain Experiment

In this experiment the robot walks on rough terrain (Fig.1) using the controller proposed in [3] combined with lo-comotion planner described in [19]. The terrain is realizedrandomly combining 0.4×0.4m wooden tiles with differentslope steepness ({π/36, π/18, π/12}rad) and orientation.During the motion, the task requirements are changed fromb = [0.06m, 0.055m, 0.05m, π/6 ·1T

3 rad]T to b = [0.02·1T3 m,

π/6 ·1T3 rad]T and then reseted to the previous values. Fig. 7

reports the tracking error and the impedance variation duringthe experiment. The error stays in between the boundarieseven when these boundaries are tightened by the operatorbetween t = 38s and t = 53s. The roughness of the terrainand the changes in the support polygon cause variation on theimpedance gains throughout the motion. This allows to keepbalance and to achieve the desired tracking performance.Note that the damping coefficient reach smaller values thanld due to the offset imposed by the intrinsic property ofthe system identified in Sec. V-A. Despite the differentconditions between simulation and this experiment the results

(a) Translational tracking error. (b) Translational damping coefficients. (c) Translational stiffness coefficients.

Fig. 7. Walking on a rough terrain experiment. (a) The user imposes a new desired maximum tracking error at t = 38s and resets it att = 53s. The tracking error is in between the boundaries. (b,c) The method automatically set the new impedance gains. The roughnessof the terrain and the changes in the support polygon cause variation on the impedance gains throughout the motion.

are qualitatively similar. Therefore, this experiment supportsthe soundness of the proposed method.

VI. CONCLUSIONSIn this work we presented a method to plan the impedance

of legged robots online to improve their robustness to un-expected interactions with the environment. This methodoptimizes the impedance based on the desired performanceand expected disturbances, without relying on direct forcemeasurements. Theoretical analysis provided a closed formsolution to the robust optimization problem and guaranteesof the system stability during the gain variation.

The proposed method was tested on simulation and onhardware in challenging scenarios involving rough terrainand unperceived obstacle interaction. Results show that thistechnique can vary the controller gains online, adapting themto different robot configurations or task requirements. Indeed,the user can change the desired tracking performance withno need of tedious and time-consuming tuning procedure.

Future works will focus on extending this method tosystem with full inertia matrices such as manipulators. Inthe case this matrix is not diagonal, the decoupled approachused in this paper can not be applied. However, the discussedoptimization problem could be tackled evaluating the maxi-mum error via mixed integer optimization.

REFERENCES

[1] Neville Hogan. Impedance control: An approach to manipulation:Part iiimplementation. Journal of dynamic systems, measurement, andcontrol, 107(1):8–16, 1985.

[2] Jong Hyeon Park. Impedance control for biped robot locomotion.IEEE Transactions on Robotics and Automation, 17(6):870–882, 2001.

[3] Guiyang Xin, Hsiu-Chin Lin, Joshua Smith, Oguzhan Cebe, andMichael Mistry. A model-based hierarchical controller for leggedsystems subject to external disturbances. In 2018 IEEE InternationalConference on Robotics and Automation (ICRA), pages 4375–4382.IEEE, 2018.

[4] Claudio Semini, Victor Barasuol, Thiago Boaventura, Marco Frigerio,Michele Focchi, Darwin G Caldwell, and Jonas Buchli. Towards versa-tile legged robots through active impedance control. The InternationalJournal of Robotics Research, 34(7):1003–1020, 2015.

[5] Emmanouil Spyrakos-Papastavridis, Peter RN Childs, and Nikos GTsagarakis. Variable impedance walking using time-varying lyapunovstability margins. In 2017 IEEE-RAS 17th International Conferenceon Humanoid Robotics (Humanoids), pages 318–323. IEEE, 2017.

[6] Marco Hutter, Christian Gehring, Andreas Lauber, Fabian Gunther,Carmine Dario Bellicoso, Vassilios Tsounis, Peter Fankhauser, RemoDiethelm, Samuel Bachmann, Michael Blosch, et al. Anymal-toward legged robots for harsh environments. Advanced Robotics,31(17):918–931, 2017.

[7] Haifa Mehdi and Olfa Boubaker. Impedance controller tuned byparticle swarm optimization for robotic arms. International Journalof Advanced Robotic Systems, 8(5):57, 2011.

[8] Jonas Buchli, Freek Stulp, Evangelos Theodorou, and Stefan Schaal.Learning variable impedance control. The International Journal ofRobotics Research, 30(7):820–833, 2011.

[9] Miao Li, Hang Yin, Kenji Tahara, and Aude Billard. Learningobject-level impedance control for robust grasping and dexterousmanipulation. In 2014 IEEE International Conference on Roboticsand Automation (ICRA), pages 6784–6791. IEEE, 2014.

[10] Tasuku Yamawaki, Hiroki Ishikawa, and Masahito Yashima. Iterativelearning of variable impedance control for human-robot cooperation.In 2016 IEEE/RSJ International Conference on Intelligent Robots andSystems (IROS), pages 839–844. IEEE, 2016.

[11] Chenguang Yang, Gowrishankar Ganesh, Sami Haddadin, SvenParusel, Alin Albu-Schaeffer, and Etienne Burdet. Human-like adapta-tion of force and impedance in stable and unstable interactions. IEEEtransactions on robotics, 27(5):918–930, 2011.

[12] G Maria Gasparri, Filippo Fabiani, Manolo Garabini, Lucia Pallottino,M Catalano, Giorgio Grioli, R Persichin, and Antonio Bicchi. Robustoptimization of system compliance for physical interaction in uncertainscenarios. In 2016 IEEE-RAS 16th International Conference onHumanoid Robots (Humanoids), pages 911–918. IEEE, 2016.

[13] Shuzhi Sam Ge, Yanan Li, and Chen Wang. Impedance adaptationfor optimal robot–environment interaction. International Journal ofControl, 87(2):249–263, 2014.

[14] Pietro Balatti, Dimitrios Kanoulas, Giuseppe F Rigano, Luca Mu-ratore, Nikos G Tsagarakis, and Arash Ajoudani. A self-tuningimpedance controller for autonomous robotic manipulation. In 2018IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS), pages 5885–5891. IEEE, 2018.

[15] Luigi Villani and Joris De Schutter. Force control. In SpringerHandbook of Robotics, pages 195–220. Springer, 2016.

[16] Tomomichi Sugihara and Yoshihiko Nakamura. Contact phase invari-ant control for humanoid robot based on variable impedant invertedpendulum model. In 2003 IEEE International Conference on Roboticsand Automation (Cat. No. 03CH37422), volume 1, pages 51–56. IEEE,2003.

[17] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cam-bridge university press, 2004.

[18] Alexander O Ignatyev. Stability of a linear oscillator with variableparameters. Electronic Journal of Differential Equations, 1997(17):1–6, 1997.

[19] Peter Fankhauser, Marko Bjelonic, C Dario Bellicoso, TakahiroMiki, and Marco Hutter. Robust rough-terrain locomotion witha quadrupedal robot. In 2018 IEEE International Conference onRobotics and Automation (ICRA), pages 1–8. IEEE, 2018.

online optimal impedance planning for legged...

Documents