Analysis of Theoretical and Numerical Properties of Sequential Convex Programming for Continuous-Time Optimal Control

Riccardo Bonalli, Thomas Lew, and Marco Pavone

Abstract— Sequential Convex Programming (SCP) has recently gained significant popularity as an effective method for solving optimal control problems and has been successfully applied in several different domains. However, the theoretical analysis of SCP has received comparatively limited attention, and it is often restricted to discrete-time formulations. In this paper, we present a unifying theoretical analysis of a fairly general class of SCP procedures for continuous-time optimal control problems. In addition to the derivation of convergence guarantees in a continuous-time setting, our analysis reveals two new numerical and practical insights. First, we show how one can more easily account for manifold-type constraints, which are a defining feature of optimal control of mechanical systems. Second, we show how our theoretical analysis can be leveraged to accelerate SCP-based optimal control methods by infusing techniques from indirect optimal control.

Index Terms— Optimal control, Nonlinear systems, Constrained control, Algebraic/geometric methods, Variational methods.

I. INTRODUCTION

SINCE its first appearance more than five decades ago, Sequential Convex Programming (SCP) [1], [2] has proven to be a powerful and reliable algorithmic framework for non-convex optimization, and it has recently gained new popularity in aerospace [3]–[6] and robotics [7]–[10]. In its most general form, SCP entails finding a locally-optimal solution to a non-convex optimization problem as the limit point of a sequence of solutions to convex subproblems formed by successive approximations. The main advantage offered by this approach is the ability to leverage a wide spectrum of numerical techniques to efficiently solve each convex subproblem [11]–[14], leading to near-real-time numerical schemes. For example, among the most mature SCP paradigms we find the well-known Sequential Quadratic Programming (SQP) method [15]–[17].

Through the years, SCP's sound performance has pushed the community towards deep investigations of the theoretical nature of this method. The most informative result states that when convergence is achieved, SCP finds a candidate local optimum for the original non-convex problem, i.e., a solution that satisfies necessary conditions for local optimality [18]–[20] (convergence rate results have also been derived, see, e.g., [21]). When used in the context of non-convex optimal control, the SCP convexification scheme is usually applied to the non-convex program that stems from a discretization of the original continuous-time problem, providing only partial insights with respect to the original continuous-time formulation. For instance, are those guarantees only applicable to specific discretization schemes? Can insights from continuous-time analysis be leveraged to improve SCP-based optimal control methods? To the best of our knowledge, the only continuous-time analysis of SCP-based optimal control is provided in [5], though the optimal control context considered by the authors is very specific and the conditions for optimality used are weaker than those in the state-of-the-art for continuous-time optimal control (see our discussion in Section III-C).

This paper was submitted for review in July 2021. This research was supported by the National Science Foundation under the CPS program (grant #1931815) and by the King Abdulaziz City for Science and Technology (KACST). The authors are with the Department of Aeronautics & Astronautics, Stanford University, Stanford, CA 94305-4035 USA (emails: [email protected], [email protected], [email protected]).

Statement of contributions: In this paper we contribute to filling the existing gap in the theoretical analysis of SCP-based optimal control methods by providing a unifying analysis of a wide class of SCP procedures for continuous-time (non-convex) optimal control. Our main result consists of proving that, under a minimal set of assumptions, any accumulation point of the sequence of solutions returned by SCP satisfies the Pontryagin Maximum Principle (PMP) [22], [23] associated with the original formulation. The PMP represents a set of necessary conditions for optimality in continuous-time optimal control that is stronger than the traditional Lagrange multiplier rules (the latter were investigated in [5]), and it often represents the best result one might hope for in nonlinear optimal control. Our convergence result stems from an analysis of the continuity, with respect to convexification, of the Pontryagin cones of variations, tools originally introduced by Pontryagin and his group to prove the PMP. In addition, we relax some technical assumptions that are often difficult to verify in practice and that have been considered in [5] (e.g., strong compactness of the set of admissible controls is replaced by weak compactness), thus enlarging the class of problems that can be solved by SCP with guarantees.

Our continuous-time analysis provides a generalization of several existing discrete-time results and reveals new insights into the nature of SCP applied to optimal control, ultimately offering three key advantages. First, we can transfer theoretical guarantees to any discrete-time implementation of the continuous-time SCP-based optimal control formulation,
regardless of the time-discretization scheme adopted. Second, we can directly and effectively extend these guarantees to the setting with manifold-type constraints, i.e., nonlinear state equality constraints often found when dealing with mechanical systems. Third, we can provide a powerful connection to indirect methods for optimal control such as (indirect) shooting methods [24], enabling the design of numerical schemes that accelerate the convergence of SCP.

Specifically, our contributions are as follows: (1) We derive theoretical guarantees for continuous-time SCP-based optimal control methods, whose related sequence of convex subproblems stems from the successive linearization of all nonlinear terms in the dynamics and all non-convex functions in the cost. In particular, we apply this analysis to finite-horizon, finite-dimensional, non-convex optimal control problems with control-affine dynamics. (2) Through a study of the continuity of the Pontryagin cones of variations with respect to linearization, we prove that whenever the sequence of SCP iterates converges (under specific topologies), we find a solution satisfying the PMP associated with the original formulation. In addition, we prove that, up to some subsequence, the aforementioned sequence always has an accumulation point, which provides a weak guarantee of success for SCP ("weak" in the sense that only a subsequence of the sequence of SCP iterates can be proved to converge). (3) We leverage the continuous-time analysis to design a novel and efficient approach to account for manifold-type constraints. Specifically, we show that, under mild assumptions, one can solve the original formulation (i.e., with manifold-type constraints) with convergence guarantees by applying SCP to a new optimal control problem where those constraints are simply ignored, thereby simplifying numerical implementation. (4) As a byproduct, our analysis shows that the sequence of multipliers associated with the sequence of convex subproblems converges to a multiplier for the original formulation. We show via numerical experiments how this property can be used to considerably accelerate convergence rates by infusing techniques from indirect control.

Previous versions of this work have appeared in [9], [10]. In this paper, we provide as additional contributions (i) a new formulation with more general cost functionals, (ii) convergence proofs under weaker assumptions, (iii) detailed explanations on "transferring" theoretical guarantees under time discretizations, and (iv) extensive numerical simulations for the acceleration procedure based on indirect methods.

We highlight two main limitations of our work. First, since SCP is a local optimization algorithm, our theoretical guarantees are necessarily local (this is arguably unavoidable given the local nature of SCP). Second, the assumption of control-affine dynamics plays a crucial (though technical) role in our convergence analysis. The extension of our results to the more general setting represents an open research question.

Organization: The paper is organized as follows. Section II introduces notation and the continuous-time non-convex optimal control problem we wish to study. Our convergence analysis of SCP-based optimal control methods is split in two sections: in Section III, convergence is analyzed in the absence of manifold-type constraints, and in Section IV we account for manifold-type constraints. We show in Section V how our theoretical analysis can be used to design convergence acceleration procedures, which we evaluate through numerical experiments in Section VI. Finally, Section VII provides final remarks and directions for future research.

II. PROBLEM FORMULATION

Our objective consists of providing locally-optimal solutions to Optimal Control Problems (OCP) of the form:

$$
\min_{t_f\in[0,T],\ u\in\mathcal{U}} \int_0^{t_f} f^0(s,x(s),u(s))\,ds \triangleq \int_0^{t_f}\Big(G(s,u(s)) + H(s,x(s)) + L^0(s,x(s)) + \sum_{i=1}^m u^i(s)\,L^i(s,x(s))\Big)\,ds
$$

$$
\dot{x}(s) = f(s,x(s),u(s)) \triangleq f_0(s,x(s)) + \sum_{i=1}^m u^i(s)\,f_i(s,x(s)),\quad \text{a.e. } s\in[0,T]
$$

$$
x(0) = x_0,\qquad g(x(t_f)) = 0,\qquad x(s)\in M\subseteq\mathbb{R}^n,\ s\in[0,t_f]
$$

where the variable x denotes the state, and we optimize over the final time t_f ∈ [0,T] (whenever free), where T > 0 is some fixed maximal final time, and over controls u ∈ 𝒰 ≜ L^2([0,T];U), the space of square-integrable controls defined on [0,T] with image in U, where U ⊆ R^m is a convex and compact subset. The set 𝒰 contains all the admissible controls. The mappings L^i : R^{n+1} → R, f_i : R^{n+1} → R^n, for i = 0,...,m, and g : R^n → R^{ℓ_g} are assumed to be smooth (i.e., at least continuously differentiable), whereas we consider smooth mappings G : R^{m+1} → R and H : R^{n+1} → R that are convex with respect to the variables u and x, respectively. We require that the vector fields f_i, i = 0,...,m, have compact supports and that 0 is a regular value for g, so that g^{-1}(0) is a submanifold of R^n, and that g(x_0) ≠ 0, so that no trivial solutions t_f = 0 exist. In addition, we may require optimal trajectories to satisfy manifold-type constraints of the form x(s) ∈ M, s ∈ [0,t_f], where M ⊆ R^n is a smooth d-dimensional submanifold of R^n; in this case, the initial condition x_0 ∈ R^n lies within M. In OCP, the mappings f and f^0 model control-affine nonlinear dynamics, which are satisfied almost everywhere (a.e.) in [0,T], and a non-convex-in-state cost, respectively. In particular, we leverage the fact that the controls u appear in the cost f^0 through either convex or linear terms only to establish convergence guarantees. Any (locally-optimal) solution to OCP is denoted by (t_f^*, x^*, u^*), where the control u^* lies in L^2([0,T];U) and x^* : [0,T] → R^n is an absolutely-continuous trajectory.

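To fix ideas on the control-affine structure above, the following minimal sketch evaluates the dynamics f and the running cost f^0 from user-supplied drift and control vector fields. The function names and the use of NumPy are our own illustrative choices and are not part of the formulation.

```python
import numpy as np

def dynamics(s, x, u, f_list):
    """Control-affine dynamics f(s,x,u) = f_0(s,x) + sum_i u^i f_i(s,x).

    f_list = [f_0, f_1, ..., f_m], each mapping (s, x) -> R^n."""
    dx = f_list[0](s, x)
    for i, fi in enumerate(f_list[1:]):
        dx = dx + u[i] * fi(s, x)
    return dx

def running_cost(s, x, u, G, H, L_list):
    """Running cost f^0(s,x,u) = G(s,u) + H(s,x) + L^0(s,x) + sum_i u^i L^i(s,x)."""
    c = G(s, u) + H(s, x) + L_list[0](s, x)
    for i, Li in enumerate(L_list[1:]):
        c = c + u[i] * Li(s, x)
    return c
```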
Remark 2.1: The requirement that the vector fields f_i, i = 0,...,m, have compact supports is not restrictive in practice, for we may multiply the f_i by some smooth cut-off function whose support is contained in some arbitrarily large compact set that contains the states x ∈ R^n which are relevant to the given application domain. Importantly, as a standard result, this property implies that the trajectory solutions to the dynamics of OCP (and to the dynamics of every other problem that will be defined later) are defined and uniformly bounded for all times s ∈ [0,T]; see Lemma 3.1 for a more precise statement. From this last observation and Filippov's theorem, we infer the
existence of (at least locally) optimal solutions to OCP as soon as OCP is feasible (see, e.g., [25]). Sufficient conditions for the feasibility of OCP exist and are related to the Lie algebra generated by f_1,...,f_m. In particular, these conditions are generic (more details may be found in [26]; see also Section III-A).

Remark 2.2: Many applications of interest often involve state constraints c(s,x(s)) ≤ 0, s ∈ [0,t_f], where the mapping c : R^{n+1} → R^{ℓ_c} is smooth and non-convex. One common way of solving such constrained problems hinges on the penalization of the state constraints within the cost, thus reducing the original problem to OCP. Specifically, given a penalization weight ω = (ω_1,...,ω_{ℓ_c}) ∈ [0,ω_max]^{ℓ_c}, one may introduce the mapping

$$
L^0_\omega(s,x) \triangleq L^0(s,x) + \sum_{i=1}^{\ell_c} \omega_i\, h(c_i(s,x)),
$$

where h : R → R_+ is any continuously differentiable penalization function such that h(z) = 0 for z ≤ 0 (e.g., h(z) = 0 for z ≤ 0 and h(z) = z^2 for z > 0). The constrained problem is reduced to OCP by dropping the state constraints and replacing the running cost function L^0 with L^0_ω (note that L^0_ω is smooth but not necessarily convex). The parameter ω is selected by the user and weighs the presence of state constraints; the higher the value, the larger the penalization for the violation of state constraints. We will use this remark for numerical experiments in Section VI. We refer to [17] for the analysis of the convergence for ω → ∞ of penalty methods toward solutions of constrained optimization problems, which lies outside the scope of this work.
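For concreteness, here is a minimal sketch of the penalization of Remark 2.2 with the quadratic choice of h mentioned above; the constraint mapping `c` and the weight vector `omega` are illustrative placeholders.

```python
import numpy as np

def h(z):
    """C^1 penalization: h(z) = 0 for z <= 0 and h(z) = z^2 for z > 0."""
    return np.where(z > 0.0, z ** 2, 0.0)

def L0_omega(s, x, L0, c, omega):
    """Penalized running cost L^0_omega(s,x) = L^0(s,x) + sum_i omega_i h(c_i(s,x))."""
    return L0(s, x) + np.dot(omega, h(c(s, x)))
```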

The OCP problem is in general difficult to solve because of the presence of nonlinear dynamics and non-convex cost. The solution strategy proposed in this work is based on SCP.

III. SEQUENTIAL CONVEX PROGRAMMING WITHOUT MANIFOLD-TYPE CONSTRAINTS

As a first step, we develop our SCP framework without considering manifold-type constraints, showing later how the whole formalism can be adapted to the presence of those constraints. Dropping the manifold-type constraints, OCP takes the simpler form:

$$
\min_{t_f\in[0,T],\ u\in\mathcal{U}} \int_0^{t_f} f^0(s,x(s),u(s))\,ds
$$

$$
\dot{x}(s) = f(s,x(s),u(s)),\quad \text{a.e. } s\in[0,T],\qquad x(0) = x_0,\quad g(x(t_f)) = 0.
$$

SCP entails finding a locally-optimal solution to OCP as a limit point of a sequence of solutions to convex subproblems coming from successive approximations to OCP. Although several different approximation schemes have been introduced in the literature, in this work we focus on arguably the simplest one, which is to linearize any nonlinear term in the dynamics and any non-convex function in the cost. The two main advantages of this approach are the ease of computing linearizations and the absence of high-order singular Jacobians, which can cause the SCP problem to be ill-posed (e.g., SQP requires additional procedures to ensure positive definiteness of Hessians [17]).

A. Design of Convex Subproblems

Assume we are given (t_f^0, x_0, u_0), where t_f^0 ∈ [0,T] is a guessed final time, u_0 : [0,T] → R^m is piecewise continuous, and x_0 : [0,T] → R^n is absolutely continuous. This tuple represents the initializing guess for the SCP procedure. Importantly, we do not require (x_0, u_0) to be feasible for OCP, though feasibility of (x_0, u_0) and closeness to a satisfactory trajectory increase the chances of rapid convergence. We will address this point further in the numerical experiment section. A sequence of convex optimal control problems is defined by induction as follows: given a sequence (Δ_k)_{k∈N} ⊆ R_+, the Linearized Optimal Control subProblem (LOCP^Δ_{k+1}) at iteration k+1, subject to trust-region radius Δ_{k+1} > 0, is

$$
\min_{t_f\in[0,T],\ u\in\mathcal{U}} \int_0^{t_f} f^0_{k+1}(s,x(s),u(s))\,ds \triangleq \int_0^{t_f}\bigg(G(s,u(s)) + H(s,x(s)) + L^0(s,x_k(s)) + \sum_{i=1}^m u^i(s)\,L^i(s,x_k(s)) + \Big(\frac{\partial L^0}{\partial x}(s,x_k(s)) + \sum_{i=1}^m u^i_k(s)\,\frac{\partial L^i}{\partial x}(s,x_k(s))\Big)\big(x(s)-x_k(s)\big)\bigg)\,ds
$$

$$
\dot{x}(s) = f_{k+1}(s,x(s),u(s)) \triangleq f_0(s,x_k(s)) + \sum_{i=1}^m u^i(s)\,f_i(s,x_k(s)) + \Big(\frac{\partial f_0}{\partial x}(s,x_k(s)) + \sum_{i=1}^m u^i_k(s)\,\frac{\partial f_i}{\partial x}(s,x_k(s))\Big)\big(x(s)-x_k(s)\big),\quad \text{a.e. } s\in[0,T],\qquad x(0)=x_0
$$

$$
g_{k+1}(x(t_f)) \triangleq g(x_k(t_f^k)) + \frac{\partial g}{\partial x}(x_k(t_f^k))\big(x(t_f)-x_k(t_f^k)\big) = 0
$$

$$
|t_f - t_f^k| \le \Delta_{k+1},\qquad \int_0^T \|x(s)-x_k(s)\|^2\,ds \le \Delta_{k+1}
$$

where all the non-convex contributions of OCP have been linearized around (t_f^k, x_k, u_k), which for k ≥ 1 is a solution to the subproblem LOCP^Δ_k at the previous iteration. Accordingly, (t_f^{k+1}, x_{k+1}, u_{k+1}) always denotes a solution to the subproblem LOCP^Δ_{k+1}. Each subproblem LOCP^Δ_k is convex in the sense that, after a discretization in time through any time-linear integration scheme (e.g., Euler schemes, trapezoidal rule, etc.), we end up with a finite-dimensional convex program that can be solved numerically via convex optimization methods. In particular, linearizations of G and H are not required, since the contribution of those mappings is already convex. Finally, we have introduced convex trust-region constraints

$$
|t_f - t_f^k| \le \Delta_{k+1},\qquad \int_0^T \|x(s)-x_k(s)\|^2\,ds \le \Delta_{k+1}. \qquad (1)
$$

These are crucial to guiding the convergence of SCP in the presence of linearization errors. Since the control variable already appears linearly within the non-convex quantities defining OCP, trust-region constraints are not needed for the control. We remark that although it might seem more natural to impose pointwise trust-region constraints at each time s ∈ [0,T], the L^2-type constraints (1) are sufficient to perform a convergence analysis, and importantly, they are less restrictive. The trust-region radii (Δ_k)_{k∈N} ⊆ R_+ represent optimization parameters and may be updated through iterations to improve the search
for a solution at the next iteration. Effective choices of such an updating rule will be discussed in the next section.
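To make the construction of LOCP^Δ_{k+1} concrete, the sketch below assembles the data of the linearized (affine) dynamics around the previous iterate (x_k, u_k) at a given time. Jacobians are approximated by finite differences purely for illustration, and all function names are placeholders rather than part of the method.

```python
import numpy as np

def finite_diff_jacobian(fun, s, x, eps=1e-6):
    """Forward-difference Jacobian of x -> fun(s, x) (illustration only)."""
    n = x.size
    J = np.zeros((fun(s, x).size, n))
    for j in range(n):
        dx = np.zeros(n); dx[j] = eps
        J[:, j] = (fun(s, x + dx) - fun(s, x)) / eps
    return J

def linearized_dynamics_data(s, xk_s, uk_s, f_list):
    """Data of the affine dynamics of LOCP_{k+1} at time s:
       xdot = b + B @ u + A @ (x - xk_s), linearized around (xk_s, uk_s)."""
    b = f_list[0](s, xk_s)                                   # drift value f_0(s, x_k)
    B = np.column_stack([fi(s, xk_s) for fi in f_list[1:]])  # control columns f_i(s, x_k)
    A = finite_diff_jacobian(f_list[0], s, xk_s)             # d f_0 / dx
    for i, fi in enumerate(f_list[1:]):
        A = A + uk_s[i] * finite_diff_jacobian(fi, s, xk_s)  # + sum_i u_k^i d f_i / dx
    return b, B, A
```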

The definition of every convex subproblem by induction makes sense only if we can claim that: (1) at each step, the optimal trajectory x_k is defined on the entire interval [0,T], and (2) there exists (at least one) optimal solution at each step. The answer to the first question is contained in the following lemma, whose proof relies on a routine application of the Gronwall inequality and is postponed to the Appendix.

Lemma 3.1 (Boundedness of trajectories): Let supp f_i denote the support of f_i, i = 0,...,m. If U and supp f_i, i = 0,...,m, are compact, then each x_k is defined on the entire interval [0,T] and is uniformly bounded for every s ∈ [0,T] and every k ∈ N.

To answer the second question, we should provide sufficient conditions under which LOCP^Δ_{k+1} admits a solution for each k ∈ N. To this purpose, we assume the following:

(A1) For every k ∈ N, the subproblem LOCP^Δ_{k+1} is feasible.

As a classical result, under (A1), for every k ∈ N, the subproblem LOCP^Δ_{k+1} has an optimal solution (t_f^{k+1}, x_{k+1}, u_{k+1}), which makes the above definition of each convex subproblem by induction well-posed (see, e.g., [25]).

Remark 3.1: In practical contexts, (A1) is often satisfied. This assumption is well-motivated because, up to a slight modification, each subproblem LOCP^Δ_k is generically feasible in the following sense. In the presence of (1), the feasibility of each subproblem would be a consequence of the controllability of its linear dynamics, which is in turn equivalent to the invertibility of its Gramian matrix (see, e.g., [25]; the constraints (1) force any admissible trajectory of LOCP^Δ_{k+1} to lie within a tubular neighborhood around x_k, and thus, as a classical result, the controllability criterion in [25] applies by restriction to this tubular neighborhood). Since the subset of invertible matrices is dense, Gramian matrices are invertible with probability one. Linearized dynamics are thus almost always controllable, which implies that each subproblem is feasible. As an important remark, feasibility is preserved through time discretization, making any time-discretized version of the convex subproblems well-posed numerically. Indeed, time discretization maps the continuous linear dynamics into a system of linear equations. Since the set of full-rank matrices is also dense, similar reasoning shows that the discretized subproblems are also almost always feasible. In conclusion, (A1) is a mild and well-justified assumption.
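As a numerical illustration of the controllability criterion invoked above, one can form the Gramian of a time-discretized version of the linearized dynamics and test its rank; the Euler discretization and the tolerance below are arbitrary choices for this sketch, not prescriptions of the paper.

```python
import numpy as np

def controllability_gramian(A_seq, B_seq, dt):
    """Gramian of the discretized LTV system x_{j+1} = (I + dt*A_j) x_j + dt*B_j u_j."""
    n = A_seq[0].shape[0]
    transitions = [np.eye(n) + dt * A for A in A_seq]
    W = np.zeros((n, n))
    for j in range(len(A_seq)):
        Phi_j = np.eye(n)                 # state-transition matrix from step j to the final step
        for M in transitions[j + 1:]:
            Phi_j = M @ Phi_j
        Bj = dt * B_seq[j]
        W += Phi_j @ Bj @ Bj.T @ Phi_j.T
    return W

def is_controllable(A_seq, B_seq, dt, tol=1e-9):
    """Feasibility proxy from Remark 3.1: the Gramian should be invertible (full rank)."""
    W = controllability_gramian(A_seq, B_seq, dt)
    return np.linalg.matrix_rank(W, tol=tol) == W.shape[0]
```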

Before discussing the SCP pseudo-algorithm that we introduce to sequentially solve each subproblem LOCP^Δ_k, it is worth introducing one last class of subproblems. Specifically, we inductively define the Linearized Optimal Control subProblem (LOCP_{k+1}) at iteration k+1 as LOCP^Δ_{k+1} modified to remove the trust-region constraints (i.e., dropping (1)). Those subproblems are optimal control problems without state constraints, and they will play a key role in developing our theoretical result of convergence. The well-posedness of each subproblem LOCP_k and the existence of an optimal solution at each iteration directly come from (A1), as discussed earlier.

B. Algorithmic Framework

The objective of our SCP formulation can be stated as follows: to find locally-optimal solutions to OCP by iteratively solving each subproblem LOCP^Δ_k until the sequence (Δ_k, t_f^k, x_k, u_k)_{k∈N}, where (t_f^k, x_k, u_k) is a solution to LOCP^Δ_k, satisfies some convergence criterion (to be defined later). We propose pursuing this objective by adopting (pseudo-)Algorithm 1, which is designed to return a locally-optimal solution to OCP, up to small approximation errors.

Algorithm 1: Sequential Convex Programming
  Input: Guess trajectory x_0 and control u_0.
  Output: Solution (t_f^k, x_k, u_k) to LOCP^Δ_k for some k.
  Data: Initial trust-region radius Δ_0 > 0.
  1 begin
  2   k = 0, Δ_{k+1} = Δ_k
  3   while (u_k)_{k∈N} has not converged do
  4     Solve LOCP^Δ_{k+1} for (t_f^{k+1}, x_{k+1}, u_{k+1})
  5     Δ_{k+1} = UpdateRule(t_f^{k+1}, x_{k+1}, u_{k+1}, t_f^k, x_k, u_k)
  6     k ← k + 1
  7 return (t_f^{k-1}, x_{k-1}, u_{k-1})

Algorithm 1 requires the user to provide a rule UpdateRule to update the values of the trust-region radius. This rule should primarily aim to prevent accepting solutions at each iteration that are misguided by significant linearization error. A priori, we only require that UpdateRule is such that the sequence of trust-region radii (Δ_k)_{k∈N} converges to zero (in particular, (Δ_k)_{k∈N} is bounded). In the next section, we show that this numerical requirement, together with other mild assumptions, is sufficient to establish convergence guarantees for Algorithm 1. An example of UpdateRule will be provided in Section VI when discussing numerical simulations.
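A minimal sketch of Algorithm 1 in this spirit is given below; `solve_locp` stands for any convex solver applied to a (time-discretized) subproblem LOCP^Δ_{k+1}, the discrete norm used as stopping test is a proxy for the L^2 criterion discussed next, and the geometric shrinking of the radius is merely one admissible UpdateRule consistent with the requirement Δ_k → 0, not the rule used in the paper's experiments.

```python
import numpy as np

def scp(solve_locp, tf_guess, x_guess, u_guess,
        delta0=1.0, shrink=0.8, tol=1e-4, max_iters=50):
    """Sketch of Algorithm 1: iteratively solve convex subproblems LOCP^Delta_{k+1}."""
    tfk, xk, uk, delta = tf_guess, x_guess, u_guess, delta0
    for k in range(max_iters):
        # Solve the subproblem linearized around (tfk, xk, uk) with trust-region radius delta.
        tf_new, x_new, u_new = solve_locp(tfk, xk, uk, delta)
        # Termination: numerical convergence of the control iterates (discrete L2 proxy).
        if np.linalg.norm(u_new - uk) < tol:
            return tf_new, x_new, u_new
        # UpdateRule: any rule driving Delta_k to zero is admissible; geometric decay is one choice.
        delta *= shrink
        tfk, xk, uk = tf_new, x_new, u_new
    return tfk, xk, uk
```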

The algorithm terminates when the sequence of controls (u_k)_{k∈N} converges with respect to some user-defined topology (as we will see shortly, convergence is always achieved in some specific sense by at least one subsequence of (u_k)_{k∈N}; in Section VI, we show how to numerically check the convergence of the sequence (u_k)_{k∈N}). Whenever such convergence is achieved (in some specific sense; see the next section), we may claim Algorithm 1 has found a candidate locally-optimal solution for OCP (see Theorem 3.2 in the next section). The reason that only the convergence of the sequence of controls suffices to claim success is contained in our convergence result (see Theorem 3.2 in the next section). To measure the convergence of (u_k)_{k∈N}, some topologies are better than others, and in particular, under mild assumptions one can prove that, up to some subsequence, (u_k)_{k∈N} always converges with respect to the weak topology of L^2. In turn, this may be interpreted as a result of weak existence of successful trajectories for Algorithm 1 when selecting the L^2-weak topology as convergence metric. In practice, Algorithm 1 is numerically applied to time-discretized versions of each subproblem LOCP^Δ_k. Thus we will show that our conclusions regarding convergence behavior still hold in a discrete context, up to discretization errors (see the next section).
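On a time grid, the L^2 distance between successive control iterates, used as the numerical stopping test above, can be approximated as in the following sketch; the grid `t` and the stacked control arrays are illustrative assumptions.

```python
import numpy as np

def l2_control_distance(t, u_prev, u_new):
    """Approximate ||u_new - u_prev||_{L^2([0,T];R^m)} on the grid t (arrays of shape len(t) x m)."""
    sq = np.sum((u_new - u_prev) ** 2, axis=1)   # pointwise squared Euclidean norm
    return np.sqrt(np.trapz(sq, t))              # trapezoidal quadrature of the time integral

# Usage: stop the SCP iterations once l2_control_distance(t, u_k, u_kplus1) < tol.
```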


C. Convergence Analysis

We now turn to the convergence of Algorithm 1. Under mild assumptions, our analysis provides three key results:

R1 When the sequence (u_k)_{k∈N} returned by Algorithm 1 converges, the limit is a stationary point for OCP in the sense of the Pontryagin Maximum Principle (PMP).
R2 There always exists a subsequence of (u_k)_{k∈N} that converges to a stationary point of OCP for the weak topology of L^2.
R3 This converging behavior transfers to time-discretizations of Algorithm 1, i.e., versions for which we adopt time-discretizations of the subproblems LOCP^Δ_k.

Result R1 is the core of our analysis and roughly states that whenever Algorithm 1 achieves convergence, a candidate locally-optimal solution to the original problem has been found. For the proof of this result, we build upon the PMP.

Before focusing on the convergence result, we recall the statement of the PMP and list its main assumptions. For every p ∈ R^n and p^0 ∈ R, define the Hamiltonian (related to OCP)

$$
H(s,x,p,p^0,u) = p^\top f(s,x,u) + p^0 f^0(s,x,u).
$$
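For the control-affine data of OCP, substituting the definitions of f and f^0 from Section II gives the worked expansion below (written only to fix ideas; the symbol H(s,x) inside the parentheses is the convex-in-state cost term of Section II, not the Hamiltonian):

$$
H(s,x,p,p^0,u) = p^\top\Big(f_0(s,x) + \sum_{i=1}^m u^i f_i(s,x)\Big) + p^0\Big(G(s,u) + H(s,x) + L^0(s,x) + \sum_{i=1}^m u^i L^i(s,x)\Big).
$$

Note that, since p^0 ≤ 0 and the control enters f^0 only through the convex term G and through linear terms, the maximization over v ∈ U appearing in the maximality condition below is the maximization of a concave function over a convex compact set.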

Theorem 3.1 (Pontryagin Maximum Principle): Let (t_f^*, x^*, u^*) be a locally-optimal solution to OCP. There exist an absolutely-continuous function p : [0,T] → R^n and a constant p^0 ≤ 0 such that the following hold:

• Non-Triviality Condition: (p, p^0) ≠ 0.
• Adjoint Equation: Almost everywhere in [0, t_f^*],
$$
\dot{p}(s) = -\frac{\partial H}{\partial x}(s, x^*(s), p(s), p^0, u^*(s)).
$$
• Maximality Condition: Almost everywhere in [0, t_f^*],
$$
H(s, x^*(s), p(s), p^0, u^*(s)) = \max_{v\in U} H(s, x^*(s), p(s), p^0, v).
$$
• Transversality Conditions: It holds that
$$
p(t_f^*) \perp \ker \frac{\partial g}{\partial x}(x^*(t_f^*)),
$$
and if the final time is free,
$$
\max_{v\in U} H(t_f^*, x^*(t_f^*), p(t_f^*), p^0, v) = 0 \ \text{ if } t_f^* < T, \qquad \max_{v\in U} H(t_f^*, x^*(t_f^*), p(t_f^*), p^0, v) \ge 0 \ \text{ if } t_f^* = T.
$$

The tuple (t_f^*, x^*, p, p^0, u^*) is called a (Pontryagin) extremal. An extremal is called normal if p^0 ≠ 0.

The previous theorem states the PMP for the formulation OCP only. However, our theoretical analysis requires us to work with the PMP related to the family of subproblems without state constraints, namely (LOCP_k)_{k∈N}. The statement of the PMP readily adapts to those subproblems by assuming the following regularity condition:

(A2) In the case of free final time, for every k ∈ N, any optimal control u_k to LOCP_k is continuous at the optimal final time t_f^{k+1} of LOCP_{k+1}.

This assumption is arguably very mild. To see this, we may proceed by induction, using [27, Theorem 3.2] to show that, for a large class of running cost functions f^0 (which for instance includes minimal-effort-seeking cost functions, which are widely used in aerospace and robotics), any optimal control u_k of LOCP_k is Lipschitz continuous as soon as the related extremals are normal, and, thanks to [28, Proposition 2.12] together with our discussion about the feasibility of LOCP^Δ_k in Section III-A, the extremals of LOCP_k are almost always normal for a very large class of running cost functions f^0. In conclusion, (A2) is a mild and well-justified assumption.

Molding the PMP for the subproblems LOCP^Δ_k requires introducing more technical tools due to the state constraints (1). Nevertheless, under specific (though, importantly, fairly mild) requirements, extremals for LOCP^Δ_k coincide with extremals for LOCP_k (see, e.g., [29] and the proof in Section III-D).

Assumptions (A1) and (A2) suffice to obtain result R1 (see Theorem 3.2 below). To prove result R2, more regularity on the solutions to the convex subproblems is required. Specifically, we introduce the following technical condition:

(A3) There exists a finite subset D ⊆ R_+ such that, for every k ∈ N, any time-discontinuity of any optimal control u_{k+1} to LOCP^Δ_{k+1} lies within D.

This assumption can be weakened by requiring that any Lebesgue point of any optimal control u_{k+1} to LOCP^Δ_{k+1} lies within D (see the proof of Theorem 3.2 below), although we do not assume this in the following. We can leverage the same exact argument used for (A2) to show that (A3) is a mild and well-justified assumption for a large class of mappings f^0.

Our main convergence result reads as follows.

Theorem 3.2 (Guarantees of convergence for SCP): Assume that (A1) and (A2) hold and that Algorithm 1 returns a sequence (Δ_k, t_f^k, u_k, x_k)_{k∈N} such that, for every k ∈ N, the tuple (t_f^{k+1}, u_{k+1}, x_{k+1}) locally solves LOCP^Δ_{k+1} with

$$
|t_f^{k+1} - t_f^k| < \Delta_{k+1}, \qquad \int_0^T \|x_{k+1}(s) - x_k(s)\|^2\, ds < \Delta_{k+1}, \qquad (2)
$$

that is, the trust-region constraints are satisfied strictly.

1) Assume that the sequence of final times (t_f^k)_{k∈N} converges to t_f^* ∈ (0,T], and the sequence of controls (u_k)_{k∈N} converges to u^* ∈ 𝒰 for the strong topology of L^2. Let x^* : [0,T] → R^n denote the solution to the dynamics of OCP associated with the control u^*. Then:
   a) There exists a tuple (p, p^0) such that (t_f^*, x^*, p, p^0, u^*) is an extremal for OCP.
   b) There exists a sequence (p_k, p_k^0)_{k∈N} such that (t_f^k, x_k, p_k, p_k^0, u_k) is an extremal for LOCP_k (and also for LOCP^Δ_k due to (2), see, e.g., [29]), and the following convergence results hold:
      • (x_k)_{k∈N} converges to x^* for the strong topology of C^0.
      • Up to some subsequence, (p_k)_{k∈N} converges to p for the strong topology of C^0, and (p_k^0)_{k∈N} converges to p^0.
2) Assume that the sequence of final times (t_f^k)_{k∈N} converges to t_f^* ∈ (0,T] and the sequence of controls (u_k)_{k∈N} converges to u^* ∈ 𝒰 for the weak topology of L^2. If (A3) holds, then the statements in 1.a–1.b above remain true. In addition, there always exist a subsequence (t_f^{k_j})_{j∈N} ⊆ (t_f^k)_{k∈N} that converges to some t_f^* ∈ (0,T] and a subsequence (u_{k_j})_{j∈N} ⊆ (u_k)_{k∈N} that
converges to some u^* ∈ 𝒰 for the weak topology of L^2, such that the statements in 1.a–1.b above are true.

The guarantees offered by Theorem 3.2 read as follows. Under (A1) and (A2) and by selecting a shrinking-to-zero sequence of trust-region radii, if iteratively solving the problems LOCP^Δ_k returns a sequence of solutions that satisfy (2) (note that (2) needs to hold starting from some large enough iteration only) and whose controls converge with respect to the strong topology of L^2, then there exists a Pontryagin extremal for the original problem, i.e., a candidate (local) solution to OCP, which formalizes result R1. Moreover, under the additional assumption that the generated sequence of controls has a finite number of time-discontinuities, such a converging sequence of controls always exists, which formalizes result R2. This can be clearly interpreted as a "weak" guarantee of success for SCP, where "weak" refers to the fact that only a subsequence of the sequence of control strategies converges.

Remark 3.2: When SCP achieves convergence, (A3) is not needed for the derivation of theoretical guarantees on local optimality. Moreover, the condition |t_f^{k+1} − t_f^k| < Δ_{k+1} in (2) is required only if the final time is free. When (2) is fulfilled throughout the SCP iterations, theoretical convergence can be analyzed by replacing each LOCP^Δ_k with LOCP_k, allowing one to rely on just the classical version of the PMP, i.e., Theorem 3.1. However, it is worth pointing out that we leverage (2) for the sake of clarity and conciseness in the exposition, and that Theorem 3.2 can be proved even if we remove (2). Indeed, since trajectories are considered under the integral sign, the trust-region constraints |t_f^{k+1} − t_f^k| ≤ Δ_{k+1} and ∫_0^T ‖x_{k+1}(s) − x_k(s)‖^2 ds ≤ Δ_{k+1} play a similar role as the final constraints g(x(t_f)) = 0. Thus, they can be combined with the final constraints so that no additional Lagrange multipliers appear in the development of necessary conditions for optimality (the reader might gain a better understanding of this claim by analyzing our computations in Section III-D), although such a development requires more extensive computations which we do not report due to lack of space. It is worth mentioning that (2) is usually satisfied in practice and can be checked numerically; see Section VI.

Remark 3.3: These guarantees adapt when time discretization is adopted to numerically solve each convex subproblem, which is the most frequently used and reliable technique in practice. To see this, fix a time-discretization scheme and consider the discretized version of OCP. Any candidate locally-optimal solution to this discrete formulation satisfies the Karush-Kuhn-Tucker (KKT) conditions. If a Pontryagin extremal of OCP exists, the limit of points satisfying the KKT conditions for the discretized version of OCP as the time step tends to zero converges to the aforementioned Pontryagin extremal of OCP (more precisely, up to some subsequence; the reader can find more details in [30]). Theorem 3.2 exactly provides conditions under which the "if sentence" above holds true, that is, conditions under which the aforementioned Pontryagin extremal of OCP exists, thus endowing Algorithm 1 with correctness guarantees that are independent of any time discretization the user may select (Euler, Runge-Kutta, etc.).
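As an illustration of Remark 3.3, a forward-Euler time discretization turns the linearized dynamics of a subproblem into affine equality constraints of a finite-dimensional convex program; the sketch below only builds those constraints from precomputed linearization data (the sequences `b_seq`, `B_seq`, `A_seq` are assumed inputs, e.g., as produced by the illustrative helper of Section III-A).

```python
import numpy as np

def euler_dynamics_constraints(dt, xk_grid, b_seq, B_seq, A_seq):
    """Forward-Euler discretization of the LOCP dynamics:
       x_{j+1} = x_j + dt * ( b_j + B_j u_j + A_j (x_j - xk_j) ),
    returned as affine maps x_{j+1} = c_j + M_j x_j + N_j u_j for a convex program."""
    constraints = []
    n = A_seq[0].shape[0]
    for j in range(len(A_seq)):
        M = np.eye(n) + dt * A_seq[j]
        N = dt * B_seq[j]
        c = dt * (b_seq[j] - A_seq[j] @ xk_grid[j])
        constraints.append((c, M, N))
    return constraints
```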

D. Proof of the Convergence Result

We split the proof of Theorem 3.2 in three main steps. First, we retrace the main steps of the proof of the PMP to introduce necessary notation and expressions. Second, we show the convergence of trajectories and controls, together with the convergence of variational inequalities (see Section III-D.3 for a definition). The latter represents the cornerstone of the proof and paves the way for the third and final step, which consists of proving the convergence of the Pontryagin extremals. For the sake of clarity and conciseness, we carry out the proof for free-final-time problems only, the other case being easier (indeed, in this case we need to develop the same computations but without considering the transversality conditions on the final time; see also [31, Section 3]).

1) Pontryagin Variations: Let u ∈ 𝒰 be a feasible control for OCP, with associated trajectory x_u in [0,T] and final time t_f^u ∈ (0,T] (recall that t_f^u ≠ 0 due to g(x_0) ≠ 0). We may assume that t_f^u is a Lebesgue point of u. Otherwise, one may proceed similarly by adopting limiting cones, as done in [32, Section 7.3]. For every r ∈ [0, t_f^u] Lebesgue point of u, and v ∈ U, we define

$$
\xi^{r,v}_u \triangleq
\begin{pmatrix}
f(r, x_u(r), v) - f(r, x_u(r), u(r)) \\
f^0(r, x_u(r), v) - f^0(r, x_u(r), u(r))
\end{pmatrix}
\in \mathbb{R}^{n+1}. \qquad (3)
$$

The variation trajectory z^{r,v}_u : [0,T] → R^{n+1} related to r ∈ [0, t_f^u], to v ∈ U, and to the feasible control u ∈ 𝒰 for OCP is defined to be the unique (global in [0,T]) solution to the following system of linear differential equations

$$
\dot{z}(s)^\top = z(s)^\top
\begin{pmatrix}
\dfrac{\partial f}{\partial x}(s, x_u(s), u(s)) & 0 \\[4pt]
\dfrac{\partial f^0}{\partial x}(s, x_u(s), u(s)) & 0
\end{pmatrix},
\qquad z(r) = \xi^{r,v}_u. \qquad (4)
$$

The proof of the PMP goes by contradiction, considering Pontryagin variations (see, e.g., [23]). We define those to be all the vectors z^{r,v}_u(t_f^u), where r ∈ (0, t_f^u) is a Lebesgue point of u and v ∈ U. In particular, if (t_f^u, x_u, u) is locally optimal for OCP, then one infers the existence of a nontrivial tuple (p, p^0) ∈ R^{ℓ_g+1}, with p^0 ≤ 0, satisfying, for all r ∈ (0, t_f^u) Lebesgue points of u and all v ∈ U,

$$
\Big(p\,\frac{\partial g}{\partial x}(x_u(t_f^u)),\ p^0\Big)\cdot z^{r,v}_u(t_f^u) \le 0,
$$
$$
\max_{v\in U} H\Big(t_f^u, x_u(t_f^u), p\,\frac{\partial g}{\partial x}(x_u(t_f^u)), p^0, v\Big) = 0 \ \text{ if } t_f^u < T,\qquad
\max_{v\in U} H\Big(t_f^u, x_u(t_f^u), p\,\frac{\partial g}{\partial x}(x_u(t_f^u)), p^0, v\Big) \ge 0 \ \text{ if } t_f^u = T. \qquad (5)
$$

The non-triviality condition, the adjoint equation, the maximality condition, and the transversality conditions listed in Theorem 3.1 derive from (5). Specifically, it can be shown that a tuple (t_f^u, x_u, p, p^0, u) is a Pontryagin extremal for OCP if and only if the nontrivial tuple (p(t_f^u) = p ∂g/∂x(x_u(t_f^u)), p^0) ∈ R^{n+1}, with p^0 ≤ 0, satisfies (5) (see, e.g., [23]). For this reason, (t_f^u, x_u, p, p^0, u) is also called an extremal for OCP. Under the regularity assumption (A2), the previous conclusions adapt to each subproblem built in Algorithm 1.
Specifically, for every k ∈ N, let (t_f^{k+1}, x_{k+1}, u_{k+1}) denote a solution to LOCP^Δ_{k+1}, with related trust-region radius Δ_{k+1}. Since (2) holds, (t_f^{k+1}, x_{k+1}, u_{k+1}) is locally optimal for LOCP_{k+1}. Note that each trajectory x_k is defined on the entire interval [0,T] thanks to Lemma 3.1. Fix k ∈ N, and for every r ∈ [0, t_f^{k+1}] Lebesgue point of u_{k+1} and every v ∈ U define

$$
\xi^{r,v}_{k+1} =
\begin{pmatrix}
f_{k+1}(r, x_{k+1}(r), v) - f_{k+1}(r, x_{k+1}(r), u_{k+1}(r)) \\
f^0_{k+1}(r, x_{k+1}(r), v) - f^0_{k+1}(r, x_{k+1}(r), u_{k+1}(r))
\end{pmatrix}. \qquad (6)
$$

Straightforward computations show that the control u_k does not explicitly appear within expression (6). Thus the time r ∈ [0, t_f^{k+1}] needs to be a Lebesgue point of u_{k+1} only. We define the variation trajectory z^{r,v}_{k+1} : [0,T] → R^{n+1} related to r ∈ [0, t_f^{k+1}], to v ∈ U, and to the locally-optimal control u_{k+1} for LOCP_{k+1} to be the unique (global in [0,T]) solution to the following system of linear differential equations

$$
\dot{z}(s)^\top = z(s)^\top
\begin{pmatrix}
\dfrac{\partial f_{k+1}}{\partial x}(s, x_{k+1}(s), u_{k+1}(s)) & 0 \\[4pt]
\dfrac{\partial f^0_{k+1}}{\partial x}(s, x_{k+1}(s), u_{k+1}(s)) & 0
\end{pmatrix},
\qquad z(r) = \xi^{r,v}_{k+1}. \qquad (7)
$$

The Pontryagin variations related to LOCP_{k+1} are all the vectors z^{r,v}_{k+1}(t_f^{k+1}), where r ∈ (0, t_f^{k+1}) is a Lebesgue point of u_{k+1} and v ∈ U. From (A2) and the local optimality of (t_f^{k+1}, x_{k+1}, u_{k+1}) for LOCP_{k+1}, we infer the existence of a nontrivial tuple (p_{k+1}, p^0_{k+1}) ∈ R^{ℓ_g+1}, with p^0_{k+1} ≤ 0, satisfying, for r ∈ (0, t_f^{k+1}) (Lebesgue for u_{k+1}) and v ∈ U,

$$
\Big(p_{k+1}\,\frac{\partial g}{\partial x}(x_{k+1}(t_f^{k+1})),\ p^0_{k+1}\Big)\cdot z^{r,v}_{k+1}(t_f^{k+1}) \le 0,
$$
$$
\max_{v\in U} H_{k+1}\Big(t_f^{k+1}, x_{k+1}(t_f^{k+1}), p_{k+1}\,\frac{\partial g}{\partial x}(x_{k+1}(t_f^{k+1})), p^0_{k+1}, v\Big) = 0 \ \text{ if } t_f^{k+1} < T,
$$
$$
\max_{v\in U} H_{k+1}\Big(t_f^{k+1}, x_{k+1}(t_f^{k+1}), p_{k+1}\,\frac{\partial g}{\partial x}(x_{k+1}(t_f^{k+1})), p^0_{k+1}, v\Big) \ge 0 \ \text{ if } t_f^{k+1} = T, \qquad (8)
$$

where H_{k+1}(s,x,p,p^0,u) ≜ p^⊤ f_{k+1}(s,x,u) + p^0 f^0_{k+1}(s,x,u) is the Hamiltonian related to LOCP_{k+1} (the regularity assumption (A2) plays a key role in recovering the transversality condition in (8); see [32, Section 7.3]). Again, the non-triviality condition, the adjoint equation, the maximality condition, and the transversality conditions related to LOCP_{k+1} derive from algebraic manipulations on (8).

The main step in the proof of Theorem 3.2 consists of showing that it is possible to pass the limit k → ∞ inside (8), recovering a nontrivial tuple (p, p^0) ∈ R^{ℓ_g+1} with p^0 ≤ 0 that satisfies (5). Due to the equivalence between the conditions of the PMP and (5), this is sufficient to prove the existence of a Pontryagin extremal for OCP. We will show that this also implies the convergences stated in Theorem 3.2. We will only focus on proving the last part of 2) in Theorem 3.2, by adopting the additional assumption (A3), since the proofs of the remaining cases are similar and easier to construct.

2) Convergence of Controls and Trajectories: Consider the sequence of final times (t_f^k)_{k∈N}. This sequence is bounded by T > 0, and we can extract a subsequence (still denoted (t_f^k)_{k∈N}) that converges to some time t_f^* ∈ [0,T]. On the other hand, by (A1) (in particular, by the compactness of U), the sequence (u_k)_{k∈N} ⊆ L^2([0,T];U) is uniformly bounded in L^2([0,T];R^m). Since L^2([0,T];U) is closed and convex in L^2([0,T];R^m) (because U is compact and convex) and L^2([0,T];R^m) is reflexive, there exists a control u^* ∈ L^2([0,T];U) (in particular u^* ∈ 𝒰) such that we can extract a subsequence (still denoted (u_k)_{k∈N}) that converges to u^* for the weak topology of L^2. We denote by x^* the trajectory solution to the dynamics of OCP related to the control u^*, which is defined on the entire interval [0,T] (recall that x^* is a bounded curve thanks to the assumptions on f_i, i = 0,...,m).

Next, recalling that thanks to Lemma 3.1 the trajectories x_k are defined on the entire interval [0,T] and uniformly bounded, by leveraging (2) we show that

$$
\sup_{s\in[0,T]} \|x_k(s) - x^*(s)\| \longrightarrow 0 \qquad (9)
$$

for k → ∞. This will provide the desired convergence of trajectories. Let us denote δ^x_{k+1} ≜ ∫_0^T ‖x_{k+1}(s) − x_k(s)‖^2 ds, for k ∈ N. For t ∈ [0,T] we have that

$$
\begin{aligned}
\|x_{k+1}(t) - x^*(t)\| \le{}& \int_0^t \|f_0(s, x_k(s)) - f_0(s, x^*(s))\|\, ds
+ \sum_{i=1}^m \left\| \int_0^t \big( u^i_{k+1}(s) f_i(s, x_k(s)) - (u^*)^i(s) f_i(s, x^*(s)) \big)\, ds \right\| \\
&+ \int_0^t \left\| \frac{\partial f_0}{\partial x}(s, x_k(s)) \right\| \|x_{k+1}(s) - x_k(s)\|\, ds
+ \sum_{i=1}^m \int_0^t \left\| u^i_k(s) \frac{\partial f_i}{\partial x}(s, x_k(s)) \right\| \|x_{k+1}(s) - x_k(s)\|\, ds \\
\le{}& C_1 \Bigg( \int_0^t \|x_{k+1}(s) - x^*(s)\|\, ds + \delta^x_{k+1} + \underbrace{\sum_{i=1}^m \left\| \int_0^t f_i(s, x^*(s)) \big( u^i_{k+1}(s) - (u^*)^i(s) \big)\, ds \right\|}_{\triangleq\, \delta^{u,1}_{k+1}(t)} \Bigg)
\end{aligned}
$$

where C_1 ≥ 0 is a constant that stems from the uniform boundedness of (x_k)_{k∈N} (see also the proof of Lemma 3.1 in the Appendix). Now, the definition of weak convergence in L^2 gives that, for every fixed t ∈ [0,T], δ^{u,1}_{k+1}(t) → 0 for k → ∞. In addition, by the compactness of U and supp f_i, there exists a constant C_2 ≥ 0 such that, for every t, s ∈ [0,T],

$$
|\delta^{u,1}_{k+1}(t) - \delta^{u,1}_{k+1}(s)| \le C_2 |t - s|
$$

uniformly with respect to k ∈ N. Thus, by [33, Lemma 3.4], δ^{u,1}_{k+1}(t) → 0 for k → ∞ uniformly in the interval [0,T]. Finally, since δ^x_{k+1} → 0 for k → ∞ by (1) and Δ_k → 0, we conclude by a routine Gronwall inequality argument (see also the proof of Lemma 3.1 in the Appendix).

Let us prove that the trajectory x^* : [0,T] → R^n is feasible for OCP. To do so, we only need to prove that g(x^*(t_f^*)) = 0. Note that from (1), Δ_k → 0, and the boundedness and
convergence of the trajectories, we have that

$$
\|g(x^*(t_f^*))\| \le \|g(x^*(t_f^*)) - g(x_k(t_f^k))\| + \left\| \frac{\partial g}{\partial x}(x_k(t_f^k)) \right\| \|x_{k+1}(t_f^{k+1}) - x_k(t_f^k)\| \longrightarrow 0,
$$

and we conclude that x^* : [0,T] → R^n is feasible for OCP. In particular, since g(x_0) ≠ 0, we must also have that t_f^* ≠ 0, and therefore t_f^* ∈ (0,T].

3) Convergence of Pontryagin Variations: Due to the convergence of controls and trajectories, we can now prove that it is possible to pass the limit k → ∞ inside (8), showing that (5) holds. Specifically, we first recall a convergence result whose proof comes from a straightforward adaptation of [31, Lemma 3.11], whereby the continuity of the controls is replaced by the weaker assumption (A3).

Lemma 3.2 (Pointwise convergence of controls): Under (A3), for every r ∈ (0, t_f^*) Lebesgue point of u^*, there exists (r_k)_{k∈N} ⊆ (0, t_f^*) such that r_k is a Lebesgue point of u_k and r_k → r, u_k(r_k) → u^*(r) for k → ∞.

Now, fix r ∈ (0, t_f^*) Lebesgue point of u^*, and v ∈ U, and let (r_k)_{k∈N} be the sequence provided by Lemma 3.2 related to r and v. We prove the following convergence:

$$
\sup_{s\in[r,T]} \|z^{r_{k+1},v}_{k+1}(s) - z^{r,v}_{u^*}(s)\| \longrightarrow 0 \qquad (10)
$$

for k → ∞, where z^{r_{k+1},v}_{k+1} solves (7) with initial condition z^{r_{k+1},v}_{k+1}(r_{k+1}) = ξ^{r_{k+1},v}_{k+1} given by (6), whereas z^{r,v}_{u^*} solves (4) with initial condition z^{r,v}_{u^*}(r) = ξ^{r,v}_{u^*} given by (3). First,

$$
\begin{aligned}
\|\xi^{r_{k+1},v}_{k+1} - \xi^{r,v}_{u^*}\| \le{}& \sum_{i=1}^m |v^i|\,\|f_i(r_{k+1}, x_{k+1}(r_{k+1})) - f_i(r, x^*(r))\|
+ \sum_{i=1}^m \|u^i_{k+1}(r_{k+1}) f_i(r_{k+1}, x_{k+1}(r_{k+1})) - (u^*)^i(r) f_i(r, x^*(r))\| \\
&+ \|G(r_{k+1}, v) - G(r, v)\| + \|G(r_{k+1}, u_{k+1}(r_{k+1})) - G(r, u^*(r))\| \\
&+ \sum_{i=1}^m |v^i|\,\|L^i(r_{k+1}, x_{k+1}(r_{k+1})) - L^i(r, x^*(r))\|
+ \sum_{i=1}^m \|u^i_{k+1}(r_{k+1}) L^i(r_{k+1}, x_{k+1}(r_{k+1})) - (u^*)^i(r) L^i(r, x^*(r))\| \\
\le{}& C_3 \big( |r_{k+1} - r| + \|x_{k+1}(r_{k+1}) - x^*(r)\| + \|u_{k+1}(r_{k+1}) - u^*(r)\| \big)
\end{aligned}
$$

where C_3 ≥ 0 is a constant, and from Lemma 3.2 and (9) we infer that ‖ξ^{r_{k+1},v}_{k+1} − ξ^{r,v}_{u^*}‖ → 0 for k → ∞. Second, by leveraging the uniform boundedness of the trajectories, with the same exact argument proposed in the proof of Lemma 3.1 in the Appendix, one may show that the sequence of variation trajectories (z^{r_k,v}_k)_{k∈N} is uniformly bounded in the time interval [r,T]. From this, for every t ∈ [r,T],

$$
\begin{aligned}
\|z^{r_{k+1},v}_{k+1}(t) - z^{r,v}_{u^*}(t)\| \le{}& \|\xi^{r_{k+1},v}_{k+1} - \xi^{r,v}_{u^*}\| + C_4 |r_{k+1} - r| \\
&+ \int_r^t \left( \left\|\frac{\partial f_{k+1}}{\partial x}(s, x_{k+1}(s), u_{k+1}(s))\right\| + \left\|\frac{\partial f^0_{k+1}}{\partial x}(s, x_{k+1}(s), u_{k+1}(s))\right\| \right) \|z^{r_{k+1},v}_{k+1}(s) - z^{r,v}_{u^*}(s)\|\, ds \\
&+ \left\| \int_r^t z^{r,v}_{u^*}(s)^\top \left( \begin{pmatrix} \frac{\partial f_{k+1}}{\partial x}(s, x_{k+1}(s), u_{k+1}(s)) & 0 \\ \frac{\partial f^0_{k+1}}{\partial x}(s, x_{k+1}(s), u_{k+1}(s)) & 0 \end{pmatrix} - \begin{pmatrix} \frac{\partial f}{\partial x}(s, x^*(s), u^*(s)) & 0 \\ \frac{\partial f^0_\omega}{\partial x}(s, x^*(s), u^*(s)) & 0 \end{pmatrix} \right) ds \right\| \\
\le{}& \|\xi^{r_{k+1},v}_{k+1} - \xi^{r,v}_{u^*}\| + C_4 \bigg( |r_{k+1} - r| + |\omega_{k+1} - \omega_{\max}| + \int_r^t \|z^{r_{k+1},v}_{k+1}(s) - z^{r,v}_{u^*}(s)\|\, ds \\
&\qquad + \int_r^t \|x_k(s) - x^*(s)\|\, ds + \int_r^t \|x_{k+1}(s) - x^*(s)\|\, ds \bigg) + \underbrace{\sum_{i=1}^m \left\| \int_r^t F(f_i, L^i, z^{r,v}_{u^*})(s) \big( u^i_k(s) - (u^*)^i(s) \big)\, ds \right\|}_{\triangleq\, \delta^{u,2}_{k+1}(t)},
\end{aligned}
$$

where the (overloaded) constant C_4 ≥ 0 comes from the uniform boundedness of both (x_k)_{k∈N} and (z^{r_k,v}_k)_{k∈N} previously stated. In particular, we introduce the terms F(f_i, L^i, z^{r,v}_{u^*}) : [r,T] → R, which are continuous and uniformly bounded mappings depending on f_i, L^i, and z^{r,v}_{u^*}. Following the exact same argument we developed for δ^{u,1}_{k+1}, one proves that δ^{u,2}_{k+1}(t) → 0 for k → ∞, uniformly in the interval [r,T], so that (9) and a routine Gronwall inequality argument allow us to obtain (10).

Importantly, convergence (10) implies that, for k → ∞,

$$
\|z^{r_k,v}_k(t_f^k) - z^{r,v}_{u^*}(t_f^*)\| \longrightarrow 0. \qquad (11)
$$

4) Convergence of Extremals and Conclusion: At this step, consider the sequence of tuples (p_k, p^0_k)_{k∈N}, with p^0_k ≤ 0 for every k ∈ N. It is clear that the variational expressions (8) remain valid whenever (p_k, p^0_k) is multiplied by some positive constant. Therefore, without loss of generality, we may assume that ‖(p_k, p^0_k)‖ = 1 and p^0_k ≤ 0 for every k ∈ N. Then, we can extract a subsequence (still denoted (p_k, p^0_k)_{k∈N}) that converges to some nontrivial tuple (p, p^0) satisfying p^0 ≤ 0. We may leverage (9) and (11) to prove that (t_f^*, x^*, p, p^0, u^*) is the sought-after extremal for OCP. Indeed, it can be readily checked that (A1) and the previous convergences imply

$$
\max_{v\in U} H_{k+1}\Big(t_f^{k+1}, x_{k+1}(t_f^{k+1}), p_{k+1}\,\frac{\partial g}{\partial x}(x_{k+1}(t_f^{k+1})), p^0_{k+1}, v\Big) \longrightarrow \max_{v\in U} H\Big(t_f^*, x^*(t_f^*), p\,\frac{\partial g}{\partial x}(x^*(t_f^*)), p^0, v\Big)
$$

for k → ∞, so that we infer the transversality condition of (5) from the transversality condition of (8). Moreover, for every r ∈ (0, t_f^*) Lebesgue point of u^* and v ∈ U, by (9) and (11) we
have that, for k → ∞,

$$
\Big(p\,\frac{\partial g}{\partial x}(x^*(t_f^*)),\ p^0\Big)\cdot z^{r,v}_{u^*}(t_f^*) \le \left| \Big(p\,\frac{\partial g}{\partial x}(x^*(t_f^*)),\ p^0\Big)\cdot z^{r,v}_{u^*}(t_f^*) - \Big(p_k\,\frac{\partial g}{\partial x}(x_k(t_f^k)),\ p^0_k\Big)\cdot z^{r_k,v}_k(t_f^k) \right| \longrightarrow 0
$$

due to the inequality of (8), and we conclude. The proof of Theorem 3.2 is achieved if we show that

$$
\sup_{s\in[0,t_f^*]} \|p_k(s) - p(s)\| \longrightarrow 0 \qquad (12)
$$

for k → ∞, where p_{k+1} solves

$$
\dot{p}_{k+1}(s) = -\frac{\partial H_{k+1}}{\partial x}(s, x_{k+1}(s), p_{k+1}(s), p^0_{k+1}, u_{k+1}(s)), \qquad p_{k+1}(t_f^{k+1}) = p_{k+1}\,\frac{\partial g}{\partial x}(x_{k+1}(t_f^{k+1})),
$$

whereas p solves

$$
\dot{p}(s) = -\frac{\partial H}{\partial x}(s, x^*(s), p(s), p^0, u^*(s)), \qquad p(t_f^*) = p\,\frac{\partial g}{\partial x}(x^*(t_f^*)).
$$

To this end, by leveraging the uniform boundedness of the trajectories, with the same exact argument proposed in the proof of Lemma 3.1 (see the Appendix), one shows that the sequence (p_k)_{k∈N} is uniformly bounded in the interval [0,T]. From this, for every t ∈ [0, t_f^*] we have that

$$
\begin{aligned}
\|p_{k+1}(t) - p(t)\| \le{}& \left\| p_{k+1}\frac{\partial g}{\partial x}(x_{k+1}(t_f^{k+1})) - p\,\frac{\partial g}{\partial x}(x^*(t_f^*)) \right\| \\
&+ C_5 \bigg( |t_f^{k+1} - t_f^*| + |p^0_{k+1} - p^0| + \int_t^{t_f^*} \|p_{k+1}(s) - p(s)\|\, ds + \int_t^{t_f^*} \|x_k(s) - x^*(s)\|\, ds + \int_t^{t_f^*} \|x_{k+1}(s) - x^*(s)\|\, ds \bigg) \\
&+ \underbrace{\sum_{i=1}^m \left\| \int_t^{t_f^*} F(f_i, L^i, p, p^0)(s) \big( u^i_k(s) - (u^*)^i(s) \big)\, ds \right\|}_{\triangleq\, \delta^{u,3}_{k+1}(t)}
\end{aligned}
$$

where the constant C_5 ≥ 0 comes from the uniform boundedness of both (x_k)_{k∈N} and (p_k)_{k∈N}, whereas F(f_i, L^i, p, p^0) : [r, t_f^*] → R again denote continuous and uniformly bounded mappings that depend on f_i, L^i, p, and p^0. Following the exact same argument we developed for δ^{u,1}_{k+1}, one proves that δ^{u,3}_{k+1}(t) → 0 for k → ∞, uniformly in the interval [0, t_f^*], so that (9) and a routine Gronwall inequality argument allow us to conclude (see also the Appendix).

IV. SEQUENTIAL CONVEX PROGRAMMING WITH MANIFOLD-TYPE CONSTRAINTS

We now show how the framework described in Section III can be applied verbatim to solve our optimal control problem when additional manifold-type constraints are considered, under mild regularity assumptions on the dynamics. In this context, we focus on problems OCPM defined as:

$$
\min_{t_f\in[0,T],\ u\in\mathcal{U}} \int_0^{t_f} f^0(s,x(s),u(s))\,ds
$$

$$
\dot{x}(s) = f(s,x(s),u(s)),\quad \text{a.e. } s\in[0,T],\qquad x(0) = x_0 \in M,\quad g(x(t_f)) = 0,\qquad x(s)\in M\subseteq\mathbb{R}^n,\ s\in[0,t_f]
$$

where M ⊆ R^n is a smooth d-dimensional submanifold of R^n and, for the sake of consistency, we assume that g^{-1}(0) ∩ M ≠ ∅. Similar to the previous case, it is clear that any solution (t_f^*, x^*, u^*) to OCPM that strictly satisfies the penalized state constraints is also a locally-optimal solution to OCP.

A. Unchanged Framework under Regular Dynamics

One possibility to solve OCPM would consist of penalizing the manifold-type constraints within the cost (see Remark 2.2). Although possible, this approach might add undue complexity to the formulation. Interestingly, in several important cases for applications, this issue can be efficiently avoided. To this end, we assume that the following regularity condition holds:

(A4) For i = 0,...,m, the vector fields f_i : R^{n+1} → R^n are such that f_i(s,x) ∈ T_xM, for every (s,x) ∈ R × M.

In (A4), T_xM denotes the tangent space of M at x ∈ M, which we identify with a d-dimensional subspace of R^n. This requirement is often satisfied when dealing with mechanical systems in aerospace and robotics applications (for instance, consider rotation and/or quaternion-type constraints). Under (A4), as a classical result, the trajectories of ẋ(s) = f(s,x(s),u(s)) starting from x_0 ∈ M lie on the submanifold M, and therefore the condition x(s) ∈ M, s ∈ [0,t_f], is automatically satisfied. In other words, we may remove the manifold-type constraints from problem OCPM so that it exactly resembles OCP, i.e., the formulation adopted in Section III with the additional constraint x_0 ∈ M. At this step, we may leverage the machinery built previously to solve OCP. Specifically, the construction of each subproblem LOCP_k and Algorithm 1 apply unchanged. Due to the linearization of the dynamics, solutions to the convex subproblems are not guaranteed to lie on M. However, convergence does force the limiting trajectory to satisfy the manifold-type constraints.
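To illustrate assumption (A4), the check below verifies numerically that each vector field is tangent to M at sampled points, for the concrete case where M is the unit sphere (e.g., quaternion-type constraints), in which T_xM = {v : x^T v = 0}. The sphere example, the sampling, and the tolerance are our own illustrative choices.

```python
import numpy as np

def tangent_to_sphere(f_list, samples, times, tol=1e-8):
    """Check (A4) for M = {x : ||x|| = 1}: each f_i(s, x) must satisfy x^T f_i(s, x) = 0."""
    for s in times:
        for x in samples:                 # sampled points on the unit sphere
            for fi in f_list:
                if abs(x @ fi(s, x)) > tol:
                    return False
    return True
```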

B. Convergence Analysis

The convergence of Algorithm 1 applied to this new context can be inferred from Theorem 3.2. However, despite the regularity assumption (A4), it is not obvious that the optimality claimed by this result extends to the general geometric setting brought on by manifold-type constraints. Specifically, if Algorithm 1 converges to a trajectory satisfying the assumptions of Theorem 3.2, although such a trajectory meets the manifold-type constraints, the related extremal satisfies the PMP for problems defined in Euclidean space by construction. In other words, a priori the extremal does not carry any information about the geometric structure of a problem with manifold-type constraints, whereas extremals for OCPM are expected
to satisfy stronger, geometrically-consistent necessary conditions for optimality. Specifically, to recover a geometrically-consistent candidate optimal solution for OCPM, we must show that it satisfies the Geometric PMP (GPMP) (see, e.g., [23]), necessary conditions for optimality for OCPM which are stronger than the PMP. This is our next objective.

Before stating the GPMP related to formulation OCPM, we first need to introduce some notation and preliminary results (further details may be found in [23]). We denote by TM and T*M the tangent and cotangent bundles of M, respectively. Due to (A4), the mapping

$$
f_M : \mathbb{R}\times M\times\mathbb{R}^m \to TM : (s,x,u) \mapsto f(s,x,u)
$$

is a well-defined, non-autonomous vector field on M. Thus, trajectories related to feasible solutions (t_f, x, u) for OCPM may be seen as solutions to the geometric dynamical equation

$$
\dot{x}(s) = f_M(s, x(s), u(s)), \qquad x_0 \in M. \qquad (13)
$$

In a geometric setting, given a feasible solution (t_f, x, u) for OCPM, Pontryagin extremals are represented by the quantity (t_f, λ, p^0, u). In particular, the information concerning the trajectory x that satisfies (13) is encapsulated within the cotangent curve λ : [0,t_f] → T*M, i.e., x(s) = π(λ(s)), s ∈ [0,t_f], where π : T*M → M is the canonical projection. At this step, for λ ∈ T*M and p^0 ∈ R, we may define the geometric Hamiltonian (related to OCPM) as

$$
H(s,\lambda,p^0,u) \triangleq \langle \lambda,\ f_M(s,\pi(\lambda),u)\rangle + p^0 f^0(s,\pi(\lambda),u),
$$

where ⟨·,·⟩ denotes the duality pairing in T*M. We remark that whenever M = R^n, we recover the Hamiltonian introduced in Section III. In the geometric framework, adjoint equations are described in terms of Hamiltonian vector fields. Specifically, as a classical result, for every (s,u) ∈ R^{m+1} one can associate to H(s,·,·,u) a unique vector field $\vec{H}(s,\cdot,\cdot,u) : T^*(M\times\mathbb{R}) \to T(T^*(M\times\mathbb{R}))$ of the product cotangent bundle T*(M × R) (known as the Hamiltonian vector field) by the rule

$$
\sigma_{(\lambda,p^0)}\Big(\cdot,\ \vec{H}(s,\lambda,p^0,u)\Big) = \frac{\partial H}{\partial(\lambda,p^0)}(s,\lambda,p^0,u),
$$

with σ being the canonical symplectic form of T*(M × R). We are now ready to state the GPMP related to OCPM.

Theorem 4.1 (Geometric Pontryagin Maximum Principle): Let $(t_f^*, x^*, u^*)$ be a locally-optimal solution to OCP$_M$. There exist an absolutely continuous curve $\lambda : [0, t_f^*] \to T^*M$ (continuity is meant with respect to the standard topology in $T^*M$) with $x^*(s) = \pi(\lambda(s))$, $s \in [0, t_f^*]$, and a constant $p^0 \le 0$ such that the following hold:

• Non-Triviality Condition: $(\lambda, p^0) \ne 0$.

• Adjoint Equation: Almost everywhere in $[0, t_f^*]$,

$\dfrac{d(\lambda, p^0)}{ds}(s) = \vec{H}(s, \lambda(s), p^0, u^*(s)).$

• Maximality Condition: Almost everywhere in $[0, t_f^*]$,

$H(s, \lambda(s), p^0, u^*(s)) = \max_{v \in U} H(s, \lambda(s), p^0, v).$

• Transversality Conditions: It holds that

$\lambda(t_f^*) \perp \ker \dfrac{\partial g_M}{\partial x}(x^*(t_f^*)),$

and if the final time is free,

$\max_{v \in U} H(t_f^*, \lambda(t_f^*), p^0, v) = 0$ if $t_f^* < T$, $\qquad \max_{v \in U} H(t_f^*, \lambda(t_f^*), p^0, v) \ge 0$ if $t_f^* = T$,

where we denote $g_M : M \to \mathbb{R}^{\ell_g} : x \mapsto g(x)$. The tuple $(t_f^*, \lambda, p^0, u^*)$ is called a geometric extremal.

Assuming that Algorithm 1, applied as described above, converges, we prove that the limiting solution is a candidate local optimum for OCP$_M$ by showing that it is possible to appropriately orthogonally project the extremal for OCP provided by Theorem 3.2 to recover a geometric extremal for OCP$_M$. First, we need to introduce the notion of orthogonal projection onto a subbundle. Specifically, given the cotangent bundles $T^*M \subseteq T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$, define $T^*\mathbb{R}^n|_M \triangleq \bigcup_{x \in M} \{x\} \times T_x^*\mathbb{R}^n \cong M \times \mathbb{R}^n$. Equipped with the structure of the pullback bundle given by the canonical projection $T^*\mathbb{R}^n|_M \to M$, $T^*\mathbb{R}^n|_M$ is a vector bundle over $M$ of rank $n$, and $T^*M$ may be identified with a subbundle of $T^*\mathbb{R}^n|_M$. We build an orthogonal projection operator from $T^*\mathbb{R}^{n+1}|_{\mathbb{R} \times M}$ to $T^*(\mathbb{R} \times M)$ by leveraging the usual orthogonal projection in $\mathbb{R}^{n+1}$. To do this, let $x \in M$ and $(V, \varphi) = (V, y^1, \dots, y^n)$ be a local chart of $x$ in $\mathbb{R}^n$ adapted to $M$, i.e., satisfying $\varphi(V \cap M) = \varphi(V) \cap \mathbb{R}^d \times \{0\}^{n-d}$. By construction, $\{dy^j(\cdot)\}_{j=1,\dots,n}$ is a local basis for $T^*\mathbb{R}^n|_M$ and $\{dy^j(\cdot)\}_{j=1,\dots,d}$ is a local basis for $T^*M$ around $x$. Consider the cometric $\langle \cdot, \cdot \rangle_{\mathbb{R}^n}$ in $T^*\mathbb{R}^n|_M$ which is induced by the Euclidean scalar product in $\mathbb{R}^n$. The Gram-Schmidt process applied to $\{dy^j(\cdot)\}_{j=1,\dots,n}$ provides a local orthonormal frame $\{E_j(\cdot)\}_{j=1,\dots,n}$ for $T^*\mathbb{R}^n|_M$ that satisfies, in $V \cap M$,

$\mathrm{span}\langle E_1(\cdot), \dots, E_j(\cdot) \rangle = \mathrm{span}\langle dy^1(\cdot), \dots, dy^j(\cdot) \rangle \quad (14)$

for every $1 \le j \le n$. It follows that, when restricted to $V \cap M$, the orthogonal projection operator

$\Pi : T^*\mathbb{R}^{n+1}|_{\mathbb{R} \times M} \to T^*(\mathbb{R} \times M) \cong \mathbb{R}^2 \times T^*M : (z, x, p^0, p) \mapsto \Big( (z, p^0), \ \sum_{j=1}^{d} \langle p, E_j(x) \rangle_{\mathbb{R}^n} E_j(x) \Big)$

is well-defined and smooth. Moreover, since the change-of-frame mapping between two orthonormal frames is orthogonal, from (14) it is readily checked that $\Pi$ is globally defined. Equipped with the GPMP and orthogonal projections, the numerical strategy to solve OCP$_M$ detailed above becomes meaningful and is justified by the following convergence result (similar to the discussion for Theorem 3.2, the convergences stated therein readily extend to the discretized setting).
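To make the construction above concrete, the following small Julia sketch (ours, not from the paper) applies the Gram-Schmidt process to a basis of covectors whose first $d$ elements span the manifold directions, mirroring (14), and then keeps only the components of a covector $p$ along the first $d$ orthonormal frame elements, as the operator $\Pi$ does on its second component. All names are illustrative.

```julia
using LinearAlgebra

# Gram-Schmidt on a list of basis covectors (represented here as vectors of R^n).
function gram_schmidt(basis::Vector{Vector{Float64}})
    frame = Vector{Vector{Float64}}()
    for v in basis
        w = copy(v)
        for e in frame
            w -= dot(w, e) * e
        end
        push!(frame, w / norm(w))
    end
    return frame
end

# Keep only the components of p along the first d frame elements, cf. Π in the text.
project(p, frame, d) = sum(dot(p, frame[j]) * frame[j] for j in 1:d)

# Example in R^3 with d = 2: the first two basis covectors span the "manifold" directions.
basis = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
frame = gram_schmidt(basis)
p     = [0.3, -1.2, 0.7]
λ_fin = project(p, frame, 2)
```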

Theorem 4.2 (Convergence for SCP with manifold constraints): Assume that (A1), (A2), and (A4) hold. Moreover, assume that applying Algorithm 1 to OCP$_M$ when manifold-type constraints are dropped returns a sequence $(\Delta_k, t_f^k, u_k, x_k)_{k \in \mathbb{N}}$ such that, for every $k \in \mathbb{N}$, the tuple $(t_f^{k+1}, u_{k+1}, x_{k+1})$ locally solves LOCP$^{\Delta_{k+1}}$ with

$|t_f^{k+1} - t_f^k| < \Delta_{k+1}, \qquad \int_0^T \| x_{k+1}(s) - x_k(s) \|^2 \, ds < \Delta_{k+1}.$

Then there exists a tuple $(t_f^*, x^*, p, p^0, u^*)$ that is an extremal for OCP$_M$ when manifold-type constraints are dropped and satisfies all the statements listed in Theorem 3.2 (where the convergence of $(u_k)_{k \in \mathbb{N}}$ for the strong topology of $L^2$ may be replaced by the weak topology of $L^2$ whenever (A3) holds). In addition, the limiting trajectory satisfies $x^*(s) \in M$, $s \in [0, t_f^*]$, and by defining the absolutely continuous curve

$\lambda : [0, t_f^*] \to T^*M : t \mapsto \pi_2\big( \Pi(z^*(t), x^*(t), p^0, p(t)) \big), \quad (15)$

where $\pi_2 : T^*(\mathbb{R} \times M) \to T^*M : ((z, p^0), \xi) \mapsto \xi$ and $z^* : [0, t_f^*] \to \mathbb{R}$ satisfies $\dot{z}^*(s) = f^0(s, x^*(s), u^*(s))$, $z^*(0) = 0$, the tuple $(t_f^*, \lambda, p^0, u^*)$ is a geometric extremal for OCP$_M$.

C. Proof of the Convergence Result

Let $(t_f^*, x^*, p, p^0, u^*)$ be an extremal for OCP$_M$ in the case where manifold-type constraints are dropped, whose existence is guaranteed by Theorem 3.2. To avoid overloading the notation, in the rest of this section we denote $(t_f, x, p, p^0, u) \triangleq (t_f^*, x^*, p, p^0, u^*)$. Because (A4) implies that $x(s) \in M$, $s \in [0, t_f]$, Theorem 4.2 is proved once we show that the tuple $(t_f, \lambda, p^0, u)$, with $\lambda$ built as in (15), satisfies the non-triviality condition, the adjoint equation, the maximality condition, and the transversality conditions of Theorem 4.1. In what follows, we denote $dg_x = \frac{\partial g}{\partial x}(x)$ and $d(g_M)_x = \frac{\partial g_M}{\partial x}(x)$.

1) Adjoint Equation: Before getting started, we introduce some fundamental notation. For every $(t_0, z_0, p_0) \in [0, t_f] \times \mathbb{R}^{n+1}$, the differential equation

$\dot{z}(s) = f^0(s, x(s), u(s)), \qquad \dot{x}(s) = f(s, x(s), u(s)) \quad (16)$

with $z(t_0) = z_0$, $x(t_0) = p_0$ has a unique solution, which may be extended to the whole interval $[0, t_f]$. We denote by $\exp : [0, t_f]^2 \times \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$ the flow of (16), i.e., $\exp(\cdot \, ; t_0, (z_0, p_0))$ solves (16) with initial condition $(z_0, p_0)$ at time $t_0$. As a classical result, for every $(t, t_0) \in [0, t_f]^2$, the mapping $\exp(t; t_0, \cdot) : \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$ is a diffeomorphism. With this notation at hand, one can show that the solution $p$ to the adjoint equation of Theorem 3.2 is such that for $s \in [0, t_f]$,

$(p^0, p(s)) = (\exp(t_f; s, \cdot))^*_{(z,x)(t_f)} \cdot (p^0, p(t_f)), \quad (17)$

where we denote $(z, x)(t) \triangleq \exp(t; 0, (0, x_0))$ and $(\cdot)^*$ denotes the pullback operator of 1-forms in $\mathbb{R}^{n+1}$ (see, e.g., [23]). At this step, to prove that $(\lambda, p^0)$ satisfies the adjoint equation of Theorem 4.1 with $\lambda$ defined in (15) and $(p^0, p)$ satisfying (17), we can leverage classical results from symplectic geometry in the context of Hamiltonian equations (see, e.g., [23]), from which it is sufficient to prove the following lemma:

Lemma 4.1 (Projections of solutions to Hamiltonian systems): For almost every $t \in [0, t_f]$, let $(V, \varphi) = (V, y^0, \dots, y^n)$ be a local chart of $(z, x)(t) \triangleq \exp(t; 0, (0, x_0))$ (which is a point in $\mathbb{R} \times M$ due to (A4)) in $\mathbb{R}^{n+1}$ adapted to $\mathbb{R} \times M$. For every $i = 0, \dots, d$, it holds that

$\dfrac{d}{ds}\Big( \Pi\big( (\exp(t_f; s, \cdot))^*_{(z,x)(t_f)} \cdot (p^0, p(t_f)) \big)\Big( \dfrac{\partial}{\partial y^i}\big((z, x)(s)\big) \Big) \Big)(t) = - \sum_{j=0}^{d} \dfrac{\partial (f^0, f_M)_j}{\partial y^i}(t, x(t), u(t)) \cdot \Pi\big( (\exp(t_f; t, \cdot))^*_{(z,x)(t_f)} \cdot (p^0, p(t_f)) \big)\Big( \dfrac{\partial}{\partial y^j}\big((z, x)(t)\big) \Big),$

where $(\cdot)^*$ denotes the pullback operator of 1-forms in $\mathbb{R}^{n+1}$.

Proof: For indices $i = 0, \dots, n$, we denote

$a_i(t) = \Pi\big( (\exp(t_f; t, \cdot))^*_{(z,x)(t_f)} \cdot (p^0, p(t_f)) \big)\Big( \dfrac{\partial}{\partial y^i}\big((z, x)(t)\big) \Big).$

Since by the definition of the pullback it holds that

$(\exp(t_f; t, \cdot))^*_{(z,x)(t_f)} \cdot (p^0, p(t_f)) = \sum_{j=0}^{n} b_j(t) \, dy^j\big((z, x)(t)\big) \quad (18)$

for appropriate coefficients $b_j(t)$, $j = 0, \dots, n$, from (14) we have

$\Pi\big( (\exp(t_f; t, \cdot))^*_{(z,x)(t_f)} \cdot (p^0, p(t_f)) \big) = \sum_{j=0}^{d} b_j(t) \, dy^j\big((z, x)(t)\big),$

which yields $a_j(t) = b_j(t)$ for every $j = 0, \dots, d$. Therefore, by inverting (18), we obtain

$(p^0, p(t_f)) = \sum_{j=0}^{d} a_j(t) (\exp(t; t_f, \cdot))^*_{(z,x)(t)} \cdot dy^j\big((z, x)(t)\big) + \sum_{j=d+1}^{n} b_j(t) (\exp(t; t_f, \cdot))^*_{(z,x)(t)} \cdot dy^j\big((z, x)(t)\big).$

Now, let $(A, \alpha) = (A, w^0, \dots, w^n)$ be a local chart of $(z, x)(t_f)$ in $\mathbb{R}^{n+1}$ adapted to $\mathbb{R} \times M$. Since, due to (A4), the trajectory $(z, x)(t)$ lies entirely in $\mathbb{R} \times M$ and the chart $(V, \varphi)$ is adapted to $\mathbb{R} \times M$, for every $i = 0, \dots, d$ and every $j \ge d+1$, one computes

$(\exp(t; t_f, \cdot))^*_{(z,x)(t)} \cdot dy^j\big((z, x)(t)\big)\Big( \dfrac{\partial}{\partial w^i}\big((z, x)(t_f)\big) \Big) = \dfrac{\partial}{\partial w^i}\big( y^j \circ \exp(t; t_f, \cdot) \circ \alpha^{-1} \big)\big( \alpha((z, x)(t_f)) \big) = 0.$

This implies that for every $i = 0, \dots, d$,

$(p^0, p(t_f))\Big( \dfrac{\partial}{\partial w^i}\big((z, x)(t_f)\big) \Big) = \sum_{j=0}^{d} a_j(t) \dfrac{\partial}{\partial w^i}\big( y^j \circ \exp(t; t_f, \cdot) \circ \alpha^{-1} \big)\big( \alpha((z, x)(t_f)) \big).$

The term on the left-hand side does not depend on $t$. Therefore, differentiating with respect to $t$ and using (17) leads to (note that, as soon as $i = 0, \dots, d$, the quantities in (19) evolve in $\mathbb{R} \times M$; therefore, indices greater than $d$ do not explicitly appear in the calculations)

$\sum_{j=0}^{d} \Big[ \dot{a}_j(t) \Big( (\exp(t; t_f, \cdot))^*_{(z,x)(t)} \cdot dy^j\big((z, x)(t)\big)\Big( \dfrac{\partial}{\partial w^i}\big((z, x)(t_f)\big) \Big) \Big) + \sum_{\ell=0}^{d} a_j(t) \dfrac{\partial (f^0, f)_j}{\partial y^{\ell}}(t, x(t), u(t)) \Big( (\exp(t; t_f, \cdot))^*_{(z,x)(t)} \cdot dy^{\ell}\big((z, x)(t)\big)\Big( \dfrac{\partial}{\partial w^i}\big((z, x)(t_f)\big) \Big) \Big) \Big] = 0, \quad (19)$

which must hold for every $i = 0, \dots, n$. At this step, we notice that due to (A4), for every $j = 0, \dots, d$ and


every $\ell = 0, \dots, d$, we have $\dfrac{\partial (f^0, f)_j}{\partial y^{\ell}}(t, x(t), u(t)) = \dfrac{\partial (f^0, f_M)_j}{\partial y^{\ell}}(t, x(t), u(t))$. Moreover, due to (A4), the restriction $\exp(t; t_f, \cdot) : \mathbb{R} \times M \to \mathbb{R} \times M$ is well-defined and is a diffeomorphism. Hence $(\exp(t; t_f, \cdot))^*$ is an isomorphism when restricted to 1-forms in $T^*M$. Combining these facts with (19) gives

$\sum_{j=0}^{d} \Big[ \dot{a}_j(t) \, dy^j\big((z, x)(t)\big) + \sum_{\ell=0}^{d} a_j(t) \dfrac{\partial (f^0, f_M)_j}{\partial y^{\ell}}(t, x(t), u(t)) \, dy^{\ell}\big((z, x)(t)\big) \Big] = 0,$

and the conclusion follows.

2) Maximality, Transversality and Non-Triviality Conditions:

Before getting started, consider the following analysis of tangent spaces. From (A4) and the definition of $g_M$, it holds that $x(t_f) \in g^{-1}(0) \cap M = g_M^{-1}(0)$. Note that $g^{-1}(0) \subseteq \mathbb{R}^n$ and $g_M^{-1}(0) \subseteq M$ are submanifolds of dimension $n - \ell_g$ and $d - \ell_g$, respectively, with tangent spaces given by

$T_x g^{-1}(0) = \{ v \in T_x\mathbb{R}^n \cong \mathbb{R}^n : dg_x(v) = 0 \}, \quad x \in g^{-1}(0),$
$T_x g_M^{-1}(0) = \{ v \in T_x M : d(g_M)_x(v) = 0 \}, \quad x \in g_M^{-1}(0).$

In particular, by subspace identification, for every $x \in g_M^{-1}(0)$, one has $T_x g_M^{-1}(0) \subseteq T_x g^{-1}(0) \cap T_x M$. The inclusion above is actually an identity. To see this, let $x \in g_M^{-1}(0) \subseteq M$ and $(V, \varphi) = (V, y^1, \dots, y^n)$ be a local chart of $x$ in $\mathbb{R}^n$ adapted to $M$. The definition of adapted charts immediately gives that $dg_x\big( \frac{\partial}{\partial y^j}(x) \big) = d(g_M)_x\big( \frac{\partial}{\partial y^j}(x) \big)$ for $j = 1, \dots, d$. Thus, if $v = \sum_{j=1}^{d} v^j \frac{\partial}{\partial y^j}(x) \in T_x M$ is such that $dg_x(v) = 0$, it holds that

$d(g_M)_x(v) = \sum_{j=1}^{d} v^j \, d(g_M)_x\Big( \dfrac{\partial}{\partial y^j}(x) \Big) = dg_x(v) = 0,$

and the sought-after identity follows. A straightforward application of Grassmann's formula to this identity in particular yields

$\mathbb{R}^n = T_{x(t_f)} g^{-1}(0) + T_{x(t_f)} M. \quad (20)$

Noticing that the maximality condition and the transversality condition on the final time are straightforward consequences of (A4), we are now ready to prove the transversality condition at the final point and the non-triviality condition.

To show the validity of the transversality condition at the final point, let us prove that for every $v \in T_{x(t_f)} M \subseteq \mathbb{R}^n$, it holds that

$\langle \lambda(t_f), v \rangle = p(t_f)^{\top} v, \quad (21)$

which provides the desired result because $p(t_f)^{\top} v = 0$ for $v \in T_{x(t_f)} g^{-1}(0)$, due to Theorem 3.2. To show this, by the Gram-Schmidt process we can build a local orthonormal frame $\{E_j(\cdot)\}_{j=1,\dots,n}$ for $T^*\mathbb{R}^n$ around $x(t_f)$ such that $\{E_j(\cdot)\}_{j=1,\dots,d}$ is a local frame for $T^*M$ around $x(t_f)$. The dual frames $\{E_j^*(x(t_f))\}_{j=1,\dots,n}$ and $\{E_j^*(x(t_f))\}_{j=1,\dots,d}$ span $T^{**}_{x(t_f)}\mathbb{R}^n \cong T_{x(t_f)}\mathbb{R}^n \cong \mathbb{R}^n$ and $T^{**}_{x(t_f)} M \cong T_{x(t_f)} M$, respectively. Thus, for any tangent vector $v \in T_{x(t_f)} M$, the definitions of the dual frame and of the orthogonal projection $\Pi$ allow us to conclude that

$\langle \lambda(t_f), v \rangle = \Big\langle \sum_{j=1}^{d} \langle p(t_f), E_j(x(t_f)) \rangle_{\mathbb{R}^n} E_j(x(t_f)), \ \sum_{j=1}^{d} v^j E_j^*(x(t_f)) \Big\rangle = \Big\langle \sum_{j=1}^{n} \langle p(t_f), E_j(x(t_f)) \rangle_{\mathbb{R}^n} E_j(x(t_f)), \ \sum_{j=1}^{d} v^j E_j^*(x(t_f)) \Big\rangle = p(t_f)^{\top} v.$

Finally, let us focus on the non-triviality condition. By contradiction, assume that there exists $t \in [0, t_f]$ such that $(\lambda(t), p^0) = 0$. The linearity of the adjoint equation yields $\lambda(s) = 0$ for all $s \in [0, t_f]$, so that $\lambda(t_f) = 0$. On the other hand, from the transversality conditions of Theorem 3.2, we know that $p(t_f) \perp T_{x(t_f)} g^{-1}(0)$. Now, given $v \in \mathbb{R}^n$, from (20) we infer that $v = v_1 + v_2$ with $v_1 \in T_{x(t_f)} g^{-1}(0)$ and $v_2 \in T_{x(t_f)} M$, so that from (21) one obtains

$p(t_f)^{\top} v = p(t_f)^{\top} v_2 = \langle \lambda(t_f), v_2 \rangle = 0.$

This leads to $(p, p^0) = 0$, in contradiction with the non-triviality condition of Theorem 3.2. The conclusion follows.

V. ACCELERATING CONVERGENCE THROUGH INDIRECT SHOOTING METHODS

An important result provided by Theorem 3.2 (and consequently by Theorem 4.2) is the convergence of the sequence of extremals (related to the sequence of convex subproblems) towards an extremal for the (penalized) original formulation. This can be leveraged to accelerate the convergence of Algorithm 1 by warm-starting indirect shooting methods [15], [24]. Indirect shooting methods consist of replacing the original optimal control problem with a two-point boundary value problem formulated from the necessary conditions for optimality stated by the PMP. When indirect shooting methods succeed in converging to a locally-optimal solution, they converge very quickly (quadratically, in general). Nevertheless, they are very sensitive to initialization, which often presents a difficult challenge (see, e.g., [24], [34]). In the following, with the help of Theorem 3.2, we show how the initialization of indirect shooting methods may be bypassed by extracting information from the multipliers at each SCP iteration. The resulting indirect shooting methods may thus be combined with SCP to decrease (sometimes drastically decrease) the total number of iterations, given that the convergence rate of SCP is in general lower than that provided by shooting methods (see, e.g., [20]). For the sake of clarity, we provide details in the absence of manifold-type constraints, knowing from Theorem 4.2 that the same reasoning can be applied to problems with such constraints.

From now on, without loss of generality, we assume that every extremal mentioned below is normal, that is, by definition, $p^0 = -1$ (as mentioned earlier, this is a very mild requirement, see, e.g., [28]). Assume that a time-discretized version of Algorithm 1 converges. In particular, due to the arguments in Section III-C, we can assume that the convergence result stated in Theorem 3.2 applies to the sequence of KKT multipliers related to the time discretization of each convex subproblem LOCP$^{\Delta_k}$. For every $k \ge 1$, the KKT multiplier $\gamma^0_k$ that is related to the initial condition $x(0) = x_0$ approximates the initial value $p_k(0)$ of the extremal related to LOCP$^{\Delta_k}$ (see, e.g., [30]). Therefore, Theorem 3.2 implies that, up to some subsequence, for every $\delta > 0$ there exists a $k_\delta \ge 1$ such that for every $k \ge k_\delta$, it holds that $\| p(0) - \gamma^0_k \| < \delta$, where $p$ comes from an extremal related to OCP. In particular, select $\delta > 0$ to be the radius of convergence of an indirect shooting method that we use to solve OCP (a rigorous notion of the radius of convergence of an indirect shooting method may be inferred from the arguments in [15]). Any such indirect shooting method is then able to achieve convergence if initialized with $\gamma^0_k$, for $k \ge k_\delta$. In other words, we may stop SCP at iteration $k_\delta$ and successfully initialize an indirect shooting method related to the original (penalized) formulation with $\gamma^0_{k_\delta}$ to find a locally-optimal solution before SCP achieves full convergence, drastically reducing the number of SCP iterations used. Since in practice we do not have any knowledge of $\delta > 0$, and since indirect shooting methods report convergence failures quickly, we can simply run an indirect shooting method after every SCP iteration and stop whenever the latter converges (eventual convergence is ensured by the argument above). This acceleration procedure is summarized in Algorithm 2. Details concerning the implementation of indirect shooting methods are provided in the next section.

Algorithm 2: Accelerated SCP
Input: Guesses for trajectory $x_0$ and control $u_0$.
Output: Solution to OCP.
Data: Initial trust-region radius $\Delta_0 > 0$.
1 begin
2   $k = 0$, $\Delta_{k+1} = \Delta_k$, flag $= 0$
3   while ($(u_k)_{k \in \mathbb{N}}$ has not converged) or flag $= 0$ do
4     Solve LOCP$^{\Delta_{k+1}}$ for $(t_f^{k+1}, x_{k+1}, u_{k+1})$
5     Solve an indirect shooting method on OCP at $(t_f^{k+1}, x_{k+1}, u_{k+1})$, initialized with the multiplier related to the constraint $x(0) = x_0$, and if successful, set flag $= 1$
6     $\Delta_{k+1} = \mathrm{UpdateRule}(t_f^{k+1}, x_{k+1}, u_{k+1}, t_f^k, x_k, u_k)$
7     $k \leftarrow k + 1$
8   return $(t_f^{k-1}, x_{k-1}, u_{k-1})$
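The following schematic Julia skeleton (ours, for illustration only) mirrors the structure of Algorithm 2. The helpers solve_locp, indirect_shooting, and update_rule are hypothetical placeholders for the convex-subproblem solver, the shooting method of Section VI, and the trust-region update rule; they are not functions of the released implementation.

```julia
function accelerated_scp(x_guess, u_guess, Δ0; max_iters = 50)
    xk, uk, tfk, Δ = x_guess, u_guess, 0.0, Δ0
    for k in 1:max_iters
        # Line 4: solve the convexified subproblem around the previous iterate;
        # γ0 is the KKT multiplier of the initial condition x(0) = x0.
        tfk, xk, uk, γ0 = solve_locp(xk, uk, Δ)
        # Line 5: warm-start an indirect shooting method with (γ0, tfk);
        # if it converges, we are done before SCP has fully converged.
        sol = indirect_shooting(γ0, tfk)
        sol !== nothing && return sol
        # Line 6: update the trust region (e.g., Δ ← 0.95 Δ, see Section VI).
        Δ = update_rule(Δ)
    end
    return (tfk, xk, uk)   # fallback: plain SCP result
end
```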

VI. NUMERICAL EXPERIMENTS

Next, we perform numerical experiments for the free-final-time optimal control of a nonlinear system subject to obstacle-avoidance constraints. We describe our implementation of the indirect shooting method and leverage Theorem 3.2 to handle the free final time. Finally, we demonstrate the performance of SCP and the gains from our acceleration procedure.

Fig. 1. Transversality conditions of the PMP for the Dubins car, depending on whether or not the final angle is free ($H_\omega^*(t_f)$ denotes $H_\omega(x(t_f), \varphi(x(t_f), p(t_f)), p(t_f))$):
• $t_f$ free, $\theta(t_f)$ fixed: $\theta(t_f) - \theta_f = 0$ (25a), $H_\omega^*(t_f) = 0$ (25b);
• $t_f$ free, $\theta(t_f)$ free: $p_\theta(t_f) = 0$ (26a), $H_\omega^*(t_f) = 0$ (26b).

1) Problem formulation: We consider a 3-dimensional nonholonomic Dubins car, with state $x = [r_x, r_y, \theta] \in \mathbb{R}^3$ and control $u \in \mathbb{R}$. The dynamics are $\dot{x} = [v \cos\theta, v \sin\theta, k u]$, where $(v, k) = (1, 2)$ are the constant speed and turning curvature. Starting from $x_0$, the objective of the problem is to reach the state $x_f$ while minimizing the control effort $\int_0^{t_f} u(s)^2 \, ds$ and

avoiding obstacles. Control bounds are set to $U = [-\bar{u}, \bar{u}]$ with $\bar{u} = 0.25$. We consider problems with free final time $t_f$ and both fixed and free final angle $\theta(t_f)$. We consider $n_{\mathrm{obs}}$ cylindrical obstacles of radius $\varepsilon_i$ centered at points $r_i \in \mathbb{R}^2$. For each obstacle, we set up an obstacle-avoidance constraint using the smooth potential function $c_i : \mathbb{R}^2 \to \mathbb{R}$, defined as

$c_i(r) = \begin{cases} (\| r - r_i \|^2 - \varepsilon_i^2)^2 & \text{if } \| r - r_i \| < \varepsilon_i \\ 0 & \text{otherwise}, \end{cases} \quad (22)$

where $r = [r_x, r_y]$. To incorporate these constraints within our problem formulation, we penalize them within the cost function and define OCP to minimize the cost $\int_0^{t_f} \big( u(s)^2 + \omega \sum_{i=1}^{n_{\mathrm{obs}}} c_i(r) \big) ds$, which is convex in $(r, u)$ and continuously differentiable. This yields the following problem:

$\min_{u, t_f} \int_0^{t_f} \Big( u(s)^2 + \omega \sum_i c_i(r(s)) \Big) ds$
s.t. $\dot{r}_x(s) = v \cos\theta(s)$, $\dot{r}_y(s) = v \sin\theta(s)$, $\dot{\theta}(s) = k u(s)$, $x(0) = x_0$, $x(t_f) = x_f$. $\quad (23)$
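As a small illustration (ours; the function and variable names are not those of the released implementation), the potential (22) and the penalized running cost appearing in (23) can be coded as follows.

```julia
using LinearAlgebra

# Smooth obstacle potential (22): positive only inside the obstacle of radius ε_i.
function c_obstacle(r, r_i, ε_i)
    d = norm(r - r_i)
    return d < ε_i ? (d^2 - ε_i^2)^2 : 0.0
end

# Penalized running cost u^2 + ω Σ_i c_i(r); obstacles is a nonempty list of
# (center, radius) pairs.
running_cost(u, r, obstacles, ω) =
    u^2 + ω * sum(c_obstacle(r, ri, εi) for (ri, εi) in obstacles)
```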

2) Indirect shooting method: As described in Section V and Algorithm 2, the solution at each SCP iteration can be used to initialize an indirect shooting method for (23). Accordingly, we next derive the associated two-point boundary value problem using the necessary conditions for optimality of the PMP. Assuming $p^0 = -1$ (see Section V), the Hamiltonian $H_\omega(s, x, p, p^0, u) = p^{\top} f(s, x, u) + p^0 f^0_\omega(s, x, u)$ with $p = [p_x, p_y, p_\theta]$ is expressed as

$H_\omega(x, u, p) = v (p_x \cos\theta + p_y \sin\theta) + k u p_\theta - \Big( u^2 + \omega \sum_{i=1}^{n_{\mathrm{obs}}} c_i(r) \Big).$

Applying the adjoint equation and the maximality condition of the PMP (Theorem 3.1), we obtain the following relations:

$\dot{p}_x = \omega \dfrac{\partial \big( \sum_i c_i(r) \big)}{\partial r_x}, \qquad \dot{p}_y = \omega \dfrac{\partial \big( \sum_i c_i(r) \big)}{\partial r_y}, \qquad \dot{p}_\theta = v (p_x \sin\theta - p_y \cos\theta), \quad (24)$

and

$u = \varphi(x, p) = \begin{cases} \bar{u} & \text{if } p_\theta k / 2 \ge \bar{u} \\ p_\theta k / 2 & \text{if } p_\theta k / 2 \in (-\bar{u}, \bar{u}) \\ -\bar{u} & \text{if } p_\theta k / 2 \le -\bar{u}. \end{cases}$
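For concreteness, a minimal Julia sketch of these relations follows (ours; grad_c is an assumed helper returning the gradient of $\sum_i c_i$ at $r$, e.g., obtained by differentiating the potential above).

```julia
# Saturated feedback law from the maximality condition: the unconstrained
# maximizer pθ k / 2 of the concave Hamiltonian is clipped to [-ū, ū].
ϕ(pθ; k = 2.0, ū = 0.25) = clamp(pθ * k / 2, -ū, ū)

# Adjoint dynamics (24) for p = [px, py, pθ] at heading θ and position r.
function adjoint_rhs(p, θ, r, grad_c; v = 1.0, ω = 5000.0)
    gx, gy = grad_c(r)
    px, py, _ = p
    return [ω * gx, ω * gy, v * (px * sin(θ) - py * cos(θ))]
end
```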

Further, the transversality conditions of the PMP for the problems with fixed and free final angle $\theta_f$ are shown in Figure 1. Based on these conditions, we define the shooting function


$F : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^4$ as:

$F_1(x(t_f), p(t_f)) = r_x(t_f) - r_{x,f}$
$F_2(x(t_f), p(t_f)) = r_y(t_f) - r_{y,f}$
$F_3(x(t_f), p(t_f)) = \theta(t_f) - \theta_f$ if $\theta(t_f)$ is fixed, and $F_3(x(t_f), p(t_f)) = p_\theta(t_f)$ if $\theta(t_f)$ is free,
$F_4(x(t_f), p(t_f)) = 0$ if $t_f$ is fixed, and $F_4(x(t_f), p(t_f)) = H_\omega^*(t_f)$ if $t_f$ is free.

The PMP states that $F_i(x(t_f), p(t_f)) = 0$ for all $i = 1, 2, 3, 4$ for any locally-optimal trajectory. Thus, based on the conditions of the PMP, we set up the following root-finding problem:

Find $(p_0, t_f)$ s.t. $F_i(x(t_f), p(t_f)) = 0$, $i = 1, 2, 3, 4$,
$\dot{x} = [v \cos\theta, v \sin\theta, k \varphi(x, p)]$, $x(0) = x_0$,
$\dot{p}$ given by (24), $p(0) = p_0$.

Given $(x_0, p_0, t_f)$, we obtain $x(t_f)$ and $p(t_f)$ by numerical integration of the dynamics and the adjoint equation. Then, given an initial guess, this problem can be solved using off-the-shelf root-finding algorithms, e.g., Newton's method. In this work, we use a fourth-order Runge-Kutta integration scheme to integrate the differential equations, and we use the default trust-region method from the Julia NLsolve.jl package [35] as the root-finding algorithm.
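The following simplified, self-contained Julia sketch (ours) illustrates this root-finding setup for the case of a fixed final angle and free final time, with the obstacle term omitted for brevity (so that the adjoint variables $p_x$ and $p_y$ are constant) and with explicit Euler integration in place of the fourth-order Runge-Kutta scheme; the names and structure are illustrative and differ from the released jlSCP implementation.

```julia
using NLsolve

const v, k, ū = 1.0, 2.0, 0.25
ϕ(pθ) = clamp(pθ * k / 2, -ū, ū)              # saturated control law from the PMP

# Integrate state and costate forward from (x0, p0) over [0, tf].
function propagate(x0, p0, tf; N = 400)
    h, x, p = tf / N, copy(x0), copy(p0)
    for _ in 1:N
        u  = ϕ(p[3])
        dx = [v * cos(x[3]), v * sin(x[3]), k * u]
        dp = [0.0, 0.0, v * (p[1] * sin(x[3]) - p[2] * cos(x[3]))]   # (24) without obstacles
        x, p = x + h * dx, p + h * dp
    end
    return x, p
end

# Shooting residual F(z) with z = (p0, tf): final-state error and H*(tf) = 0.
function shooting!(F, z, x0, xf)
    p0, tf = z[1:3], z[4]
    x, p   = propagate(x0, p0, tf)
    u      = ϕ(p[3])
    F[1:3] = x - xf
    F[4]   = v * (p[1] * cos(x[3]) + p[2] * sin(x[3])) + k * u * p[3] - u^2
end

# In Algorithm 2, z_guess would be vcat(γ0_{k+1}, tf^{k+1}) from the latest SCP iterate.
x0, xf  = [0.0, 0.0, 0.0], [4.0, 1.0, 0.0]
z_guess = [0.1, 0.1, 0.0, 5.0]
sol = nlsolve((F, z) -> shooting!(F, z, x0, xf), z_guess)   # default: trust-region method
```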

As discussed in Section V, the success of solving this two-point boundary value problem is highly sensitive to the initial guess for $(p_0, t_f)$. To address this issue, we leverage the insights from Theorem 3.2. Given a solution to LOCPP$^{\Delta_{k+1}}$ strictly satisfying the trust-region constraints, we retrieve the KKT multiplier $\gamma^0_{k+1}$ associated with the initial condition $x(0) = x_0$. As discussed in Section V, $\gamma^0_{k+1}$ approaches $p_0$ as SCP converges to a locally-optimal trajectory. Thus, as described in Algorithm 2, we initialize the root-finding algorithm with $(\gamma^0_{k+1}, t_f^{k+1})$ stemming from the solution of LOCPP$^{\Delta_{k+1}}$. If a solution $(p_0, t_f)$ to the root-finding problem is found, the corresponding candidate locally-optimal trajectory $(x, u = \varphi(x, p), t_f)$ has been found and Algorithm 2 terminates.

3) SCP for free-final-time problems and implementation: Numerically solving free-final-time optimal control problems is notoriously challenging due to the presence of the final time $t_f$ as an additional variable. To address this difficulty, we leverage the insights of Theorem 3.2 to obtain a convex reformulation. Specifically, we first make the change of variable $s \mapsto s / t_f$ and express (23) as

$\min_{u, t_f} \int_0^1 t_f(s) \Big( u(s)^2 + \omega \sum_i c_i(r(s)) \Big) ds$
s.t. $\dot{r}_x(s) = t_f(s) v \cos\theta(s)$, $\dot{r}_y(s) = t_f(s) v \sin\theta(s)$, $\dot{\theta}(s) = t_f(s) k u(s)$, $\dot{t}_f(s) = 0$, $x(0) = x_0$, $x(1) = x_f$. $\quad (27)$
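For illustration (ours), the time-scaled dynamics of (27) treat the final time as an extra state with zero derivative, so that all trajectories live on the fixed interval [0, 1]:

```julia
# State ξ = [rx, ry, θ, tf]; the control u and the constants (v, k) are as in (23).
function scaled_dynamics(ξ, u; v = 1.0, k = 2.0)
    θ, tf = ξ[3], ξ[4]
    return [tf * v * cos(θ), tf * v * sin(θ), tf * k * u, 0.0]
end
```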

This problem definition, although slightly different from our previous formulations due to the free initial condition $t_f(0)$, is equivalent to OCP [30]. In particular, the results of Theorem 3.2 still apply, and one can use the KKT multiplier $\gamma^0_{k+1}$ associated with the initial condition $x(0) = x_0$ of this problem to initialize the indirect shooting method (Algorithm 2). Although (27) is mathematically equivalent to OCP, due to the presence of $t_f$, which multiplies $u^2$ (see [23]), this formulation does not fit the methodological mold of problem OCP which we leveraged to define convexified problems for SCP in Section II. Nevertheless, we can again leverage Theorem 3.2, which states that the sequence $\{t_f^k\}_{k \in \mathbb{N}}$ converges to $t_f$, where $t_f$ is part of a Pontryagin extremal related to OCP. Thus, without changing the structure of solutions, we replace $t_f$ with $t_f^k$ in (27), thereby regaining the same structure leveraged in Section II to perform the convexifications. Convergence to a candidate locally-optimal trajectory is guaranteed by Theorem 3.2.

Fig. 2. Number of SCP iterations until convergence, averaged over 100 randomized experiments as described in Section VI-4:
• free $\theta_f$: 6.72 (SCP only), 4.37 (SCP+shooting);
• fixed $\theta_f$: 6.14 (SCP only), 4.20 (SCP+shooting).

Fig. 3. Results from randomized problems with fixed final angle (left) and free final angle (right). These histograms show the number of SCP iterations until convergence for SCP only (orange, Algorithm 1) and for shooting-accelerated SCP (blue, Algorithm 2).

To apply Algorithms 1 and 2, we start from $(\Delta_0, \omega_0) = (3, 5000)$, we keep $\omega_k$ constant, as that is sufficient to guarantee constraint satisfaction for the scenarios considered in the experiments, and we let $\Delta_{k+1} \leftarrow 0.95 \Delta_k$ to satisfy the assumptions of Theorem 3.2. Note that different update rules are also possible [9]. We initialize SCP with a straight-line trajectory from $x_0$ to $x_f$, initialize all controls to 0, and use a trapezoidal discretization scheme with $N = 51$ nodes. To check convergence of SCP, we verify that $\int_0^{t_f} \| u_{k+1} - u_k \|^2(s) + \| u_k - u_{k-1} \|^2(s) \, ds \le 10^{-3}$. We also check that the trust-region constraints are strictly satisfied at convergence, i.e., (2), and we solve each convexified problem using IPOPT. We release our implementation at https://github.com/StanfordASL/jlSCP.
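As a minimal sketch (ours) of these implementation choices, the stopping test can be approximated on the discretization grid and the trust-region update applied between iterations; the quadrature below is a simple Riemann-sum stand-in for the integral criterion.

```julia
# Stopping test: ∫ ‖u_{k+1} - u_k‖² + ‖u_k - u_{k-1}‖² ds ≤ tol, approximated on a
# uniform grid with N nodes over [0, tf].
function scp_converged(u_next, u_curr, u_prev, tf; tol = 1e-3, N = 51)
    h = tf / (N - 1)
    return h * (sum(abs2, u_next .- u_curr) + sum(abs2, u_curr .- u_prev)) ≤ tol
end

# Trust-region update used in the experiments: Δ_{k+1} ← 0.95 Δ_k.
update_rule(Δ) = 0.95 * Δ
```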

4) Results and discussion: We evaluate our method in 100 randomized experiments. Denoting by $\mathrm{Unif}(a, b)$ the uniform probability distribution from $a \in \mathbb{R}$ to $b \in \mathbb{R}$, we set

$r_x^0 \sim \mathrm{Unif}(-1, 1), \quad r_y^0 \sim \mathrm{Unif}(-1, 1), \quad \theta^0 \sim \mathrm{Unif}(-\pi, \pi),$
$\theta_{xy} \sim \mathrm{Unif}(\theta^0 - \tfrac{\pi}{4}, \theta^0 + \tfrac{\pi}{4}), \quad \theta_f \sim \mathrm{Unif}(\theta^0 - \tfrac{\pi}{4}, \theta^0 + \tfrac{\pi}{4}),$
$r_x^f \sim r_x^0 + (4 + \mathrm{Unif}(0, 3)) \cos\theta_{xy}, \quad r_y^f \sim r_y^0 + (4 + \mathrm{Unif}(0, 3)) \sin\theta_{xy},$
$\varepsilon_i = 0.4, \quad n_{\mathrm{obs}} = 2, \quad r_{i,x} \sim \mathrm{Unif}\big( \min(r_x^0, r_x^f) + 8\varepsilon_i, \ \max(r_x^0, r_x^f) - 8\varepsilon_i \big),$

and similarly for $r_{i,y}$. The guess for the final time is initialized according to $t_f \sim \mathrm{Unif}(4, 6)$.
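One possible reading of this sampling scheme is sketched below (ours, for illustration; we use a single draw of the distance for both coordinates of the goal, and sample the obstacle coordinates analogously to $r_{i,x}$).

```julia
unif(a, b) = a + (b - a) * rand()

function sample_scenario(; ε = 0.4, n_obs = 2)
    r0x, r0y = unif(-1, 1), unif(-1, 1)
    θ0       = unif(-π, π)
    θxy      = unif(θ0 - π/4, θ0 + π/4)
    θf       = unif(θ0 - π/4, θ0 + π/4)
    d        = 4 + unif(0, 3)
    rfx, rfy = r0x + d * cos(θxy), r0y + d * sin(θxy)
    obstacles = [(unif(min(r0x, rfx) + 8ε, max(r0x, rfx) - 8ε),
                  unif(min(r0y, rfy) + 8ε, max(r0y, rfy) - 8ε)) for _ in 1:n_obs]
    return (x0 = [r0x, r0y, θ0], xf = [rfx, rfy, θf],
            obstacles = obstacles, tf_guess = unif(4, 6))
end
```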

We consider the problems with free final time and both free and fixed final angle $\theta_f$. In 100% of these scenarios, both SCP and the shooting-accelerated SCP converge successfully.


Fig. 4. Example of a trajectory using an infeasible straight-line initialization that passes through obstacles (left) and the associated controls (right).

Figure 4 shows that the initialization does not need to be feasible for SCP to converge successfully to a (candidate) locally-optimal trajectory avoiding obstacles and satisfying input constraints. Further, although the solution of the first iteration of SCP does not respect the nonlinear dynamics constraints, these constraints become satisfied as the algorithm performs further iterations.

Results in Figures 2 and 3 demonstrate that leveraging the PMP through an indirect shooting method decreases the number of SCP iterations on average, significantly accelerating the algorithm. Indeed, SCP alone may require multiple iterations close to the optimal solution before convergence. In contrast, once a good guess for $(p_0, t_f)$ to initialize the root-finding algorithm is available, the indirect shooting method is capable of efficiently computing a (candidate) locally-optimal trajectory solving OCP. In the worst case, where the number of SCP iterations until convergence $N_{\mathrm{SCP}}$ is the same for both methods, which occurs if the guess for $(p_0, t_f)$ is never within the radius of convergence of the shooting method at any SCP iteration, the computation time for Algorithm 2 is $N_{\mathrm{SCP}} \cdot (T_{\mathrm{SCP}} + T_{\text{s-fail}})$, with $T_{\mathrm{SCP}}$ being the time to convexify OCP and solve the resulting LOCP$^{\Delta_{k+1}}$, and $T_{\text{s-fail}}$ being the time for the root-finding algorithm to report convergence failure. As $T_{\text{s-fail}} \ll T_{\mathrm{SCP}}$ (see for instance [36]; in our non-optimized Julia implementation, $T_{\text{s-fail}} = 7.5$ ms and $T_{\mathrm{SCP}} = 235.9$ ms on average, measured on a laptop equipped with a 2.60 GHz Intel Core i7-6700 CPU with 8 GB of RAM), there is little computational overhead in using accelerated SCP over SCP only, and the results in Figures 2 and 3 demonstrate that leveraging the PMP significantly accelerates the optimization process. Finally, as the trust-region constraints are strictly satisfied in 100% of these scenarios, from the results of Theorem 3.2, all trajectories are candidate locally-optimal solutions to OCPP$_\omega$.

VII. CONCLUSION AND PERSPECTIVES

In this paper, we analyze the convergence of SCP when applied to continuous-time non-convex optimal control problems, including in the presence of manifold-type constraints. In particular, we prove that, up to some subsequence, SCP-based optimal control methods converge to a candidate locally-optimal solution for the original formulation. Under mild assumptions, our approach can be effortlessly leveraged to solve problems with manifold-type constraints. Finally, we leverage our analysis to accelerate the convergence of standard SCP-type schemes through indirect methods, and we investigate their performance via numerical simulations on a trajectory optimization problem with obstacles and free final time.

For future research, we plan to extend our approach to more general optimal control formulations, which for instance consider stochastic dynamics, risk functionals as costs, and probabilistic chance constraints. In addition, we plan to investigate particular parameter update rules which guarantee the convergence of the whole sequence of controls $(u_k)$. Finally, we plan to test the performance of our approach by means of hardware experiments on complex systems such as free-flyers and robotic manipulators.

ACKNOWLEDGMENT

We thank Andrew Bylard for his careful review of this manuscript.

REFERENCES

[1] J. E. Falk and R. M. Soland, "An algorithm for separable nonconvex programming problems," Management Science, vol. 15, no. 9, pp. 550-569, 1969.
[2] G. P. McCormick, "Computability of global solutions to factorable nonconvex programs: Part I - Convex underestimating problems," Mathematical Programming, vol. 10, pp. 147-175, 1976.
[3] X. Liu and P. Lu, "Solving nonconvex optimal control problems by convex optimization," AIAA Journal of Guidance, Control, and Dynamics, vol. 37, no. 3, pp. 750-765, 2014.
[4] D. Morgan, S. J. Chung, and F. Y. Hadaegh, "Model predictive control of swarms of spacecraft using sequential convex programming," AIAA Journal of Guidance, Control, and Dynamics, vol. 37, no. 6, pp. 1725-1740, 2014.
[5] Y. Mao, M. Szmuk, and B. Acikmese, "Successive convexification of non-convex optimal control problems and its convergence properties," in Proc. IEEE Conf. on Decision and Control, 2016.
[6] J. Virgili-Llop, C. Zagaris, R. Zappulla II, A. Bradstreet, and M. Romano, "Convex optimization for proximity maneuvering of a spacecraft with a robotic manipulator," in AIAA/AAS Space Flight Mechanics Meeting, 2017.
[7] F. Augugliaro, A. P. Schoellig, and R. D'Andrea, "Generation of collision-free trajectories for a quadrocopter fleet: A sequential convex programming approach," in IEEE/RSJ Int. Conf. on Intelligent Robots & Systems, 2012.
[8] J. Schulman, Y. Duan, J. Ho, A. Lee, I. Awwal, H. Bradlow, J. Pan, S. Patil, K. Goldberg, and P. Abbeel, "Motion planning with sequential convex optimization and convex collision checking," Int. Journal of Robotics Research, vol. 33, no. 9, pp. 1251-1270, 2014.
[9] R. Bonalli, A. Cauligi, A. Bylard, and M. Pavone, "GuSTO: Guaranteed sequential trajectory optimization via sequential convex programming," in Proc. IEEE Conf. on Robotics and Automation, 2019.
[10] R. Bonalli, A. Bylard, A. Cauligi, T. Lew, and M. Pavone, "Trajectory optimization on manifolds: A theoretically-guaranteed embedded sequential convex programming approach," in Robotics: Science and Systems, 2019.
[11] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Univ. Press, 2004.
[12] V. Azhmyakov and J. Raisch, "Convex control systems and convex optimal control problems with constraints," IEEE Transactions on Automatic Control, vol. 53, no. 4, pp. 993-998, 2008.
[13] D. Verscheure, B. Demeulenaere, J. Swevers, J. De Schutter, and M. Diehl, "Time-optimal path tracking for robots: A convex optimization approach," IEEE Transactions on Automatic Control, vol. 54, no. 10, pp. 2318-2327, 2009.
[14] M. Chamanbaz, F. Dabbene, R. Tempo, V. Venkataramanan, and Q. G. Wang, "Sequential randomized algorithms for convex optimization in the presence of uncertainty," IEEE Transactions on Automatic Control, vol. 61, no. 9, pp. 2565-2571, 2015.
[15] J. T. Betts and P. Huffman, "Path-constrained trajectory optimization using sparse sequential quadratic programming," AIAA Journal of Guidance, Control, and Dynamics, vol. 16, pp. 59-68, 1993.
[16] A. Sideris and J. E. Bobrow, "An efficient sequential linear quadratic algorithm for solving nonlinear optimal control problems," IEEE Transactions on Automatic Control, vol. 50, no. 12, pp. 2043-2047, 2005.
[17] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed. Springer, 2006.
[18] C. Zillober, "Sequential convex programming in theory and praxis," in Structural Optimization 93: The World Congress on Optimal Design of Structural Systems, 1993.
[19] Q. T. Dinh and M. Diehl, "Local convergence of sequential convex programming for nonconvex optimization," in Recent Advances in Optimization and its Applications in Engineering. Springer, 2010.
[20] M. Diehl and F. Messerer, "Local convergence of generalized Gauss-Newton and sequential convex programming," in Proc. IEEE Conf. on Decision and Control, 2019.
[21] F. Messerer and M. Diehl, "Determining the exact local convergence rate of sequential convex programming," in European Control Conference, 2020.
[22] L. S. Pontryagin, Mathematical Theory of Optimal Processes. Taylor & Francis, 1987.
[23] A. A. Agrachev and Y. Sachkov, Control Theory from the Geometric Viewpoint. Springer, 2004.
[24] E. Trelat, "Optimal control and applications to aerospace: some results and challenges," Journal of Optimization Theory & Applications, vol. 154, no. 3, pp. 713-758, 2012.
[25] E. B. Lee and L. Markus, Foundations of Optimal Control Theory. John Wiley & Sons, 1967.
[26] C. Lobry, "Une propriete generique des couples de champs de vecteur," Czechoslovak Mathematical Journal, vol. 22, no. 97, pp. 230-237, 1972.
[27] I. A. Shvartsman and R. B. Vinter, "Regularity properties of optimal controls for problems with time-varying state and control constraints," Nonlinear Analysis: Theory, Methods & Applications, vol. 65, no. 2, pp. 448-474, 2006.
[28] Y. Chitour, F. Jean, and E. Trelat, "Singular trajectories of control-affine systems," SIAM Journal on Control and Optimization, vol. 47, no. 2, pp. 1078-1095, 2008.
[29] A. Dmitruk, "On the development of Pontryagin's maximum principle in the works of A. Ya. Dubovitskii and A. A. Milyutin," Control and Cybernetics, vol. 38, no. 4A, pp. 923-957, 2009.
[30] L. Gollmann, D. Kern, and H. Maurer, "Optimal control problems with delays in state and control variables subject to mixed control-state constraints," Optimal Control Applications and Methods, vol. 30, no. 4, pp. 341-365, 2009.
[31] R. Bonalli, B. Herisse, and E. Trelat, "Continuity of Pontryagin extremals with respect to delays in nonlinear optimal control," SIAM Journal on Control and Optimization, vol. 57, no. 2, pp. 1440-1466, 2019.
[32] R. Gamkrelidze, Principles of Optimal Control Theory. Springer, 1978.
[33] E. Trelat, "Some properties of the value function and its level sets for affine control systems with quadratic cost," Journal of Dynamical and Control Systems, vol. 6, no. 4, pp. 511-541, 2000.
[34] R. Bonalli, B. Herisse, and E. Trelat, "Optimal control of endo-atmospheric launch vehicle systems: geometric and computational issues," IEEE Transactions on Automatic Control, vol. 65, no. 6, pp. 2418-2433, 2020.
[35] P. K. Mogensen and contributors, "NLsolve.jl," 2018. [Online]. Available: https://github.com/JuliaNLSolvers/NLsolve.jl
[36] R. Bonalli, "Optimal control of aerospace systems with control-state constraints and delays," Ph.D. dissertation, Sorbonne Universite & ONERA - The French Aerospace Lab, 2018.

VIII. APPENDIX

Proof: [Proof of Lemma 3.1] Fix $k \in \mathbb{N}$. For every $t \in \mathbb{R}_+$ where $x_{k+1}$ is defined, we may compute

$\| x_{k+1}(t) \| \le \| x_0 \| + \Big\| \int_0^t \Big( f_0(s, x_k(s)) + \sum_{i=1}^m u^i_{k+1}(s) f_i(s, x_k(s)) \Big) ds \Big\| + \Big\| \int_0^t \Big( \dfrac{\partial f_0}{\partial x}(s, x_k(s)) + \sum_{i=1}^m u^i_k(s) \dfrac{\partial f_i}{\partial x}(s, x_k(s)) \Big) x_{k+1}(s) \, ds \Big\| + \Big\| \int_0^t \Big( \dfrac{\partial f_0}{\partial x}(s, x_k(s)) + \sum_{i=1}^m u^i_k(s) \dfrac{\partial f_i}{\partial x}(s, x_k(s)) \Big) x_k(s) \, ds \Big\|$

$\le C \Big( 1 + \int_0^t \| x_{k+1}(s) \| \, ds \Big) + C \sum_{i=0}^m \int_{\{ s \in [0, t_f^{k+1}] : (s, x_k(s)) \in \mathrm{supp} f_i \}} \Big\| \dfrac{\partial f_i}{\partial x}(s, x_k(s)) \Big\| \, \| x_k(s) \| \, ds \le C \Big( 1 + \int_0^t \| x_{k+1}(s) \| \, ds \Big).$

Here, $\mathrm{supp} f_i$ denotes the support of $f_i$, $i = 0, \dots, m$. The (overloaded) constant $C \ge 0$ depends on $T$, $f_i$, $i = 0, \dots, m$, and $U$, and it comes from the compactness of $U$ and of $\mathrm{supp} f_i$ (and therefore of $\mathrm{supp} \frac{\partial f_i}{\partial x}$), $i = 0, \dots, m$. By applying the Gronwall inequality, we finally have that $\| x_{k+1}(t) \| \le C \exp(CT)$ for every $t \in \mathbb{R}_+ \cap [0, T]$ where $x_{k+1}$ is defined. The compactness criterion for solutions to ODEs applies, and we infer that $x_{k+1}$ is defined on the entire interval $[0, T]$. Since the constant $C > 0$ does not depend on $k \in \mathbb{N}$, the trajectories $x_k$ are uniformly bounded on $[0, T]$ and in $k \in \mathbb{N}$.

Riccardo Bonalli obtained his M.Sc. in Mathematical Engineering from Politecnico di Milano, Italy, in 2014 and his Ph.D. in applied mathematics from Sorbonne Universite, France, in 2018, in collaboration with ONERA - The French Aerospace Lab, France. He is the recipient of the ONERA DTIS Best Ph.D. Student Award 2018. He is now a postdoctoral researcher at the Department of Aeronautics and Astronautics at Stanford University. His main research interests concern theoretical and numerical robust optimal control with applications in aerospace systems and robotics.

Thomas Lew received his B.Sc. degree in Microengineering from Ecole Polytechnique Federale de Lausanne in 2017, received his M.Sc. degree in Robotics from ETH Zurich in 2019, and is currently pursuing his Ph.D. degree in Aeronautics and Astronautics at Stanford University. His research focuses on the intersection between optimization, control theory, and machine learning techniques for aerospace applications and robotics.

Marco Pavone is an Associate Professor of Aeronautics and Astronautics at Stanford University, where he is the Director of the Autonomous Systems Laboratory. Before joining Stanford, he was a Research Technologist within the Robotics Section at the NASA Jet Propulsion Laboratory. He received a Ph.D. degree in Aeronautics and Astronautics from the Massachusetts Institute of Technology in 2010. His main research interests are in the development of methodologies for the analysis, design, and control of autonomous systems, with an emphasis on self-driving cars, autonomous aerospace vehicles, and future mobility systems. He is a recipient of a number of awards, including a Presidential Early Career Award for Scientists and Engineers, an ONR YIP Award, an NSF CAREER Award, and a NASA Early Career Faculty Award. He was identified by the American Society for Engineering Education (ASEE) as one of America's 20 most highly promising investigators under the age of 40. He is currently serving as an Associate Editor for the IEEE Control Systems Magazine.