ieee transactions on robotics 1 trajectory planning for

IEEE TRANSACTIONS ON ROBOTICS 1

Trajectory Planning for Quadrotor SwarmsWolfgang Honig, James A. Preiss, T. K. Satish Kumar, Gaurav S. Sukhatme, and Nora Ayanian

Abstract—We describe a method for multi-robot trajectoryplanning in known, obstacle-rich environments. We demonstrateour approach on a quadrotor swarm navigating in a warehousesetting. Our method consists of three stages: 1) roadmap gener-ation which generates sparse roadmaps annotated with possibleinter-robot collisions; 2) discrete planning which finds validexecution schedules in discrete time and space; and 3) contin-uous refinement that creates smooth trajectories. We accountfor the downwash effect of quadrotors, allowing safe flight indense formations. We demonstrate computational efficiency insimulation with up to 200 robots and physical plausibility with anexperiment on 32 nano-quadrotors. Our approach can computesafe and smooth trajectories for hundreds of quadrotors in denseenvironments with obstacles in a few minutes.

Index Terms—UAV, quadrotor, swarm, multi-robot systems

I. INTRODUCTION

TRAJECTORY planning is a fundamental problem inmulti-robot systems. Given a set of robots with known

initial locations and a set of goal locations, the task is tofind a set of continuous functions that move each robotfrom its start position to its goal, while avoiding collisionsand respecting dynamic limits. The unlabeled case allowsrobots to swap goal locations with each other; in the labeledcase the goal assignment is given. Trajectory planning is acore subproblem of various applications including search-and-rescue, inspection, and delivery.

A large body of work has addressed this problem with varieddiscrete and continuous formulations. However, no existingsolution simultaneously satisfies the goals of completeness,physical plausibility, optimality in time or energy usage, andgood computational performance. In this work, we present amethod that attempts to balance these goals.

Our method uses a graph-based planner to compute a solu-tion for a version of the problem that is discretized in space andtime, and then refines this solution into smooth trajectories ina separate, decoupled optimization stage. The roadmap for thediscrete planner can be automatically generated from a givenenvironment and robot description. We take the downwasheffect of quadrotors into account, preserving safety duringdense formation flights. Our method is complete with respectto the resolution of the discretization, and locally optimal withrespect to an energy-minimizing integral-squared-derivativeobjective function. We also present an anytime iterative re-finement scheme that improves the trajectories within a givencomputational budget. We support user-specified smoothnessconstraints and dynamic limits and provide simulations with

All authors are with the Department of Computer Science, University ofSouthern California, Los Angeles, CA, USA.

Email: whoenig, [email protected],[email protected], gaurav, [email protected]

This work was partially supported by the ONR grants N00014-16-1-2907and N00014-14-1-0734, and the ARL grant W911NF-14-D-0005.

Fig. 1. Long exposure of 32 Crazyflie nano-quadrotors flying through anobstacle-rich environment.

up to 200 robots and a physical experiment with 32 quadrotors,see Fig. 1.

We extend our previous work [1] in the following ways:1) Support of arbitrary continuous environments, i.e., we

remove the limitations that start/goal locations and ob-stacles must be aligned with a user-defined grid.

2) Introduction of a generic multi-robot path planningframework which can be used for non-quadrotor robotsas well. Specifically, we introduce Multi-Agent PathFinding with Generalized Conflicts (MAPF/C) and anefficient bounded suboptimal solver for MAPF/C probleminstances. This also enables us to solve the labeledtrajectory planning problem.

3) New experimental results analyzing the computationalefficiency of and validating our approach in simulationand on a physical quadrotor swarm.

II. RELATED WORK

A simple approach to multi-robot motion planning is torepurpose a single-robot planner and represent the Cartesianproduct of the robots’ configuration spaces as a single largejoint configuration space [2]. Robot-robot collisions are repre-sented as configuration-space obstacles. However, the high-dimensional search space is computationally infeasible forlarge teams.

Many works have approached the problem from a graphsearch perspective [3], [4]. These methods are adept at dealingwith maze-like environments and scenarios with high conges-tion. Some represent the search graph implicitly [5], so theyare not always restricted to a predefined set of points in con-figuration space. However, directly interpreting a graph planas a trajectory results in a piecewise linear path, requiring therobot to fully stop at each graph vertex to maintain dynamicfeasibility. It is possible, however, to use these planners toresolve ordering conflicts and refine the output for executionon robots [6].


Environment (OcTree)Robot Model (Shape)

Conflict Model (Shape)

Start/Goal Locations

discreteschedule

Input Output

TrajectoryOptimizationQP

Dynamic Limits

IterativeRefinement

Trajectories

Roadmap

Conflict-annotatedroadmap

Grid SPARS

RoadmapGeneration

ECBSILP

MAPF/C Solver

SpatialPartitionSVM

Scaling

Graph-based PlanningContinuous TrajectoryOptimization

Trajectories

ConflictAnnotationFCL Swept

Corridors

Fig. 2. Components of our approach. The user specifies a model of the environment (e.g. using an octree), a model of the robot (e.g. sphere), start/goallocations, a robot-robot collision model (e.g. ellipsoid), and dynamic limits (e.g. maximum acceleration). We generate a sparse roadmap from the environment,including the start and goal locations as vertices. The collision model is used to annotate the roadmap with additional inter-robot conflicts. This annotatedroadmap can be used to find a discrete schedule for each robot. This schedule defines a corridor for each robot in which trajectory optimization is used togenerate smooth trajectories. Finally, the trajectories are scaled in order to fulfill the dynamic limits. The pictures on the bottom show one example of twoquadrotors swapping their positions.

Some authors have solved the formation change problemin a continuous setting [7], [8], but such methods are oftentightly coupled, solving one large optimization problem inwhich the decision variables define all robots’ trajectories.These approaches are typically demonstrated on smaller teamsand do not easily scale to the size of a team in which we areinterested. The authors of [9] propose a method that, similarto ours, uses conservative separating hyperplanes to decouplethe optimization problem. However, the approach requiresassigning a priority order to the robots. Others decouple theproblem but do not support the level of smoothness in oursolution, and do not show results on large teams [10]. Themethod of Turpin et al. [11] is computationally fast, but offsetsthe different trajectories in time, resulting in much longer timedurations. Velocity profile methods [12] handle kinodynamicconstraints well but are not able to fully exploit free spacein the environment. Collision-avoidance approaches [13], [14]let each robot plan its trajectory independently and resolveconflicts in real time when impending collisions are detected.These methods are distributed and reactive, however, theydo not provide any means to optimize the trajectories forobjectives such as time or energy use, and they are poorlysuited to problems in maze-like environments.

Our method builds upon Spline-based refinement of way-point plans [15], [16] by adding support for three-dimensionalellipsoidal robots, environmental obstacles, and an anytimerefinement stage to further improve the plan after generatingan initial set of smooth trajectories. We demonstrate thatour iterative refinement produces trajectories with significantly

smoother dynamics than the initial trajectories.

III. APPROACH

We now formalize the trajectory planning problem for anyhomogeneous robot team and outline our method to solve thisclass of problems. Furthermore, we introduce the robot modelfor quadrotors. Later sections discuss the steps of our approachin more detail, using a quadrotor swarm as example.

A. Problem Statement

Consider a team of N robots in a bounded environmentcontaining convex obstacles O1 . . .ONobs

. Boundaries of theenvironment are defined by a convex polytopeW . The convexset of points representing a robot at position q ∈ R3 is RE(q).The free configuration space for a single robot is thus givenby

F = (W \ (⋃hOh)) RE(0) (1)

where denotes the Minkowski difference. We allow aseparate inter-robot collision model and define RR(q) to bethe convex set of points a robot at position q requires to operatesafely when close to another robot. For example, in the caseof quadrotors RR(q) can model the downwash effect.

Let f i : [0, T ]→ R3 be a trajectory for each robot ri, whereT ∈ R>0 is the total time duration until the last robot reachesits goal. All trajectories are considered collision-free if thereare no robot-environment collisions, i.e. f i(t) ∈ F and norobot-robot collisions, i.e.

RR(f i(t)) ∩RR(f j(t)) = ∅ ∀ i 6= j, 0 ≤ t ≤ T. (2)


In the labeled trajectory planning problem we are given astart and goal position for each robot si, gi ∈ F , where startand goal inputs must satisfy the robot-robot collision model,i.e. RR(si)∩RR(sj) = ∅ and RR(gi)∩RR(gj) = ∅ for alli 6= j.

We seek the total time duration T and collision-free trajec-tories f i such that:• f i(0) = si

• f i(T ) = gi

• f i is continuous up to user-specified derivative order C• f i is kinodynamically feasible for robot i.In the unlabeled case we allow the robots to exchange goal

locations, i.e. we additionally seek an assignment of each robotto a goal position such that f i(T ) = gφ(i), where φ is apermutation of 1 . . . N .

B. Components

Our method combines the strengths of two conceptually dif-ferent approaches to multi-robot trajectory planning problems.The first approach is commonly used in the AI community anduses a graph to represent a roadmap and graph-search basedalgorithms to find collision-free schedules for all robots. Thiscan be done efficiently for hundreds of robots even in maze-like environments. However, dynamic constraints of robotsare ignored. Furthermore, creating an appropriate roadmapfor an environment that fulfills all requirements of the searchalgorithm is challenging. The second approach, commonlyused in robotics, is based on trajectory optimization. Thisapproach can deal with kinodynamic constraints, but does notscale well to hundreds of robots.

Our approach is outlined in Fig 2. We use models of theenvironment and the robots to generate a roadmap suitablefor planning for a single robot. We annotate the roadmapwith additional edge and vertex conflicts to model constraintscaused by inter-robot dependencies. We can then use anextended multi-robot planner to find discrete schedules foreach robot. These schedules can be executed on real quadrotorswithout collisions, but they require the quadrotors to stopat each waypoint. Finally, a trajectory optimization stagegenerates smooth and conflict-free trajectories based on thediscrete schedule. This approach can be used for any multi-robot trajectory planning problem, however the exact detailsfor each step vary. In this paper we present the componentsneeded to plan trajectories for a swarm of quadrotors, directlytaking downwash into account.

C. Robot Model for Quadrotor Trajectory Planning

As aerial vehicles, quadrotors have a six-dimensional con-figuration space. However, as shown in [17], quadrotors aredifferentially flat in the flat outputs (x, y, z, ψ), where x, y, zis the robot’s position in space and ψ its yaw angle (heading).Differential flatness implies that the control inputs neededto move the robot along a trajectory in the flat outputs arealgebraic functions of the flat outputs and a finite numberof their derivatives. Furthermore, in many applications, aquadrotor’s yaw angle is unimportant and can be fixed at

Fig. 3. Axis-aligned ellipsoid model of robot volume. Tall height preventsdownwash interference between quadrotors.

ψ = 0. We therefore focus our efforts on planning trajectoriesin three-dimensional Euclidean space.

While some multi-robot planning work has consideredsimplified dynamics models such as kinematic agents [6]or double-integrators [7], our method produces trajectorieswith arbitrary smoothness up to a user-defined derivative.This goal is motivated by [17], where it was shown thata continuous fourth derivative of position is necessary forphysically plausible quadrotor trajectories, because it ensuresthat the quadrotor will not be asked to change its motor speedsinstantaneously.

Rotorcraft generate a large, fast-moving volume of airunderneath their rotors called downwash. The downwash forceis large enough to cause a catastrophic loss of stability whenone rotorcraft flies underneath another. We model downwashconstraints as inter-robot collision constraints by treating eachrobot as an axis-aligned ellipsoid of radii 0 < rx = ry rz ,illustrated in Fig. 3. Empirical data collected in [18], [19]support this model. The set of points representing a robot atposition q ∈ R3 is given by

RR(q) = Ex+ q : ‖x‖2 ≤ 1 (3)

where E = diag(rx, ry, rz).

IV. ROADMAP GENERATION

A roadmap is an undirected connected graph of the environ-ment GE = (VE , EE), where each vertex v ∈ VE correspondsto a location in F and each edge (u, v) ∈ EE denotes thatthere is a linear path in F connecting u and v. We also requirethat there exists a vertex vis ∈ VE corresponding to each startlocation si and that there exists a vertex vig ∈ VE for eachgoal location gi. Let loc : VE → R3 be a function that returnsthe location for each vertex.

The ideal roadmap should have the following properties:1) be connected, i.e., if a path between two points in F

exists, there should be a path in the roadmap as well;2) lead to optimal results, i.e., the shortest path between two

points in F can be approximated well by a path in theroadmap; and

3) be sparse, i.e., have a small number of vertices and edges.The last property is desired because dense roadmaps resultin a higher number of inter-robot conflicts, which can createa significant computational burden for the discrete planning


(a) Grid-based roadmap. (b) SPARS-based roadmap. (c) Conflicting edges.

Fig. 4. Example of generated roadmaps using the grid approach (a) and SPARS (b) with the same desired dispersion. The roadmaps can be annotated withedge and vertex conflicts based on the inter-robot conflict model RR. Conflicting edges for black edge are marked in red in (c). The edges are in conflictbecause it is possible to place the robots on the edge such that (2) is violated.

stage. The first two properties are in conflict with the lastproperty. Thus, roadmap generation should balance thosegoals. A useful property of a roadmap is its dispersion, whichis the radius of the largest sphere centered in F that does notcontain any vertex location. Dispersion is a measure of theuniformity of the roadmap. In this work we tested two differentapproaches, namely 6-connected grid structures, and sparseroadmap spanners. Here, spanner refers to a concept fromgraph theory where a spanner of a graph is a subgraph withfewer edges such that the distance between any two vertices iswithin a user-provided bound compared to the original graph.

The grid roadmap generator places potential verticesequidistant from each other. A vertex is only added if it isin F . Edges are added to the six closest neighbors if therewould be no collision between the robot and the environmentwhen traversing that edge. Roadmaps generated this way havea regular structure, reducing inter-robot conflicts and providinga low dispersion for a fixed number of vertices. However,completeness and optimality are only achieved as the grid-sizeapproaches zero, which is not desirable due to our sparsenessrequirement. Thus, the generated roadmaps are often missingimportant edges and vertices and fail to achieve the first twodesired properties.

The family of SPArse Roadmap Spanner algorithms(SPARS) [20] attempts to generate roadmaps that are boundedsub-optimally with respect to distance (up to a user-providedstretch factor). Unlike graph spanners, the algorithm attemptsto reduce both vertices and edges, rather than edges only.The SPARS algorithm uses a dense roadmap similar to thosegenerated by PRM* [21] and only keeps a subset of verticesand edges such that the distance sub-optimality is guaranteed.The SPARS2 algorithm removes the requirement of explicitlyconstructing a dense roadmap, reducing the memory overheadbut at the cost of a larger roadmap. We do not have arequirement for a small memory usage and used the SPARSalgorithm.

In both grid and SPARS cases we add vertices to theroadmap corresponding to the start and goal locations if suchvertices are not already part of the roadmap. We connect thoseadditional vertices to up to six neighbors within a search radiusif the resulting edge could be traversed without any collision

with the environment. Example roadmaps are shown in Fig. 4.

V. CONFLICT ANNOTATION

Roadmaps are typically generated to plan the motion for asingle robot. If the same roadmap is used by multiple robots,there are additional constraints:

1) Vertex-Vertex Constraints: two robots may not occupytwo vertices which are in close proximity to each otherat the same time.

2) Edge-Edge Constraints: two robots may not traverse twoedges if a collision could occur during the traversal, seeFig. 4(c).

3) Edge-Vertex Constraints: one robot may not traverse anedge if a collision could occur with another stationaryrobot.

The goal of the conflict annotation is to identify the set ofconflicts for each edge and vertex. The output is a graph whereeach vertex and each edge is annotated with a conflict set.

We define the following functions:

conV V (v) = u ∈ VE |RR(loc(u)) ∩RR(loc(v)) 6= ∅

conEE(e) = d ∈ EE |R∗R(d) ∩R∗R(e) 6= ∅conEV (e) = u ∈ VE |RR(loc(u)) ∩R∗R(e) 6= ∅

(4)

where R∗R(e) is the set of points swept when traversing edgee. We refer to this method as swept collision model.

The swept collision model is conservative, allowing edgetraversals with an arbitrary velocity profile. Alternatively, wecan assume that the motion on the edge uses a known velocityprofile (such as constant velocity), where mot(e, t) refers tothe position of the robot traversing edge e for t ∈ [0, 1]. Wecan then use a standard continuous collision definition as usedin the Flexible Collision Library (FCL) [22]:

conEE′(e) = d ∈ EE | ∃t ∈ [0, 1]

RR(mot(d, t)) ∩RR(mot(e, t)) 6= ∅conEV ′(e) = u ∈ VE | ∃t ∈ [0, 1]

RR(loc(u)) ∩RR(mot(e, t)) 6= ∅

(5)

Collision checking is required for all vertex and edge pairs,leading to quadratic time complexity.


VI. DISCRETE PLANNING

The annotated roadmap with specified start/goal vertices isthe input to the graph planning stage of our method, which wedescribe in this section. We first state the MAPF/C problem, ageneralized version of the Multi-Agent Path Finding problem(MAPF) that explicitly avoids conflicts in a conflict-annotatedroadmap. We then introduce two different planners, an optimalplanner that solves the unlabeled problem and a bounded-suboptimal planner for the labeled case. Finally, we discussmethods for goal assignment to allow using the bounded-suboptimal planner for the unlabeled case as well.

A. Multi-Agent Path Finding with Generalized Conflicts(MAPF/C)

We are given a roadmap of the environment (GE) withadditional conflict sets for generalized vertex-vertex (conV V ),edge-edge (conEE), and edge-vertex (conEV ) conflicts asgenerated in the previous section. Additionally, we are givena unique start vertex for each robot vis ∈ VE . In the labeledcase we are given a unique goal for each robot vig ∈ VE ; inthe unlabeled case these goals are interchangeable.

At each timestep, a robot can either wait at its current vertexor traverse an edge. A discrete schedule pi for each robot iscomposed of a sequence of K + 1 locations:

pi = loc(ui0), loc(ui1), . . . , loc(uiK)

, xi0, xi1, . . . , x

iK

(6)

Robots are expected to be synchronized in time, such that roboti is at waypoint xik at timestep k. In between waypoints xikand xik+1, we assume that robot i travels on the line segmentbetween xik and xik+1. We denote this line segment by `ik.

Our goal is to find discrete schedules pi, such that thefollowing properties hold:P1: Each robot starts at its start vertex: ∀i : ui0 = vis.P2: Each robot ends at a unique goal location:∀i : xiK = v

φ(i)g , where φ is a permutation of 1 . . . N .

P3: At each timestep, each robot either stays at its currentposition or moves along an edge: ∀k, ∀i: uik = uik+1 or(uik, u

ik+1) ∈ EE .

P4: There are no robots occupying the same location at thesame time (vertex collision): ∀k,∀i 6= j: uik 6= ujk.

P5: There are no robots traversing the same edge in oppositedirections at the same time (edge collision): ∀k, ∀i 6= j:uik 6= ujk+1 or ujk 6= uik+1.

P6: Robots obey inter-robot constraints when stationary (gen-eralized vertex-vertex collision): ∀k,∀i 6= j: uik /∈conV V (ujk).

P7: Robots obey inter-robot constraints while traversing anedge (generalized edge-edge collision): ∀k, ∀i 6= j:(uik, u

ik+1) /∈ conEE((ujk, u

jk+1)).

P8: Robots obey inter-robot constraints between stationaryand traversing robots (generalized edge-vertex collision):∀k, ∀i 6= j, uik = uik+1: uik /∈ conEV ((ujk, u

jk+1)).

In the labeled case the permutation φ(i) = i is given;in the unlabeled case we are seeking φ(i) and the discreteschedules pi simultaneously. A solution is optimal with respect

to makespan if K is minimal. A solution is optimal withrespect to cost if

∑i pathcost(p

i) is minimal. The pathcost(·)is a modular user-provided function, e.g. the total distancetraveled or the number of actions taken until the goal isreached.

The first five properties are identical to typical MAPFformulations. Properties 6–8 are newly considered constraintsin MAPF/C. In the following two theorems, we prove somehardness results for MAPF/C. Theorem VI.2 shows thatmakespan minimization for MAPF/C is hard even for theunlabeled case. This is in contrast to the tractability result,established using a reduction to the maximum-flow problem,for makespan minimization for unlabeled MAPF [23].

Theorem VI.1. Solving labeled MAPF/C optimally with re-spect to cost or makespan is NP-hard.

Proof. This follows directly from the NP-hardness proofs ofthe same problems for labeled MAPF [24].

Theorem VI.2. Solving unlabeled MAPF/C optimally withrespect to makespan is NP-hard.

Proof. Given an undirected graph, an independent set in it isdefined as a collection of vertices such that no two of them areconnected by an edge. A maximum independent set (MIS) isan independent set of maximum cardinality. Finding an MISin undirected graphs is a well-known NP-hard problem. It isalso NP-hard to decide whether the size of an MIS in graphGMIS is greater than or equal to kMIS . We reduce this NP-hard decision problem to the makespan minimization problemfor unlabeled MAPF/C.

For a given (GMIS , kMIS), we create a new graphGMAPF/C that includes GMIS straddled between 2kMIS

additional vertices v1s , . . . , v

kMISs and v1

g , . . . , vkMISg . Each

additional vertex is connected to all vertices from GMIS . Inthe corresponding MAPF/C instance, vis is interpreted as thestarting vertex of robot ri; and vig is interpreted as the goalvertex of robot ri. The edges from GMIS are interpreted asgeneralized vertex-vertex conflicts. The edges that connect theadditional vertices to vertices from GMIS are interpreted asedges that robots can traverse in unit time. It is now easy tosee that all kMIS robots can go from their start vertices totheir goal vertices within the minimum makespan of 2 if andonly if the size of an MIS in GMIS is greater than or equal tokMIS . An example is shown in Fig. 5. Therefore, if solvingMAPF/C optimally with respect to makespan can be done inpolynomial time, the decision version of the MIS problemcan also be solved in polynomial time. Conversely, since thedecision version of the MIS problem is NP-hard, makespanminimization for unlabeled MAPF/C is also NP-hard.

Although this negative result is established for MAPF/Con general graphs (and not on grids), it is relevant for us heresince the SPARS-based roadmaps on which MAPF/C instancesmust be solved are more general graphs that do not resemblegrids.


a

b

c

(a) GMIS

ab

ba, c

cb

v1s

v2s

v1g

v2g

(b) GMAPF/C

Fig. 5. Example for MIS to unlabeled MAPF/C reduction with kMIS = 2.The red sets define conV V (·) of nodes a, b, c. A solution for the MAPF/Cinstance is shown with the bold edges, indicating that a, c is a maximumindependent set.

B. Unlabeled Planner

Given the value of K, the decision problem of finding anunlabeled MAPF schedule with makespan K can be solvedin polynomial time. This can be achieved by reduction to amaximum-flow problem in a larger graph, derived from GE ,known as a time-expanded flow-graph [23]. This graph, de-noted by GF , contains O(K · |VE |) vertices and is constructedsuch that a flow in GF represents a solution to the MAPF in-stance. This maximum-flow problem can also be expressed asan Integer Linear Program (ILP) where each edge is modeledas a binary variable indicating its flow and the objective is tomaximize the flow subject to conservation constraints [25]. AnILP formulation allows us to add additional constraints for P6– P8.

We build the time-expanded flow-graph GF = (VF , EF ) asan intermediate step to formulate the ILP. We describe theapproach briefly in this paragraph; more detailed discussionsare available in related work [23], [25], [26]. For each timestepk and vertex v ∈ VE we add two vertices uvk and wvk to VFand create an edge connecting them (red edges in Fig. 6(c)).For each timestep k and edge (v1, v2) ∈ EE we create a“gadget” connecting uv1

k , uv2

k , wv1

k , and wv2

k . As shown inFig 6(b), the “gadget” disallows agents to swap their positionsin one timestep, thus enforcing P5. Furthermore, we connectconsecutive timesteps with additional edges (wvk, u

vk+1) (green

edges in Fig. 6(c)) to enforce P4. Additionally we add verticessource and sink, which are connected to vertices uv

is

0 : ∀iand wv

ig

K : ∀i respectively. If a maximum flow is computedon this graph, the flow describes a discrete schedule for eachrobot, fulfilling P1–P5.

We now extend previous work by introducing annotationscon : EF 7→ 2EF to some of the edges such that con(e) is theset of edges with which edge e has a generalized conflict with.Consider vertices v, v′ ∈ VE that, if simultaneously occupied,would violate P6. Those vertices map to helper edges ek =(wvk, u

vk+1) and e′k = (wv

′

k , uv′

k+1) in GF for all k (green edgesin Fig. 6(c)). In that case we insert e′k into con(ek) and ekinto con(e′k) for all k. Similarly, consider (v1, v2) ∈ EE and(v′1, v

′2) ∈ EE that violate P7. These edges map to helper edges

ek and e′k in GF as part of the gadget for all k (blue edges inFig. 6(c)). As before we insert e′k into con(ek) and vice versafor all k. For edge-vertex conflicts, consider (v1, v2) ∈ EEand v′ ∈ VE that violate P8. The first edge (v1, v2) maps tohelper edges ek, while the vertex v′ corresponds to an agent

waiting at vertex v′, which requires the traversal of helperedge e′k in GF (red edges in Fig. 6(c)). As before we inserte′k into con(ek) and vice versa for all k. Finally, we disallowthat the blue edges of the “gadget” are used for wait actionsby adding constraints con((uv1

k , u(v1,v2)k )) = (w(v1,v2)

k , wv1

k )and con((uv2

k , u(v1,v2)k )) = (w(v1,v2)

k , wv2

k ).For each edge (u, v) ∈ EF , we introduce a binary variable

z(u,v). The ILP can be formulated as follows:

maximize∑

(source,v)∈EF

z(source,v)

subject to∑

(u,v)∈EF

z(u,v) =∑

(v,w)∈EF

z(v,w) ∀v ∈ V ′F

ze +∑

e′ ∈ con(e)

ze′ ≤ 1 ∀e ∈ EF

(7)

where V ′F = VF \source, sink. The first constraint enforcesflow conservation, and thus P3–P5. The second constraintenforces P6–P8. P1 and P2 are implicitly enforced by con-struction of the flow graph. A solution to the ILP assigns aflow to each edge. We can then easily create the schedule pi

for each robot by setting xik based on the flow in GF . Thepermutation φ(i) is implicitly given by xiK .

In order to find an optimal solution for an unknown K,we use a two-step approach. First, we find a lower boundfor K by ignoring the generalized constraints. We searchthe sequence K = 1, 2, 4, 8, . . . for a minimum feasible valuethat is a power of 2, and then perform a binary search tofind the minimum feasible value LB(K). Because we ignorethe generalized constraints, we can check the feasibility inpolynomial time using the Edmonds-Karp algorithm on thetime-expanded flow-graph. Second, we execute a linear searchstarting from LB(K), solving the fully constrained ILP. Inpractice, we found that the lower bound LB(K) is sufficientlyclose to the final K such that a linear search is faster comparedto another modified binary search using the ILP.

C. Labeled Planner

Two common approaches for solving labeled MAPF in-stances are variants of M* [3] and Conflict-Based Search(CBS) [27]. The core idea of M* is to plan for the robotsindividually as much as possible. Whenever a conflict occurs,the conflicting robots are joined together as a meta-agent.In contrast, Conflict-Based Search methods try to resolveconflicts one-by-one, starting with a conflict occurring at theearliest timestep. In both methods, the search might grow ex-ponentially in the worst case. Our labeled planner is based ona bounded suboptimal variant of CBS, called Enhanced CBS(ECBS), which has been shown to solve practical instanceswith hundreds of robots in maze-like environments [28].

A detailed description, pseudocode, and an example of(E)CBS are available in related work [28]. We give a briefoverview of the approach in the following. ECBS uses twodifferent search levels: high- and low-level. The high-levelsearch creates a binary constraint tree. Each node in that treehas a set of constraints. A constraint either disallows a robotto occupy a specific vertex or to traverse a specific edge at a


as1

b

cg1

(a) GE

uv1k

uv2k

u(v1,v2)k w

(v1,v2)k

wv1k

wv2k

(b) “Gadget” for flow-graph construction.

ua0

ub0

uc0

wa0

wb0

wc0

ua1

ub1

uc1

wa1

wb1

wc1

source

sink

(c) GF with K = 2.

Fig. 6. Example flow-graph (c) for environment shown in (a) with a single robot. The construction uses a graph “gadget” (b) for each edge in GE . The blueedges are annotated with edge-edge and edge-vertex conflicts and the green edges are annotated with vertex-vertex conflicts. The bold arrows in (c) show themaximum flow through the network, which can be used to compute the robots’ discrete schedules.

fixed timestep. A node also contains a discrete schedule pi foreach robot that is consistent with the set of constraints, and anassociated cost for the discrete schedules. The root node doesnot have any constraints and thus the discrete schedules pi canbe created by finding the path with the lowest cost in the givenroadmap for each robot individually. If a node is valid (i.e.,its discrete schedules pi are consistent with P1–P8), a solutionis found. Otherwise, the first violation of P1–P8 between tworobots ri and rj is found, and two new nodes are created. Bothchildren inherit the constraints from the parent node. The firstchild imposes an additional constraint on ri and the secondnode adds one additional constraint for rj . In CBS the nodesare traversed using the best-first search strategy with respectto the cost. The low-level search is used to compute a discreteschedule for a single agent that satisfies the constraints forthat agent. Each time a new high-level node is created, thelow-level search needs to be executed just once for the agentwith the newly added constraint. The bounded suboptimalityin ECBS is achieved by using focal search1 with heuristics inboth levels.

ECBS can solve MAPF/C instances by using the generalizedconflict definition for both high-level and low-level focalsearches. The algorithm is identical to the one outlined in theoriginal paper [28], with the following adjustments. For thehigh-level search, we now need to consider conflicts caused byP6 – P8. In case the first conflict is a violation of P6 betweenagents ri and rj at timestep k, we create two child nodes.The first child adds a constraint for ri to avoid visiting xik attimestep k. The second child adds a constraint for rj to avoidvisiting xjk at timestep k. Similarly, if the first conflict is aviolation of P7 between ri and rj , we add two child nodes withthe additional conflicts of ri not traversing (uik, u

ik+1) and rj

not traversing (ujk, ujk+1), respectively. In case the first conflict

is a violation of P8 where, without loss of generality, ri iswaiting at a vertex and rj is traversing an edge, we add twochild nodes with the additional conflicts of ri not waiting atuik at timestep k and rj not traversing (ujk, u

jk+1), respectively.

Furthermore, we need to adjust the focal search heuristics tocount violations caused by P6 – P8 as well. For the high-levelfocal search we use the number of pairs of robots that haveat least one conflict. For the low-level focal search we use thetotal number of conflicts in the high-level search node.

1Focal search maintains two lists in an A* search framework: an open listand a focal list. The focal list contains elements that are in the open list andwithin a suboptimality factor of the minimum f -value. A node from the focallist is chosen for expansion based on a secondary heuristic.

ECBS is bounded suboptimal with respect to the cost, i.e.,for a user-provided suboptimality bound w, the cost of thereturned solution will satisfy∑

i

pathcost(pi) ≤ w∑i

pathcost(pi∗),

where pi∗ is a optimal solution as computed by CBS. The proofin [28] holds for the ECBS/C as well.

D. Goal Assignment

ECBS can be used to solve an unlabeled problem approx-imately, by finding a goal assignment first. In robotics, thepolynomial-time Hungarian method, which finds the optimalassignment with respect to the sum of the costs, is frequentlyused. This might, for example, correspond to minimizing thetotal energy usage of the robots. However, in the unlabeledcase it is typical to minimize the makespan, i.e., the time untilthe last robot reaches its goal. This goal assignment problemis also known as the linear bottleneck assignment problem andcan also be solved efficiently using, for example, the Thresholdalgorithm [29].

VII. TRAJECTORY OPTIMIZATION

In the continuous refinement stage, we convert the waypointsequences pi generated by the discrete planner into smoothtrajectories f i. We use the discrete plan to partition the freespace F such that each robot solves an independent smoothtrajectory optimization problem in a region that is guaranteedto be collision-free. We assign a time tk = k∆t to eachdiscrete timestep, where ∆t is a user-specified parameter. Thisis an initial guess for T = K∆t. The exact total time T iscomputed in a post-processing stage such that all trajectoriesmeet given physical limits such as a maximum thrust constant.

A. Spatial Partition

The continuous refinement method begins by finding safecorridors within the free space F for each robot. An examplefrom a real problem instance is shown in Fig. 7. The safecorridor for robot ri is a sequence of convex polyhedraPik, k ∈ 1 . . .K, such that, if each ri travels within Pikduring time interval [tk−1, tk], both robot-robot and robot-obstacle collision avoidance are guaranteed. For robot ri intimestep k, the safe polyhedron Pik is the intersection of:• N − 1 half-spaces separating ri from rj for j 6= i;


Fig. 7. Safe corridor for one robot over entire flight. Corridor polytopes arecolored by timestep. Black line: underlying discrete plan. Shaded tube: smoothtrajectory after first iteration of refinement. Highlighted in red: polytope,discrete plan segment, and trajectory polynomial piece for a single timestep.

• Nobs half-spaces separating ri from O1 . . .ONobs.

We separate ri from rj by finding a separating hyperplane(α

(i,j)k ∈ R3, β

(i,j)k ∈ R) such that:

`ik ⊂ x : α(i,j)k

Tx < β

(i,j)k

`jk ⊂ x : α(i,j)k

Tx > β

(i,j)k .

(8)

While this hyperplane separates the line segments `ik and `jk,it does not account for the robot ellipsoids. Without loss ofgenerality, suppose the hyperplanes are given in the normalizedform where ‖α(i,j)

k ‖2 = 1. We accommodate the ellipsoidsby shifting each hyperplane according to its normal vector,resulting in the final trajectory constraint halfspaces:

α(i,j)k

Tf i(t) < β

(i,j)k − ‖Eα(i,j)

k ‖2 ∀t ∈ [tk−1, tk]

α(i,j)k

Tf j(t) > β

(i,j)k + ‖Eα(i,j)

k ‖2 ∀t ∈ [tk−1, tk](9)

where E = diag(rx, ry, rz) is the ellipsoid matrix. Robot-obstacle separating hyperplanes are computed similarly, exceptusing a different ellipsoid Eobs for obstacles to model the factthat downwash is only important for robot-robot interactions.

In our implementation, we require that obstacles Oiare bounded convex polytopes described by vertex lists.Line segments are also convex polytopes described byvertex lists. Computing a separating hyperplane betweentwo disjoint convex polytopes Ψ = conv(ψ1 . . . ψmΨ

) andΩ = conv(ω1 . . . ωmΩ), where conv denotes the convex hull,can be posed as an instance of the hard-margin support vectormachine (SVM) problem [30]. However, the ellipsoid robotshape alters the problem: for a separating hyperplane withunit normal vector α, the minimal safe margin is 2‖Eα‖2.Incorporating this constraint in the standard hard-margin SVMformulation yields a slightly modified version of the typicalSVM quadratic program:

minimize αTE2α

subject to αTψi − β ≤ 1 for i ∈ 1 . . .mΨ

αTωi − β ≥ 1 for i ∈ 1 . . .mΩ

(10)

We solve a problem of this form for each robot-robot pairand each robot-obstacle pair to yield the safe polyhedron Pikin the form of a set of linear inequalities. Note that the safepolyhedra need not be bounded and that Pik ∩ Pik+1 6= ∅ ingeneral. In fact, the overlap between consecutive Pik allows thesmooth trajectories to deviate significantly from the discreteplans, which is an advantage when the discrete plan is far fromoptimal.

B. Bezier Trajectory Basis

After computing safe corridors, we plan a smooth trajectoryf i(t) for each robot, contained within the robot’s safe corridor.We represent these trajectories as piecewise polynomials withone piece per time interval [tk, tk+1]. Piecewise polynomialsare widely used for trajectory planning: with an appropriatechoice of degree and number of pieces, they can representarbitrarily complex trajectories with an arbitrary number ofcontinuous derivatives.

We denote the kth piece of robot i’s piecewise polynomialtrajectory as f ik. We wish to constrain f ik to lie within thesafe polyhedron Pik. However, when working in the standardmonomial basis, i.e., when the decision variables are the ai inthe polynomial expression

p(t) = a0 + a1t+ a2t2 + · · ·+ aDt

D,

bounding the polynomial inside a convex polyhedron is not aconvex constraint. Instead, we formulate trajectories as Beziercurves. A degree-D Bezier curve is defined by a sequence ofD + 1 control points yi ∈ R3 and a fixed set of Bernsteinpolynomials, such that

f(t) = b0,D(t)y0 + b1,D(t)y1 + · · ·+ bD,D(t)yD (11)

where each bi,D is a degree-D Bernstein polynomial withcoefficients2 given in [31]. The curve begins at y0 and endsat yD. In between, it does not pass through the interveningcontrol points, but rather is guaranteed to lie in the convexhull of all control points. Thus, when using Bezier controlpoints as decision variables instead of monomial coefficients,constraining the control points to lie inside a safe polyhedronguarantees that the resulting polynomial will also lie insidethe polyhedron. We define f i as a K-piece, degree-D Beziercurve and denote the dth control point of f ik as yik,d. The degreeparameter D must be sufficiently high to ensure continuity atthe user-defined continuity level C.

C. Distributed Optimization Problem

The set of Bezier curves that lie within a given safecorridor describes a family of feasible solutions to a singlerobot’s planning problem. We select an optimal trajectory byminimizing a weighted combination of the integrated squaredderivatives:

cost(f i) =

C∑c=1

γc

∫ T

0

∥∥∥∥ dcdtc f i(t)∥∥∥∥2

2

dt (12)

2The canonical Bernstein polynomials are defined over the time interval[0, 1], but they are easily modified to span our desired time interval.


where the γc ≥ 0 are user-chosen weights on the derivatives.Our decision vector y consists of all control points for f i

concatenated together:

y =[yi1,0

T. . . yi1,D

T, . . . , yiK,0

T. . . yiK,D

T]T

(13)

The objective function (12) is a quadratic function of y, whichcan be expressed in the form:

cost(f i) = yT (BTQB)y (14)

where B is a block-diagonal matrix transforming controlpoints into polynomial coefficients, and the formula for Q isgiven in [32]. The start and goal position constraints, as wellas the continuity constraints between successive polynomialpieces, can be expressed as linear equalities. Thus, we solvethe quadratic program:

minimize yT (BTQB)y

subject to yik,d ∈ Pik ∀ i, k, df i(0) = si, f i(T ) = gφ(i)

f i continuous up to derivative Cdc

dtcf i(t) = 0 ∀ c > 0, t ∈ 0, T

(15)

It is important to note that there are N quadratic programswhich can be solved in parallel, allowing a distributed imple-mentation where each robot receives the halfspace coefficientsof its safe corridor (9) and solves the quadratic program (15)onboard. Computation of the spatial partition may also bedistributed with a 2× constant factor of redundant work.

The quadratic program (15) may not always have a solutiondue to our conservative assumptions regarding velocity profilesand the decoupling of robots using hyperplanes. In these cases,we fall back on a solution that follows the discrete plan exactly,coming to a complete stop at corners. Details of this solutionare given in [15].

The corridor-constrained Bezier formulation presents onenotable shortcoming: for a given safe polyhedron Pik, thereexist degree-D polynomials that lie inside the polyhedron butcannot be expressed as a Bezier curve with control pointsthat are contained within Pik. Empirical exploration of Beziercurves suggests that this problem is most significant when thedesired trajectory is near the faces of the polyhedron ratherthan the center. Further research is needed to characterize thisissue more precisely.

D. Iterative Refinement

Solving (15) for each robot converts the discrete planinto a set of smooth trajectories that are locally optimalgiven the spatial decomposition. However, these trajectoriesare not globally optimal. In our experiments, we found thatthe smooth trajectories sometimes lie quite far away fromthe original discrete plan. Motivated by this observation, weimplement an iterative refinement stage where we use thesmooth trajectories to define a new spatial decomposition, anduse the same optimization method to solve for a new set ofsmooth trajectories.

For time interval k, we sample f ik at S evenly-spaced pointsin time to generate a set of points Sik. The number of samplepoints S is a user-specified parameter, set to S = 32 in ourexperiments. We then compute the separating hyperplanes asbefore, except we separate Sik from Sjk instead of `ik from`jk. This problem is also a (slightly larger) ellipsoid-weightedSVM instance. While the sample points Sik are not a completedescription of f ik, Sik is guaranteed to be linearly separablefrom Sjk for i 6= j, because the polynomial pieces f ik, f

jk lie

inside their respective disjoint polyhedra Pik,Pjk .

These new safe corridors are roughly “centered” on thesmooth trajectories, rather than on the discrete plan. Intuitively,iterative refinement provides a chance for the smooth trajec-tories to further minimize the cost of the quadratic objective(15) beyond that allowed by the constraints of the previousspatial partition. Iterative refinement is similar in spirit tosequential quadratic programming (SQP) [33] and sequentialconvex programming (SCP) [34]. These methods find a localminimum for a nonlinear, nonconvex optimization problemby solving an iterative sequence of convex approximations.Our method is not derived directly from such an underlyingproblem but follows the same pattern of alternating betweenupdates on the approximated constraints and on the solution.The max-margin spatial partition objective (10) is not directlyrelated to the primary energy minimization objective (14),but updating the separating hyperplanes in a separate step,allows us to decompose each iteration into N independentsubproblems for efficient and decentralized computation.

Iterative refinement can be classified as an anytime algo-rithm. If a solution is needed quickly, the original set of f i canbe obtained in a few seconds. If the budget of computationaltime is larger, iterative refinement can be repeated until thequadratic program cost (14) converges.

E. Dynamic Limits

Unlike related work (e.g. [35]), we do not take dynamiclimits such as acceleration constraints into account duringoptimization to avoid tight coupling of the robots in theoptimization. Instead, we leverage the fact that the discreteplanning stage already finds a synchronized solution for thequadrotors and we scale all trajectories uniformly in a post-processing step.

The user-supplied timestep duration ∆t directly affects themagnitudes of dynamic quantities such as acceleration and

(a) (b)

Fig. 8. Illustration of discrete plan postprocessing. (a) In timestep k, robotrj arrives at a graph vertex v and robot ri leaves v. The separating hyperplanebetween `ik and `jk (with ellipsoid offset shaded in grey) prevents both robotsfrom planning a trajectory that passes through v. (b) Subdivision of discreteplan ensures that this situation cannot occur.


snap that are constrained by the robot’s actuation limits. Weuse a binary search to find a uniform temporal scaling factorsuch that no trajectory f i violates a dynamic constraint. Forquadrotors, as the temporal scaling goes to infinity, the actuatorcommands are guaranteed to approach a hover state [17], sokinodynamically feasible trajectories can always be found.

F. Discrete Postprocessing

If the FCL collision model is used, the discrete plannermight produce waypoints pi that require some postprocessingto ensure that they satisfy the collision constraints (2) underarbitrary velocity profiles. In particular, we must deal with thecase when one robot rj arrives at a vertex v ∈ VE in thesame timestep k that another robot ri leaves v. This situationcreates a conflict where neither robot’s smooth trajectory canpass through v, as illustrated in Fig. 8. We ensure that thissituation cannot happen by dividing each discrete line segmentin half. In the subdivided discrete plan, at odd timesteps robotsexit a graph-vertex waypoint and arrive at a segment-midpointwaypoint, while at even timesteps robots exit a segment-midpoint waypoint and arrive at a graph-vertex waypoint.Under this subdivision, the conflict cannot occur. However,the increased number of timesteps requires more time to solvethe quadratic program (15).

In our experiments, we noticed that the continuous trajec-tories typically experience peak acceleration in the vicinityof t = 0 and t = T due to the requirement of acceleratingfrom and to a complete stop. We add an additional wait stateat the beginning and end of the discrete plans to reduce theacceleration peak.

VIII. EXPERIMENTS

We implement the roadmap generation, conflict annotation,and discrete planning in C++. We use FCL [22] for collisionchecking, OMPL [36] for the SPARS roadmap generator,OctoMap [37] for the environment data structure, Boost Graphfor maximum flow computation, and Gurobi 7.0 as ILP solver.For the swept collision model we use a quadratic programfor collision checking. The continuous refinement stage isimplemented in Matlab.

We use the total number of actions as a cost metric inECBS, where move actions along an edge and wait actionshave uniform cost. Whenever we use ECBS for an unlabeledplanning problem, we use the Threshold algorithm to computethe goal assignment first.

When computing the robot-obstacle separating hyperplanesfor each robot during each timestep, we limit the search spaceto a box containing the path segment to reduce computation.We expand the bounding box of the segment by 1 m to allowthe continuous plan to deviate from the discrete plan. Weconsider only the nodes of the octree that intersect this box.

To compute separating hyperplanes for the safe corridors,our method requires solving O(KN2 + KNobsN) smallellipsoid-weighted SVM problems. For these problems, we usethe CVXGEN package [38] to generate C code optimized forthe exact quadratic program specification (10). The per-robottrajectory optimization quadratic programs (15) are solved

using Matlab’s quadprog solver. Since these problems areindependent, this stage can take advantage of up to N ad-ditional processor cores. In our experiments, we enforcecontinuity up to the fourth derivative (C = 4) by using apolynomial degree of D = 7. We evaluate our method insimulation and on the Crazyswarm — a swarm of nano-quadrotors [39].

A. Downwash Characterization

In order to determine the ellipsoid radii E, we executed sev-eral flight experiments. For rz , we fly two Crazyflie quadrotorsdirectly on top of each other and record the average positionerror of both quadrotors at 100 Hz for varying distancesbetween the quadrotors. We noticed that high controller gainslead to very low position errors even in this case, but can causefatal crashes when the quadrotors are close. We determinedrz = 0.3 m to be a safe vertical distance. For the horizontaldirection, we use rx = ry = 0.12 m. We set Eobs to a sphereof radius 0.15 m based on the size of the Crazyflie quadrotor.

B. Runtime Evaluation

We execute our implementation on a PC running Ubuntu16.04, with a Xeon E5-2630 2.2 GHz CPU and 32 GB RAM.This CPU has 10 physical cores, which improves the executionruntime for the continuous portion significantly.

Each stage of our framework can be configured and thechoices influence the runtime and quality of results in the laterstages. We will first use one specific example and discuss thevariations at each stage. In later experiments we report theresults for one specific choice on different examples.

Our example “Wall32” uses 32 quadrotors, which begin in agrid in the x−y plane, fly through a wall with three windows,and form the letters “USC” in the air. For the roadmap, wereport the number of vertices and edges created, the desireddispersion (droad) and the runtime to create the roadmap (trm).For the conflict annotation we report the average number ofconflicting vertices per vertex (CVV ), average number of con-flicting edges per edge (CEE ), average number of conflictingvertices per edge (CEV ), and runtime to annotate the roadmap(tconf ). For the MAPF/C solver we report the makespan ofthe solution (K) and the runtime to find a solution (tdis),including solving the assignment problem if ECBS is used foran unlabeled instance. For the continuous trajectory planningwe report the runtime of six iterations (tcon) and the durationof the trajectories (T ).

Each stage might influence the results as follows, see Table Ifor example results:Mapping We use an occupancy grid representation of the

environment, stored in an octree. Larger leaf-nodes resultin a more compact data structure, but might be overlypessimistic. On the other hand, smaller leaf-nodes requiremore computation during the roadmap generation andtrajectory optimization stages. Row 2 shows an examplewhere a significantly higher number of leaf nodes hasno effect on the discrete side, but impacts computationtime on the continuous side due to the higher number ofhyperplanes that must be considered.


TABLE IINFLUENCE OF DIFFERENT PARAMETERS FOR “WALL32” PROBLEM INSTANCE, SEE SECTION VIII-B. BOLD ENTRIES INDICATE PARAMETERS CHANGED

COMPARED TO ROW 1. GRAY ENTRIES INDICATE RESULTS THAT ARE NOT EXPECTED TO CHANGE COMPARED TO ROW 1.

Row Mapping Roadmap Conflicts Discrete Continuousdocto nodes Method droad |VE | |EE | trm Method CVV CEE CEV tconf Method K tdis iter tcon T

1 0.1 17k Grid 0.5 978 3331 0.2 FCL 1.5 3.2 2.6 9.0 ECBS(1.5) 24 0.4 6 47 6.52 0.04 677k Grid 0.5 978 3331 0.2 FCL 1.5 3.2 2.6 9.0 ECBS(1.5) 24 0.4 6 380 6.53 0.1 17k SPARS 0.5 888 3495 36 FCL 0.8 7.5 1.7 9.6 ECBS(2.0) 30 1.5 6 56 11.54 0.1 17k Grid 0.2 13k 40k 1.5 FCL 15.9 11.9 24.0 1043 ECBS(1.5) 54 10 6 103 7.75 0.1 17k Grid 0.5 978 3331 0.2 Swept 1.5 26 2.6 0.6 ECBS(2.5) 36 1.4 6 29 8.56 0.1 17k Grid 0.5 978 3331 0.2 FCL 1.5 3.2 2.6 9.0 ILP 20 222 6 32 5.47 0.1 17k Grid 0.5 978 3331 0.2 FCL 1.5 3.2 2.6 9.0 ECBS(1.5) 24 0.4 2 17 9.5

TABLE IIRESULTS FOR DIFFERENT PROBLEM INSTANCES USING SPARS, ECBS AND THE SWEPT COLLISION MODEL, SEE SECTION VIII-B.

Roadmap Conflicts Discrete ContinuousExample labeled N Env. Size occupied |VE | |EE | trm CVV CEE CEV tconf K tdis t1(hp) t1(qp) tcon T

Flight Test No 32 9× 5.5× 2.2 4 % 873 3430 50 0.8 28 1.8 0.8 28 0.5 2.0 3.9 28 6.5Wall32 No 32 7.5× 6.5× 2.5 6 % 921 3536 36 0.8 27.7 1.7 0.8 41 4.5 1.6 3.5 35 11.6Maze50 No 50 10× 6.5× 2.5 30 % 1045 3221 82 1.1 24.2 2.2 0.7 48 4.4 7.3 12.3 133 16.3Sort200 No 200 14.5× 14.5× 2.5 31 % 3047 7804 85 1.1 18.8 2.0 3.5 39 25 21 34 411 11.7Swap50 Yes 50 7.5× 6.5× 2.5 6 % 869 3371 34 0.8 27 1.6 0.7 48 15 3.4 9.4 78 14.4

(a) Full 32-robot trajectory plan after six iterations of refinement.The start and end positions are marked by squares and filled circles,respectively.

(b) Picture of the final configuration after the test flight. A video is availableas supplemental material.

Fig. 9. Formation change example where quadrotors fly from a circle formation to a goal configuration spelling “USC” while avoiding obstacles.

Roadmap Generation A 6-neighbor grid is fast to computeand results in fewer conflicts during the conflict anno-tation stage, which in turn improves the performanceof solving the MAPF/C instance. However, such a gridapproach might miss narrow corridors entirely. A SPARSgenerated roadmap takes longer to compute and results inmore conflicts due to its irregular shape, as shown in row3. Furthermore, the non-uniform edge length might causethe discrete solver to make worse scheduling decisionsbecause the distance traveled by each robot during asingle timestep varies widely. In row 3, this caused thecontinuous solver to find trajectories which had to beexecuted much more slowly to stay within the physicallimits of the quadrotor.Lower dispersion creates denser roadmaps, which mightproduce better results. However, the time for conflictannotation and number of identified conflicts increasesignificantly. Moreover, this results in MAPF/C instancesthat are harder to solve in practice, see row 4.

Conflict Annotation Using the FCL collision checkingmodel (4) results in fewer edge conflicts on averagecompared to the swept model, decreasing the MAPF/Csolving time, as shown in row 5. However, it requires

the continuous side to postprocess the discrete plan asvisualized in Fig. 8, resulting in larger solving times.Moreover, we have seen various instances where theFCL model results in infeasible quadratic programs forthe continuous stage, thus necessitating reversion to thepiecewise linear plan [15]. The FCL collision checkingmodel is slower compared to the swept model because itis generically implemented, supporting a wide range ofpossible collision models.

Discrete Solver The ILP-based solver can be used tosolve the unlabeled problem optimally with respect tomakespan. However, ILP takes significantly longer (seerow 6) and in cases of large makespans or high numberof conflicts might not find a solution in reasonable timeat all. ECBS is more versatile and computes boundedsuboptimal solutions very quickly. We noted that the sub-optimality bound has a big impact on solution time witha smaller impact on the resulting makespan. Therefore,we used several different bounds, as noted in Table I.

Trajectory Optimization The number of refinement stagesaffects the quality of the trajectories. In our experimentssix refinement iterations were sufficient for convergencein all cases. If trajectories are used after fewer iterations,


(a) 1 iteration (b) 2 iterations (c) 6 iterations

Fig. 10. Subset of results from Fig. 9 after different numbers of refinement iterations. Fine lines represent the discrete plans pi; heavy curves represent thecontinuous trajectories f i. The remaining 28 robots are hidden for clarity. Increased smoothness and directness can be observed with more iterations.

1 2 3 4 5 6iteration

0

1

2

3

4

5

6

max a

ccele

rati

on [

m/s

2]

1 2 3 4 5 6iteration

0

0.5

1

1.5

2

max a

ng

ula

r velo

city

[ra

d/s

]

Fig. 11. Illustration of worst-case acceleration and angular velocity overall robots during six iterative refinement cycles. Left: peak acceleration wasreduced from 6.1 to 1.6m/s2. Right: peak angular velocity was reduced from2.4 to 0.2 rad/s.

the resulting time T tends to be larger because thetrajectories exhibit larger acceleration terms, see row 7.

In another set of experiments, we fix the parameters andapply our method to different problem instances, see Table II.We use a set of parameters such that we are able to find agood solution quickly. However, tuning the parameters fora particular problem might result in shorter times T at ahigher computational cost, as discussed before. We use aleaf-node size of 0.1 m, the SPARS roadmap planner withan average dispersion of 0.5 m, the swept collision model,ECBS as discrete solver and six iterations of continuousrefinement. For ECBS we mostly used suboptimality bound3.0 except for the labeled example (“Swap50”) where we useda higher suboptimality bound of 4.0. We compute plans for fivedifferent examples for 32 to 200 robots navigating in obstacle-rich environments. The most significant time portion is theroadmap generation, taking up to 85 s. Conflict annotationand discrete solving can be done within 30 s. The continuousrefinement finds the first smooth plan in less than a minute.

C. Flight Test

We discuss the different steps of our approach on a concretetask with 32 quadrotors. In this task, the quadrotors beginon a disc in the x − y plane, fly through randomly placedobstacles (e.g. boxes, bicycle, chair), and form the letters“USC” in the air. We execute the experiment in a space whichis 10 m× 16 m× 2.5 m in size and equipped with a VICONmotion capture system with 24 cameras.

In an initial step we use a structured light depthcamera tracked by our motion capture system and the

octomap_mapping ROS stack to map the environmentusing 0.1 m as octree resolution. We generate a roadmapwithin a bounding box of 9 m× 5.5 m× 2.2 m using SPARSin 50 s, generating 873 vertices and 3430 edges. The conflictannotation and discrete planning take less than one secondcombined, finding discrete schedules pi with K = 28. Thecontinuous planner needs six seconds to find the first set ofsmooth trajectories and finishes six iterations of refinementafter 28 seconds.

Fig. 11 demonstrates the effect of iterative refinement onthe quadrotor dynamics required by the trajectories f i. Foreach iteration, we take the maximum acceleration and angularvelocity over all robots for the duration of the trajectories.Iterative refinement results in trajectories with significantlysmoother dynamics. This effect is also qualitatively visiblewhen plotting a subset of the trajectories, as shown in Fig. 10.The final set of 32 trajectories is shown in Fig. 9(a).

We upload the planned trajectories to a swarm of Crazyflie2.0 nano-quadrotors before takeoff, and use the Crazyswarminfrastructure [39] to execute the trajectories. State estimationand control run onboard the quadrotor, and the motion capturesystem information is broadcasted to the UAVs for localiza-tion. Figure 9(b) shows a snapshot of the execution when thequadrotors reached their final state. The executed trajectoriescan be visualized with long-exposure photography, as shownin Fig. 1. The supplemental video shows the full trajectoryexecution.

IX. CONCLUSION

We present a trajectory planning method for large robotteams, combining the advantages of graph-based AI plannersand trajectory optimization. We validate our approach on areal-world example of downwash-aware planning for quadro-tor swarms.

Our approach creates plans where robots can safely fly inclose proximity to each other. We create the trajectories usingthree stages: roadmap generation with inter-robot conflictannotation, discrete planning, and continuous optimization.The roadmap generation stage creates a sparse roadmap fora single robot and can use any existing algorithm for thatpurpose. We annotate the roadmap with generalized conflicts,describing possible inter-robot dependencies which might beviolated if two robots are in close proximity to each other.We can then formulate a MAPF/C instance and either solve


it optimally with respect to makespan using an ILP-basedformulation, or solve it with bounded suboptimality using asearch-based method. The output is a discrete schedule that,if executed, would require quadrotors to stop frequently. Thecontinuous optimization finds smooth trajectories for eachrobot and is decoupled, allowing easy parallelization andimproving performance for large teams.

Our approach can compute safe and arbitrarily smoothtrajectories for hundreds of quadrotors in dense continuousenvironments with obstacles in a few minutes. The trajectoryplan outputs have been tested and executed safely in numeroustrials on a team of 32 quadrotors, as well as in simulations ofup to 200 quadrotors.

In future work, we plan to apply our method to differentkinds of robots and heterogeneous robot teams. Moreover,we plan to investigate methods to improve the performancefurther, for example by generating roadmaps which are bettersuited for multi-robot path planning. Finally, we are interestedin investigating how our method can be adapted for operatingin an online setting.

REFERENCES

[1] J. A. Preiss, W. Honig, N. Ayanian, and G. S. Sukhatme, “Downwash-aware trajectory planning for large quadrotor teams,” in IEEE/RSJ Int.Conf. on Intelligent Robots and Systems (IROS), 2017, pp. 250–257.

[2] S. M. LaValle, Planning algorithms. Cambridge Univ. Press, 2006.[3] G. Wagner and H. Choset, “M*: A complete multirobot path planning

algorithm with performance bounds,” in IEEE/RSJ Int. Conf. on Intelli-gent Robots and Systems (IROS), 2011, pp. 3260–3267.

[4] G. Sharon, R. Stern, A. Felner, and N. R. Sturtevant, “Conflict-basedsearch for optimal multi-agent pathfinding,” Artificial Intelligence, vol.219, pp. 40–66, 2015.

[5] K. Solovey, O. Salzman, and D. Halperin, “Finding a needle in anexponential haystack: Discrete RRT for exploration of implicit roadmapsin multi-robot motion planning,” in Algorithmic Foundations of RoboticsXI. Springer, 2015, pp. 591–607.

[6] W. Honig, T. K. S. Kumar, H. Ma, S. Koenig, and N. Ayanian,“Formation change for robot groups in occluded environments,” inIEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2016,pp. 4836–4842.

[7] F. Augugliaro, A. P. Schoellig, and R. D’Andrea, “Generation ofcollision-free trajectories for a quadrocopter fleet: A sequential convexprogramming approach,” in IEEE/RSJ Int. Conf. on Intelligent Robotsand Systems (IROS), 2012, pp. 1917–1922.

[8] D. Mellinger, A. Kushleyev, and V. Kumar, “Mixed-integer quadraticprogram trajectory generation for heterogeneous quadrotor teams,” inIEEE Int. Conf. on Robotics and Automation (ICRA), 2012, pp. 477–483.

[9] D. Morgan, S. Chung, and F. Y. Hadaegh, “Model predictive control ofswarms of spacecraft using sequential convex programming,” Journal ofGuidance, Control, and Dynamics, vol. 37, no. 6, pp. 1725–1740, 2014.

[10] Y. Chen, M. Cutler, and J. P. How, “Decoupled multiagent path planningvia incremental sequential convex programming,” in IEEE Int. Conf. onRobotics and Automation (ICRA), 2015, pp. 5954–5961.

[11] M. Turpin, K. Mohta, N. Michael, and V. Kumar, “Goal assignmentand trajectory planning for large teams of aerial robots,” in Robotics:Science and Systems (RSS), 2013.

[12] J. Peng and S. Akella, “Coordinating multiple robots with kinodynamicconstraints along specified paths,” Int. Journal of Robotics Research(IJRR), vol. 24, no. 4, pp. 295–310, 2005.

[13] D. Bareiss and J. van den Berg, “Reciprocal collision avoidance forrobots with linear dynamics using LQR-obstacles,” in IEEE Int. Conf.on Robotics and Automation (ICRA), 2013, pp. 3847–3853.

[14] J. Alonso-Mora, A. Breitenmoser, M. Rufli, P. Beardsley, and R. Sieg-wart, “Optimal reciprocal collision avoidance for multiple non-holonomic robots,” in Distributed Autonomous Robotic Systems (DARS),2013, pp. 203–216.

[15] S. Tang and V. Kumar, “Safe and complete trajectory generation forrobot teams with higher-order dynamics,” in IEEE/RSJ Int. Conf. onIntelligent Robots and Systems (IROS), 2016, pp. 1894–1901.

[16] M. E. Flores, “Real-time trajectory generation for constrained nonlineardynamical systems using non-uniform rational b-spline basis functions,”Ph.D. dissertation, California Institute of Technology, 2007.

[17] D. Mellinger and V. Kumar, “Minimum snap trajectory generation andcontrol for quadrotors,” in IEEE Int. Conf. on Robotics and Automation(ICRA), 2011, pp. 2520–2525.

[18] D. Yeo, E. Shrestha, D. A. Paley, and E. Atkins, “An empirical modelof rotorcraft UAV downwash model for disturbance localization andavoidance,” in AIAA Atmospheric Flight Mechanics Conference, 2015.

[19] N. Michael, D. Mellinger, Q. Lindsey, and V. Kumar, “The GRASPmultiple micro-UAV testbed,” IEEE Robotics & Automation Magazine,vol. 17, no. 3, pp. 56–65, 2010.

[20] A. Dobson and K. E. Bekris, “Sparse roadmap spanners for asymptoti-cally near-optimal motion planning,” Int. Journal of Robotics Research(IJRR), vol. 33, no. 1, pp. 18–47, 2014.

[21] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimalmotion planning,” Int. Journal of Robotics Research (IJRR), vol. 30,no. 7, pp. 846–894, 2011.

[22] J. Pan, S. Chitta, and D. Manocha, “FCL: A general purpose library forcollision and proximity queries,” in IEEE Int. Conf. on Robotics andAutomation (ICRA), 2012, pp. 3859–3866.

[23] J. Yu and S. M. LaValle, “Multi-agent path planning and network flow,”in Workshop on the Algorithmic Foundations of Robotics (WAFR), 2012,pp. 157–173.

[24] ——, “Structure and intractability of optimal multi-robot path planningon graphs,” in AAAI Conference on Artificial Intelligence, 2013.

[25] ——, “Planning optimal paths for multiple robots on graphs,” in IEEEInt. Conf. on Robotics and Automation (ICRA), 2013, pp. 3612–3617.

[26] H. Ma and S. Koenig, “Optimal target assignment and path finding forteams of agents,” in Int. Conf. on Autonomous Agents & MultiagentSystems (AAMAS), 2016, pp. 1144–1152.

[27] G. Sharon, R. Stern, A. Felner, and N. R. Sturtevant, “Conflict-basedsearch for optimal multi-agent path finding,” in AAAI Conference onArtificial Intelligence, 2012.

[28] M. Barer, G. Sharon, R. Stern, and A. Felner, “Suboptimal variantsof the conflict-based search algorithm for the multi-agent pathfindingproblem,” in Symposium on Combinatorial Search (SOCS), 2014.

[29] R. E. Burkard, M. Dell’Amico, and S. Martello, Assignment Problems.Society for Industrial and Applied Mathematics, 2009.

[30] S. R. Gunn, “Support vector machines for classification and regression,”University of Southhampton, Tech. Rep., 1998.

[31] K. I. Joy, “Bernstein polynomials,” On-Line Geometric Modeling Notes,vol. 13, 2000.

[32] C. Richter, A. Bry, and N. Roy, “Polynomial trajectory planning forquadrotor flight,” in IEEE Int. Conf. on Robotics and Automation (ICRA),2013, pp. 649–666.

[33] P. T. Boggs and J. W. Tolle, “Sequential quadratic programming,” Actanumerica, vol. 4, pp. 1–51, 1995.

[34] R. H. Byrd, J. C. Gilbert, and J. Nocedal, “A trust region method basedon interior point techniques for nonlinear programming,” MathematicalProgramming, vol. 89, no. 1, pp. 149–185, 2000.

[35] C. Richter, A. Bry, and N. Roy, “Polynomial trajectory planning foraggressive quadrotor flight in dense indoor environments,” in Int. Sym-posium of Robotic Research (ISRR), 2013, pp. 649–666.

[36] I. A. Sucan, M. Moll, and L. E. Kavraki, “The Open Motion PlanningLibrary,” IEEE Robotics & Automation Magazine, vol. 19, no. 4, pp. 72–82, December 2012, software available at http://ompl.kavrakilab.org.

[37] A. Hornung, K. M. Wurm, M. Bennewitz, C. Stachniss, and W. Bur-gard, “OctoMap: An efficient probabilistic 3D mapping frameworkbased on octrees,” Autonomous Robots, 2013, software available athttp://octomap.github.com.

[38] J. Mattingley and S. Boyd, “CVXGEN: a code generator for embeddedconvex optimization,” Optimization and Engineering, vol. 13, no. 1, pp.1–27, 2012.

[39] J. A. Preiss*, W. Honig*, G. S. Sukhatme, and N. Ayanian,“Crazyswarm: A large nano-quadcopter swarm,” in IEEE Int. Conf.on Robotics and Automation (ICRA), 2017, pp. 3299–3304, softwareavailable at https://github.com/USC-ACTLab/crazyswarm.


Wolfgang Honig is a Ph.D. student in the ACT Labat the University of Southern California. He holdsa Diploma in Computer Science from the TechnicalUniversity Dresden, Germany and an M.S. in Com-puter Science (Intelligent Robotics) from USC. Hisresearch focuses on enabling large teams of physicalrobots to collaboratively solve real-world tasks bycombining methods from artificial intelligence androbotics.

James A. Preiss received the B.S. in Mathematicsand B.A. in Photography from The Evergreen StateCollege (2010). He is currently pursuing a Ph.D.in Computer Science at the University of SouthernCalifornia in Los Angeles. His research interestsinclude trajectory optimization, machine learningfor robot control, and multi-robot systems with anemphasis on aerial robots.

T. K. Satish Kumar leads the Collaboratory forAlgorithmic Techniques and Artificial Intelligenceat USC’s Information Sciences Institute. He re-ceived his PhD in Computer Science from StanfordUniversity in March 2005. His research interestsinclude constraint reasoning, probabilistic reasoning,planning and scheduling, robotics, knowledge repre-sentation, model-based reasoning, heuristic search,algorithms and complexity, and approximation andrandomization techniques.

Gaurav S. Sukhatme is Professor of ComputerScience and Electrical Engineering and Gordon S.Marshall Chair in Engineering at the Viterbi Schoolof Engineering at the University of Southern Califor-nia (USC). He received his undergraduate educationat IIT Bombay in Computer Science and Engi-neering, and M.S. and Ph.D. degrees in ComputerScience from USC. He is the co-director of the USCRobotics Research Laboratory and the director ofthe USC Robotic Embedded Systems Laborat orywhich he founded in 2000. His research interests

are in robot networks with applications to environmental monitoring. Hehas published extensively in these and related areas. Sukhatme has servedas PI on numerous NSF, DARPA and NASA grants. He was a Co-PI onthe Center for Embedded Networked Sensing (CENS), an NSF Science andTechnology Center. He is a fellow of the IEEE and a recipient of the NSFCAREER award and the Okawa foundation research award. He is one of thefounders of the Robotics: Science and Systems conference. He was programchair of the 2008 IEEE International Conference on Robotics and Automationand the 2011 IEEE/RSJ International Conference on Intelligent Robots andSystems. He is the Editor-in-Chief of Autonomous Robots and has served asAssociate Editor of the IEEE Transactions on Robotics and Automation, theIEEE Transactions on Mobile Computing, and on the editorial board of IEEEPervasive Computing.

Nora Ayanian received the M.S. and Ph.D. degreesin Mechanical Engineering and Applied Mechanicsfrom the University of Pennsylvania, in Philadelphia,PA, USA, in 2008 and 2011, respectively.

Since 2013, she has been with the Universityof Southern California, Los Angeles, CA, USA asan Assistant Professor of Computer Science andAndrew and Erna Viterbi Early Career Chair (2017-present) and as a WiSE Gabilan Chair (2013-2017).From 2011-2013, she was a Postdoctoral Associatein the Computer Science and Artificial Intelligence

Laboratory at the Massachusetts Institute of Technology. Her research interestsinclude coordination and control for multi-robot systems that can be specifiedwith high-level inputs, with broad applications.

Prof. Ayanian is also a member of the American Society of MechanicalEngineers and the Association for Computing Machinery. She is a co-chairand co-founder of the IEEE Robotics and Automation Society Technical Com-mittee on Multi-Robot Systems, and is Junior Chair of the 2017 InternationalSymposium on Multi-Robot and Multi-Agent Systems (MRS). She serves onthe editorial board of the Swarm Intelligence Journal. She is the recipientof a 2016 National Science Foundation CAREER Award and 2016 OkawaFoundation Research Award, and was named to the 2016 MIT TechnologyReview 35 Innovators Under 35 (TR-35) and 2013 IEEE Intelligent Systems’“AI’s 10 to Watch”. She received the 2016 Outstanding Paper in Robotics atthe International Conference on Automated Planning and Scheduling (ICAPS),and 2008 Best Student Paper at the IEEE International Conference on Roboticsand Automation (ICRA).

ieee transactions on robotics 1 trajectory planning for

Documents