sneak attack against mobile robotic networksfor sneaking. simulation results are shown in section...

16
1 Sneak Attack against Mobile Robotic Networks Yushan Li , Student Member, IEEE, Jianping He , Senior Member, IEEE, Xuda Ding , Student Member, IEEE Lin Cai , Fellow, IEEE and Xinping Guan , Fellow, IEEE Abstract—The security of mobile robotic networks (MRNs) has been an active research topic in recent years. Prior works mainly focused on the cyberspace aspects, while the potential threats from the observable interaction process of MRNs received much less attention. In this paper, we demonstrate that for MRNs, the commonly adopted formation control, which enables robots to form a predefined geometric shape, can leak intrinsic interaction rules that underlie the mission execution of MRNs, putting MRNs at more significant dangers if the rules are further leveraged to design advanced attacks. Specifically, we investigate a sneak attack to replace a target robot by the attack robot, which has only partial observation over the MRN and no other explicit prior knowledge of system dynamics or access. The core insight is to learn the interaction rules via external observation and tentative excitation on robots, including how they interact with each other and with the environment. With learned rules, we design an Evaluate-Cut-Restore (ECR) attack strategy to steer the victim robot off the MRN and replace it by the attacker, and obtain optimal attack impacts. We also establish theoretical conditions to ensure a successful attack in terms of network connectivity and geometric pattern. Representative simulations illustrate the feasibility and effectiveness of the proposed attack, beckoning further research to address the new physical observation and excitation oriented attacks on generic networked systems. I. I NTRODUCTION Mobile robotic networks (MRNs) have received consider- able attention thanks to the mobility, flexibility, and distributed fashion, and are widely deployed in military and industrial applications, e.g., surveillance, reconnaissance, search and environmental monitoring [1]. For these missions, formation control is a fundamental technique to form and maintain a preset geometric shape. Many formation control methods have been proposed for various mission, including formation shape generation, trajectory tracking, reconfiguration and fault accommodation [2], [3], [4], [5], [6]. It has been reported that MRNs under formation control may suffer severe security attacks from both the cyber and physical spaces [7], [8]. In the literature, the security of MRNs mainly focused on the cyberspace aspects, while the potential threats from external observation over the interaction process of MRNs were less noticed. In fact, the state trajectory of the shape-forming and maintaining process essentially reveals the formation control rules to a large extent [9], placing MRNs at high risks. The physical space risks come from two aspects. First, the shape-forming process reflects the internal : The Dept. of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Edu- cation of China, Shanghai, China. E-mail address: {yushan li, jphe, dingxuda, xpguan}@sjtu.edu.cn. : The Dept. of Electrical and Computer Engineering, University of Victoria, BC, Canada. Email address: [email protected]. interaction structure among cooperating robots, which affects the convergence speed and stability of reaching the desired geometric pattern [10]. Second, the collision/obstacle avoid- ance process is driven by the external interaction mechanism with the environment, which is universally needed for robots to operate in practical applications [11]. Unfortunately, the necessary condition to launch advanced attacks based on external observation and their impacts on MRNs are still open issues and need further investigation. This paper is thus motivated to fill the gap and design a sneak attack against MRNs under formation control, where the external attacker is enabled to learn and leverage the interaction rules to replace a victim robot in the MRN. To practice, the attack robot only has partial observation (or sensing) over the formation and limited moving ability. The problem is of both theoretical and practical significance, either for invading an MRN or providing beneficial insights to the countermeasure design for MRNs. Challenges and contributions. The major challenges of achieving the attack lie in three aspects. First, the attacker is model agnostic with respect to both the internal and external interaction rules, making it hard to split the influences of different interaction rules. Second, the attacker is constrained with partial observation, imposing a harsh obstruction to discriminate whether the current observed state evolution is merely governed the observable part, the unobservable part or both of them. Third, the strong correlation of the former two factors further complicates the target robot evaluation and the feasibility of sneak, bringing strict conditions to guarantee the optimal attack impact. The main contributions of this paper are summarized as follows. We contribute to the existing body of research by under- standing what interaction rules in the formation control can be revealed from external observation over the MRN, without imposing any prior knowledge about system dynamics or access of the MRN. By exploring the learnability of the internal interaction structure within the MRN and the external interaction mechanism with the environment, we design an ECR attack strategy, which evaluates the possible target robots in terms of network interaction influence and to replace the selected victim robot with maximum attack effects. We formulate the universal interaction characteristics within MRNs. By defining indirect controllability, we prove that the conditions of a successful sneak attack lie in appropriate connectivity and geometry pattern of the MRN. Performance study by simulation demonstrates the effectiveness of the proposed approach.

Upload: others

Post on 04-Feb-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

  • 1

    Sneak Attack against Mobile Robotic NetworksYushan Li†, Student Member, IEEE, Jianping He†, Senior Member, IEEE, Xuda Ding†, Student Member, IEEE

    Lin Cai‡, Fellow, IEEE and Xinping Guan†, Fellow, IEEE

    Abstract—The security of mobile robotic networks (MRNs) hasbeen an active research topic in recent years. Prior works mainlyfocused on the cyberspace aspects, while the potential threatsfrom the observable interaction process of MRNs received muchless attention. In this paper, we demonstrate that for MRNs, thecommonly adopted formation control, which enables robots toform a predefined geometric shape, can leak intrinsic interactionrules that underlie the mission execution of MRNs, putting MRNsat more significant dangers if the rules are further leveragedto design advanced attacks. Specifically, we investigate a sneakattack to replace a target robot by the attack robot, which hasonly partial observation over the MRN and no other explicit priorknowledge of system dynamics or access. The core insight is tolearn the interaction rules via external observation and tentativeexcitation on robots, including how they interact with each otherand with the environment. With learned rules, we design anEvaluate-Cut-Restore (ECR) attack strategy to steer the victimrobot off the MRN and replace it by the attacker, and obtainoptimal attack impacts. We also establish theoretical conditionsto ensure a successful attack in terms of network connectivityand geometric pattern. Representative simulations illustrate thefeasibility and effectiveness of the proposed attack, beckoningfurther research to address the new physical observation andexcitation oriented attacks on generic networked systems.

    I. INTRODUCTION

    Mobile robotic networks (MRNs) have received consider-able attention thanks to the mobility, flexibility, and distributedfashion, and are widely deployed in military and industrialapplications, e.g., surveillance, reconnaissance, search andenvironmental monitoring [1]. For these missions, formationcontrol is a fundamental technique to form and maintaina preset geometric shape. Many formation control methodshave been proposed for various mission, including formationshape generation, trajectory tracking, reconfiguration and faultaccommodation [2], [3], [4], [5], [6].

    It has been reported that MRNs under formation controlmay suffer severe security attacks from both the cyber andphysical spaces [7], [8]. In the literature, the security of MRNsmainly focused on the cyberspace aspects, while the potentialthreats from external observation over the interaction processof MRNs were less noticed. In fact, the state trajectory ofthe shape-forming and maintaining process essentially revealsthe formation control rules to a large extent [9], placingMRNs at high risks. The physical space risks come from twoaspects. First, the shape-forming process reflects the internal

    †: The Dept. of Automation, Shanghai Jiao Tong University, and KeyLaboratory of System Control and Information Processing, Ministry of Edu-cation of China, Shanghai, China. E-mail address: {yushan li, jphe, dingxuda,xpguan}@sjtu.edu.cn.‡: The Dept. of Electrical and Computer Engineering, University of

    Victoria, BC, Canada. Email address: [email protected].

    interaction structure among cooperating robots, which affectsthe convergence speed and stability of reaching the desiredgeometric pattern [10]. Second, the collision/obstacle avoid-ance process is driven by the external interaction mechanismwith the environment, which is universally needed for robotsto operate in practical applications [11]. Unfortunately, thenecessary condition to launch advanced attacks based onexternal observation and their impacts on MRNs are still openissues and need further investigation.

    This paper is thus motivated to fill the gap and design asneak attack against MRNs under formation control, wherethe external attacker is enabled to learn and leverage theinteraction rules to replace a victim robot in the MRN. Topractice, the attack robot only has partial observation (orsensing) over the formation and limited moving ability. Theproblem is of both theoretical and practical significance, eitherfor invading an MRN or providing beneficial insights to thecountermeasure design for MRNs.

    Challenges and contributions. The major challenges ofachieving the attack lie in three aspects. First, the attacker ismodel agnostic with respect to both the internal and externalinteraction rules, making it hard to split the influences ofdifferent interaction rules. Second, the attacker is constrainedwith partial observation, imposing a harsh obstruction todiscriminate whether the current observed state evolution ismerely governed the observable part, the unobservable part orboth of them. Third, the strong correlation of the former twofactors further complicates the target robot evaluation and thefeasibility of sneak, bringing strict conditions to guarantee theoptimal attack impact. The main contributions of this paperare summarized as follows.

    • We contribute to the existing body of research by under-standing what interaction rules in the formation controlcan be revealed from external observation over the MRN,without imposing any prior knowledge about systemdynamics or access of the MRN.

    • By exploring the learnability of the internal interactionstructure within the MRN and the external interactionmechanism with the environment, we design an ECRattack strategy, which evaluates the possible target robotsin terms of network interaction influence and to replacethe selected victim robot with maximum attack effects.

    • We formulate the universal interaction characteristicswithin MRNs. By defining indirect controllability, weprove that the conditions of a successful sneak attack liein appropriate connectivity and geometry pattern of theMRN. Performance study by simulation demonstrates theeffectiveness of the proposed approach.

  • 2

    In a larger sense, this paper essentially provides deeper in-sights into the trade-off between the system security and stateobservability of a class of networked systems, by enlighteningthe possibility of revealing the interaction rules that support thenetworked systems, e.g., opinion dynamics in social networks,time synchronization in wireless sensor networks. In turn,it also promotes to analyze the security vulnerabilities anddesign defense strategies for the networked systems, from theperspective of securing the interaction process.

    Organization and Notation. The remainder of this paperis organized as follows. Section II presents related literatureconcerning this paper. Section III gives the modeling for theinteraction process of formation control and formulates theproblem of interest. Section IV studies how to reveal theinternal interaction structure. Section V develops how to learnthe external obstacle mechanism and design control strategyfor sneaking. Simulation results are shown in Section VI,followed by concluding remarks and further research issuesin Section VII.

    Throughout the paper, we use Euclid font to denote setvariable. A\B represents the elements in A that are not inB. The superscript k denotes the time instants of discreteform, and the subscripts i, j denote the indexes of agents. Thesuperscripts [i, :], [:, i] of a matrix denote its i-th row vectorand j-th column vector, respectively. The scripts ·̃ and ·̂ rightabove a variable indicate its corresponding observation andestimation value, respectively. The basic notation definitionsare given in Table I.

    II. RELATED WORKFormation control in MRNs. The fundamental rules for for-

    mation control were first introduced by the famous Reynolds’Rules [12]: separation, alignment, and cohesion. Based on therules, numerous methods have been proposed to achieve thedesired performance, and consensus-based algorithms becomethe mainstream, e.g., [13], [14], [15], [16]. The key idea ofconsensus-based algorithms is that the formation is modeledas a graph, and every robot implements information interaction(like positions and velocities) with their neighbors and com-pute their control inputs. Therefore, the interaction structurelays critical support for effective formation control, and ismostly based on network communication. In recent years,communication-free formation control [17], [18], [19] was alsodeveloped and attracts great research interests, thanks to theemergent development of sensor technology. Communication-free interaction avoids information delays and network band-width consumption, and even enables stealth modes of oper-ation [20]. For instance, formation control with only bearingmeasurements by vision sensors was investigated in [21]. TheReynolds’ Rules lay a solid foundation for formation control,but they do not take into account some unanticipated faults(e.g., interaction failure, formation splitting), which are easilyincurred when performing tasks in the real environment. Manyadvanced interaction rules and techniques were developed tobetter handle those accidental and unpremeditated faults [22],[23], [24]. For instance, a consensus-based splitting/mergingalgorithm was designed to tackle complex obstacle distribu-tions in [25]. A topology recovery method was designed to

    reconnect a broken link based on historical data in [26]. [27],[28] proposed a gradient-based self-repairing algorithm torestore the topology of formation and further synchronize theformation motion. The motion synchronization methods wereinvestigated to form desired/predefined formation configura-tion in the presence of robot malfunctions [29] or undesirableformation variation [30]. [31] gave a comprehensive survey formore types of fault-tolerant design about formation control.Note that by either way, the common characteristic of themethods is that the interaction capability is limited by distancedue to the energy constraint in real-world applications, and thissimple yet crucial point is utilized in our attack design.

    The security of MRNs. Effective formation control in MRNsis supported by physical sense and network interaction, whichconstitute the feedback loops for control and also incur secu-rity vulnerabilities, and an attack on these components maycause disastrous consequences. MRN under formation controla typical cyber-physical-system (CPS), while the existing workmainly focused on the defense design to cyber attacks, e.g.,DoS, false data injection, and replay attacks [32], [33], [34],[35]. The attacker is generally assumed with strong priorknowledge about the MRN (like knowing the system structureor node states [36], corrupting exchanged information [37]),and formulated with powerful attack model. Recently, a fewworks emerged which are dedicated to attacks against for-mation from physical space. For instance, a noise-generatingattack was proposed in [38] to alter gyroscopic sensor data,leading to drone crashes. [39], [40] proposed spoofing attack todisturb the GPS sensor readings, gradually causing trajectorydeviations. These attacks are against specific transducer byutilizing its sensing mechanism, and hard to be generalizedto other scenarios. [41] developed a trial-and-learning basedmethod to infer the obstacle-avoidance mechanism of mobilerobots by disguising the attack robot as an obstacle, whichrequires little prior knowledge about mobile robots.

    In summary, concerning what attack capability is feasiblefor an “ignorant” attacker from external observation and whatis the upper limit of the attack impacts on MRNs, it stillremains an open yet critical issue that motivates this work.

    III. PRELIMINARIES AND PROBLEM FORMULATION

    A. Graph Basics

    Let G = (V, E) be a directed graph that models a MRN,where V = {1, · · · , N} is the finite set of nodes (i.e., robots)and E ⊆ V×V is the set of interaction edges. An edge (i, j) ∈E indicates that i will use information from j. The adjacencymatrix A = [aij ]N×N of G is defined such that aij > 0 if(i, j) exists, and aij = 0 otherwise. Denote N ini = {j ∈ V :aij > 0} and N outi = {j ∈ V : aji > 0} as the in-neighborand out-neighbor sets of i, respectively. A directed path isa sequence of nodes {1, 2, · · · , j} such that (i+ 1, i) ∈ E ,i = 1, 2, · · · , j−1. A directed graph has a (directed) spanningtree if there exists at least a node having a directed path toall other nodes. G must have a spanning tree to guarantee theinformation flow in robot formation.

    Define L = diag{A1} − A as the Laplacian matrix of G,where 1 is a vector of all ones. Then we have L1 = 0. Let

  • 3

    TABLE INOTATION DEFINITIONS

    Symbol Definition

    zi the state of robot i (shortened to ri in the text)za the state of attack robot (shortened to ra in the text)zv the state of victim robot (shortened to rv in the text)g the external interaction mechanism with the environmentW the internal interaction matrix among robotsRω the radius indicating different interaction range, where

    ω = c, o, s, f represent the interaction, obstacle avoid-ance, safety, and observation radius, respectively.

    Dω the dataset collected in different stages, where ω =c, s, e, a represent the convergence, stabilization,excitation and attack processes, respectively.

    Vpi the ri-vertexed convex polygon covering Nouti

    Nωi the in/out-neighbor set (ω = in/out) of robot iVF the formation under partial observationΩ the rule set of formation interaction

    L = ΓΛΓ−1 be the Jordan decomposition of L, where Γ =[q1 · · · qN ] is the transformation matrix, Γ−1 = [p1 · · · pN ], andΛ = diag{λi} with λi being the Jordan block of eigenvalueλi. For ease of discussion and notation, we suppose that alleigenvalues of L are ordered as |λ1| ≤ |λ2| ≤ · · · ≤ |λN |. Theeigenvalues and eigenvectors satisfy

    (λiI − L)qi = 0, pTi (λiI − L) = 0, pTi qj=0 (i 6=j). (1)

    Specially, we take q1 = 1, and p1 = [p11 · · · p1N ]T, and all theright and left eigenvector are normalized such that pTi qi = 1.

    B. Distance Constraints and Shape Specification

    In terms of interaction range, each robot has an interactionradius Rc for interacting with its neighbors, and obstacle-avoidance radius Ro for detecting external obstacles, and asafe radius Rs for indicating possible collision danger. It isreasonable to assume that the radii satisfy

    Rc > Ro > Rs. (2)

    Let a0ij be the preset interaction weight of (i, j) initially,and zi be the position state of robot i (abbreviated to rihereafter). Based on the well-known disk model that considersthe boundedness of Rc [1], aij is assumed to satisfy

    aij = a0ij · sign(Rc − ‖zj − zi‖2), (3)

    where the indicative function sign(x) = 1 if x > 0, or0 otherwise. Accordingly, we define the R-disk of a givenposition z as

    P(z,R) = {zω : ‖zω − z‖2 < R} . (4)

    To describe the predefined geometric shape under formationcontrol, a group of parameters {hi}(i ∈ V) is introduced.Concretely, hi is the desired relative deviation between ri anda common reference point, which is usually the centroid or avertex of the geometric shape. Note that once the formationshape is specified, the choice of the reference point will makeno difference as hij = hj−hi remains unchanged. In practice,ri only needs to know {hij : j ∈ N ini }. For simplicity, wetake rN as the reference point.

    C. Interaction Modeling of Formation Control

    The following modeling is given to describe how to formthe preset shape through internal interaction, and deal with theenvironmental interferences through external interaction.

    Forming formation shape via internal interactions. Theformation V is claimed to reach the steady pattern if itspredefined shape is formed, i.e.,

    ‖(zj − hj)− (zi − hi)‖2 = 0,∀i, j ∈ V. (5)

    To achieve this pattern, a commonly used first-orderconsensus-based formation control algorithm and its globalform are given by [10]

    żi=∑j∈N ini

    aij(zj−zi−hij), ż(t) = −Lz(t) + Lh, (6)

    where hij = hj − hi, z = [z1, · · · , zN ]T, and h =[h1, · · · , hN ]T. Note that (6) applies to every dimension of theformation control due to their motion independence. Generally,to dynamically guide the formation motion, one robot willbe specified as the leader with an extra velocity input. Forsimplicity and without loss of generality, we take rN as theleader and suppose that it runs in a constant velocity c. Denoteu0 = [0, · · · , 0, c]T and u = Lh+ u0. Then based on (6), thenew global dynamics and its solution are obtain by

    ż(t) = −Lz(t) + u. (7)

    Accordingly, the solution of (7) is given by

    z(t) = e−Ltz(0) +

    ∫ t0

    e−L(t−τ)udτ. (8)

    Here L is regarded as the internal interaction structure anddetermines the stability and convergence of formation pattern.(8) guarantees zero deviation for static formation and boundedtracking error for dynamic formation. To eliminate the trackingerror, solutions contain imposing a virtual leader to all robots(which is impractical when the scale of the MRN is large), oradopting a second-order controller which further utilizes thevelocities of neighbors. Note that the global form of second-order controller is the same as that of first-order. The majordifference lies in that zi is augmented with velocity state, andconsequently the internal interaction structure is related to boththe positions and velocities of neighbors.

    Remark 1. Essentially, the use of velocity states by second-order controller is to approximate the real leader speed, thisprocess can be transfered to a nonlinear first-order modelż(t) = −Lz(t)+Lh+us(t), where us(t) is determined by thesecond-order controller design and satisfies lim

    t→∞us(t) = c1

    [42]. In Section VI, we will illustrate that our linear ap-proximation method also applies to this nonlinear model ifonly using first-order states. Further, using second-order linearapproximation is also directly available, which works the sameway.

    Obstacle/collision avoidance via external interaction inter-face. The obstacle-avoidance mechanism is the major interfacefor MRNs to interact with physical environment. Let zi∗ bethe desired goal of ri, zob and vob be the state and velocity of

  • 4

    the obstacle, respectively. The obstacle-avoidance behavior ismainly determined by the relative positions between a robotand the obstacles, and their velocities. Therefore, we formulatethe mechanism as

    żi = g(zob − zi, zi∗ − zi, vob, vi). (9)

    Note that in multi-robot cases, zi∗ usually represents asynthetic and time-varying position determined by the in-neighbors of ri. However, zi∗ − zi is negligible when‖za − zi‖2 → Rs, so the effect of the obstacle is dominant.Furthermore, typical obstacle-avoidance mechanisms do notconsider the mobility of obstacles, and thus the last parameterin g(·) may not be used, e.g., in the artificial potential methodand genetic approach. Accordingly, we suppose that g(·)satisfies

    g(·) = 0, if ‖zob − zi‖2 > Ro,0 ≤ g(·) ≤ b, if Rs ≤ ‖zob − zi‖2 ≤ Ro,g(·) = b, if ‖zob − zi‖2 < Rs.

    (10)

    Formation maintenance. To deal with possible faults andhazardous interference in the environment, many methods havebeen developed to support autonomy and self-organizationfor interaction maintenance, e.g., restoring the interactiontopology of splitting robot groups [28], [26]. In line with theseconsiderations, we extract the main characteristics behindthese methods and formulate them as the following rulesetΩ = {ΩR,ΩA}.• Neighbor-connection restoration rule ΩR. For ri, ifaij(t) 6= a0ij , ∀j ∈ N ini at time t, then ri will try tomove back to its desired position zi∗ and reconnect withrj , ∀j ∈ N ini . The necessary condition of successfulrestoration is that zi∗ is available.

    • Neighbor-replacement authentication rule ΩA. Givenra /∈ V , i ∈ V and a time slot tl, j ∈ N ini can bereplaced by ra at (t0 + tl) if and only if ∀t∈ [t0, t0 + tl),

    aij(t) 6= a0ij , za(t) ∈ P(zi(t), Rc), (11)‖za(t)− zj∗(t)‖2Rc, Rh = Rf −Rc. (16)

    From (16), once VF is determined, VH is also determined byRh, which is shown in Fig. 1.

  • 5

    E. Problem of Interest

    A MRN G = (V, E) of N robots is deployed to executemission in physical environment, while running to a goalposition zg with specified formation shape configuration {hij}.A malicious attack robot (denoted as ra) who has limited ob-servation ability over VF ⊆ V and bounded moving capability,aims to sneak into V by replacing a victim robot (denoted asrv).

    To achieve above purpose, ra needs to explore the forma-tion interaction knowledge first. Note that what is directlyobservable to ra is the formation shape convergence andmaintenance processes, which reflect the internal interactionstructure in the MRN. To learn the external interaction rule,ra needs to actively make excitation on the robot and observeits response. With the interaction rules obtained, ra begins thesneak attack. Accordingly, we divide the whole process as fourstages: formation shape convergence (observe), steady for-mation maintenance (observe), tentatively excitation(explore),and attack launching (sneak). The observations on the fourstages are constructed as four datasets: Dc, Ds, De and Da.Finally, we solve the following four problems sequentially,which constitute the whole work of this paper, as shown inFig. 2.• Stage 1: given Ds, infer the steady pattern parameters ĥ,ĉ.

    • Stage 2: given Dc, ĥ and ĉ, find the mapping relation φthat embodies W by solving

    minφ(Dc) 7→W

    ‖W − Ŵ‖Frob. (17)

    • Stage 3: given De, find the mapping relation ϕ thatembodies g by solving

    minϕ(De)7→g

    ‖g − ĝ‖2. (18)

    • Stage 4: given Ŵ and ĝ, design a control strategyua,0:H = {u0a, u1a, · · · , uHa } such that

    V(tH) = (V(t0)\{rv}) ∪ {ra}. (19)

    By solving above four problems sequentially, the attacker isenabled to learn the interaction rules and leverage them toimplement sneak attack.

    IV. REVEALING THE INTERACTION RULES

    In this section, we first analyze the feasibility of reveal-ing the interaction rules of a MRN, and then highlight themethod to reveal the internal and external interaction rules,respectively. The obtained results are leveraged to constitutethe cornerstone for the sneaking attack.

    A. Feasibility Analysis

    To reveal the interaction rules from the observations, thereare two foremost issues that need to be addressed. First,under what condition the observations are meaningful andreflect the interaction rules. Second, how large the numberthe of observations is sufficient to approximate the interactionrules with acceptable errors. If these issues are settled, the

    problem is then converted to parameter identification of a givensystem model or model fitting from a statistic perspective, anddifferent methods are adopted to approximate the two kindsof interaction rules, as shown in Fig. 3. The details are givenbelow.

    For the internal interaction rule, what matters most to usis the structure among the formation robots. Therefore, theclassical linear state space equation is sufficient to approximatethe internal interaction rule, where the structure is explicitlyrepresented as the state transition matrix. Under this model, thewidely used least square method (LSM) provides an efficientway to obtain the optimal parameter estimation in the senseof minimizing the observation errors [46]. Note that arbitrarytwo consecutive state variation of the formation in Stage 2 isidentical, thus the observations in this stage are not directlyused for approximate W . We will demonstrate in the followingsections that, the structure correctness of the approximatedW is robust under different noise variances and interactionmodels, and the interaction weight accuracy can also be furtheroptimized.

    For the external interaction rule, our major focus is toexploit certain knowledge to describe the impact of an ob-stacle with different position and velocity on a formationrobot, not necessarily seeking for the explicit analytical ex-pressions, especially considering the vast kinds of obstacle-avoidance mechanisms. Therefore, we adopt generic learning-based method to approximation the external interaction rule,e.g., SVR, which has good performance on nonlinear approx-imation and strong generalization ability when the amount ofdata is not vast [47]. Note that in this stage, the input-outputpairs with same relative positions and velocities between therobot and the obstacle are identical, only providing redundantinformation for the learning process.

    B. Steady Pattern Identification

    The steady pattern of the MRN reflects the formation shapeand moving speed of the MRN.

    Assume that ra begins its observation as the MRN beginsrunning. When the formation pattern is reached, its statedynamics are illustrated by the following result.

    Lemma 1. For the global dynamic model (8) with u = Lh+u0 = Lh+ [0 · · · 0 c]T, define

    zss = ct·1 + s, (20)

    where the offset vector s satisfies s = q1pT1 z(0)+(I−q1pT1 )h+N∑i=2

    cpiNλi

    qi. Then, we have

    limt→∞

    ‖z(t)− zss‖2 = 0. (21)

    Lemma 1 illustrates that, when the formation reaches thesteady pattern, all robots are running at a common speedwith fixed relative state deviations. This provides a theoreticalfoundation to identify the steady pattern parameters. Beforethat, we first denote ∆zki = z

    ki −z

    k−1i , d

    kij = ‖zkj − zki ‖2, and

    define the l-period as a time period [k, k+ l], and the real timepoint of reaching the steady pattern as ks. The second-order

  • 6

    Fig. 2. The whole scheme of our proposed sneaking attack. First, at Stage 1 and 2, ra leverages Dc and Ds to infer the steady pattern parameters and W .At Stage 3, ra makes tentative excitation on rv and and regresses g based on the observed response dataset De. Meanwhile, the interaction radius Rc isestimated and used as feedback to further optimize W . Finally, at Stage 4, ra begins its sneak with the support of the learned rules. And the response datasetDs can also be used to expand the data for regressing g.

    Fig. 3. Illustration of the approximation of the interaction rules. The formationstate evolution is governed by the interaction rules, and the observed state atinstant k is a resulting output of the interaction rules taking the observed stateat instant k − 1 as input (possibly with preprocessing).

    state difference accumulation of ri during l-period (l > 1) isgiven by

    ∆Sk0:k0+li =

    k0+l−1∑k=k0+1

    ‖∆zk+1i −∆zki ‖2. (22)

    Then, based on the observations, the steady velocity is ob-tained by solving

    ĉ(k0, l) = arg minc

    k0+l∑k=k0

    ∥∥z̃kF − (cεT k + b0)1∥∥2, (23)where b0 is an auxiliary constant. Note that (23) can be easilyby generic least square method (LSM). Next, we present thefollowing theorem to show how to identify the formationleader and the performance of estimated ĉ.

    Theorem 1. Given VF and ks, if the leader rN ∈ VF , then itis identified by

    r̂N = arg mini

    {∆S2:ksi : ∆S

    2:ksi < �, i ∈ VF

    }, (24)

    where � is an arbitrarily small positive constant. And ∀k0 >ks, we have lim

    l→∞ĉ(k0, l) = c.

    Theorem 1 illustrates how to identify the formation leaderwhen there are no observation noises, and how to obtain thestable velocity if the observation noises are involved. Note thatsmaller ∆Sk0:k0+li means less velocity variation in ri. Based

    on it, given preset l and �, the following criterion is designedto discriminate ks, denoted as k∗ here.

    k∗=inf{k0 :

    (∑i∈VF

    ∆S̃k0:k0+li

    )≤�}. (25)

    Once ra obtains k∗, the datasets Dc, Ds, and the formationshape parameters {hi} are respectively obtained by

    Dc = {z̃kF : k ≤ k∗}, Ds = {z̃kF : k > k∗}, ĥ = ŝ− ŝj1,(26)

    where z̃kF = [z̃ki1, z̃ki2 , · · · , z̃

    ki|VF |

    ]T (iω ∈ VF , ω =1, · · · , |VF |), and ŝ is calculated by

    ŝ =∑k∗+l

    k=k∗+1(z̃kF − ĉεTk ·1)/l.

    C. Internal Interaction Structure Approximation

    After identifying the steady pattern, we then try to extractthe interaction structure among VF .

    Denote the unobserved robot set as VF ′ = V\VF , and definethe indicative leader vector

    I[i]F =

    {1, if ∃i ∈ VF , i = r̂N ,0, otherwise.

    (27)

    Then, the global dynamics in (15) can be represented as[zk+1F

    zk+1F ′

    ]=

    [WFF WFF ′

    WF ′F WF ′F ′

    ][zkF

    zkF ′

    ]+εT

    [ukF

    ukF ′

    ], (28)

    where uF = hF + cIF , uF ′ = hF ′ + cIF ′ . Accordingly, theobservations of ra over VF is given by

    z̃k+1F = WFF z̃kF +WFF ′ z̃

    kF ′ + εT û

    kF + ξ

    kF , (29)

    From (29), it is straightforward to find that the states of thepartial formation are also influenced by the unobservable VF ′ .If we neglect z̃F ′ and try to seek a solution of WFF ′ , theresult will deviate a lot from the real one. However, a keyinsight is that not every robot in VF will be influenced byrobots in VF ′ . Recalling the feasible subset VH⊆VF specifiedby (16), let VH′ = VF\VH , WHF = [WHH WHH′ ]. Before

  • 7

    the following theorem of inferring partial topology, we defineykH = z̃

    kH − ĥH − εT ĉIH and ykF = [(z̃kH − ĥH)T, (z̃kH′)T]T.

    Theorem 2. Given Dc and k∗, if Dc is linearly modeled byra, then observations in Dc satisfy

    yk+1H = WHFykF , (30)

    and if |VF |+ 1 ≤ l ≤ k∗, then the optimal estimation ofφ(Dc) 7→W in sense of least square is

    φ(Dc) : ŴHF =(

    (YFYTF )−1YFY

    TH

    )T, (31)

    where the matrix YH = [y2H , y3H , · · · , ylH ] and YF =

    [y1F , y2F , · · · , yl−1F ].

    Theorem 2 gives the least square solution of WHF underlinear approximation model. Although the number of feasibleobservations is limited in practice, Theorem 2 neverthelesscan be used as the basis for approximating WHF , especiallyto obtain a robust connection structure in WHF even the realmodel is nonlinear and with different noise variance, whichwill be verified in Section VI.

    D. External Interaction Rule Approximation

    Next, we present the tentative excitation based method toillustrate how to infer the interaction radius Rc and constructdataset De for learning the external interaction rule g.

    Based on ŴHF , the desired goal position of ri in the nextmoment is calculated by

    ẑk+1i∗ =∑

    j∈VFâij(z̃

    kj − z̃ki − h

    [j]F + h

    [i]F ), (32)

    where âij = ŵij/εT (i 6= j). Then, we introduce the followingdefinition and prove the direct controllability of a robot byutilizing the response characteristic of g.

    Definition 1. (Direct controllability) A node is directly con-trollable if one can control it to reach any given state z∗c infinite steps by direct external excitations.

    Lemma 2. Given an external ra /∈ VF , ∀ri ∈ VF , if g isknown, and (zi∗ − zi), (za − zi) and vi are measurable, thenri is directly controllable by ra.

    Based on Lemma 1, ri can be directly influenced by radue to the interaction characteristic of g. Note that once arobot is off the interaction range of its in-neighbors, it becomesan outlier and does not move towards the desired position.Similar to [41], we utilize a tentative excitation on a robot andobserves its response, and obtain the obstacle detection radiusRo. Then, to infer the interaction radius Rc, the followingexcitation strategy is implemented such that ri under excitationis steered outside the interaction range of its neighbors, givenby

    {ue : za(ue) = αzi∗+(1−α)zi, Rs < ‖za(ue)− zi∗‖2 < Ro}.(33)

    Recalling yk+1H = WHFykF in Theorem 2 and given z̃

    ki , y

    k+1i

    can be predicted by ŷk+1i = Ŵ[i,:]HF y

    kF . If the continuous exci-

    tation of ra makes ri lose connections with its in-neighbors,the deviation between the predicted ŷk+1i and the actually

    observed ŷk+1i will be significant large. Therefore, we usethe following criterion to determine when the connection isbroken, and accordingly calculate the estimation of Rc, i.e.,{

    k2 = min{k :∥∥ŷk+1i − yk+1i ∥∥2 > β},

    R̂c = min{‖z̃k2i − z̃k2j ‖2 : j ∈ N

    ini },

    (34)

    where β > 0 is a given threshold. Technically, a larger βwill guarantee a more conservative R̂c. Further, taking R̂cinto account, the previously approximated ŴHF is furtheroptimized by solving

    minWHF

    ‖YH −WHFYF‖Frob (35a)

    s.t. W [i,j]HF = 0, if ‖z̃i − z̃j‖2 > R̂c, i ∈ VH , j ∈ VF . (35b)

    Note that (35) is a typical constrained linear least squareproblem, and it can be solved by many mature optimizationtechniques, e.g., Newton interior-point method [48]. Thisprovides feedback to obtain a more accurate estimation of theelement magnitudes in ŴHF .

    During the whole response process of ri in this stage,the corresponding input-output pairs of its obstacle-avoidancemechanism g(·) specified by (9) are recorded as

    Qkin=[z̃kv − z̃ka , z̃kv∗− z̃kv ,∆z̃kv/εT ,∆z̃ka/εT ], Qkout = ∆z̃k+1v .

    (36)Based on the dataset De =

    {∪{Qkin, Qkout}

    }, many mature

    learning methods are available to learn g (e,g., SVR) bysolving

    ĝ = arg ming:Qin 7→Qout

    ∑L′k=1

    ∥∥Qkout − g(Qkin)∥∥2. (37)So far, R̂c, ŴHF and ĝ are obtained, providing support todesign the following efficient sneak strategy.

    V. DYNAMIC SNEAKING ATTACK

    In this section, we first analyze the attack feasibility byintroducing the definitions of 1-hop convex formation andindirect controllability of a robot, and provide the conditions toensure the attack success. Then, we propose the ECR strategyto dynamically replace rv by ra. Finally, we show a wayto infer the out-neighbors of ra and further complete theinteraction topology estimation from WHF to WFF .

    A. Feasibility Analysis

    In this part, we give the theoretical guarantees for the exis-tence of attack positions and the controllability of formationrobots.

    Note that the success of the attack lies in two parts: i) thereexists a feasible attack position and ii) rv is controllable suchthat the attack condition is satisfied. Accordingly, we introducethe following definitions.

    Definition 2. (1-hop convex formation) Given ri, if ∃Vpi ⊆{i ∪ N outi } such that i) the nodes in V

    pi constitutes a convex

    polygon, ii) the polygon covers {i ∪ N outi }, and iii) i is onevertex of the polygon, then Vpi is the 1-hop convex pattern ofri.

  • 8

    Definition 3. (Indirect controllability) A node is indirectlycontrollable if one can first control another node, and a let achain reaction make the tagged node reach any given state z∗cin finite steps.

    Theorem 3. Given ri, if Vpi exists, then we havei) Vpi is unique.

    ii) Zfi =⋂

    j∈Nouti

    P(zj , dij) 6= ∅, and ‖z − zj‖2 < ‖zi − zj‖2,

    ∀z ∈ Zfi , j ∈ N outi .

    Theorem 3 shows the useful distance properties by intro-ducing Vpi , which will be applied in the following sections.Also, by proving it, we directly provide a method to find thepositions that satisfy the property. Next, we illustrate underwhat condition formation robots are (indirectly) controllable.Suppose that i ∈ V is under the excitation of ra, and thenzi is directly determined by ue = g(·), and ri is regardedas an another leader in V . In this situation, denote the newadjacent matrix as Ae where aeij = 0,∀j ∈ N ini , and otherelements are the same as those in original A. Accordingly, thecorresponding Laplacian matrix is Le = diag{Ae1} −Ae.

    Lemma 3. Given the goal state z∗c and the initial state z0, riis indirectly controllable by rj iff{

    ueuc > 0, if (z∗c − z0i )uc > 0,|pe1jue| > |pe1Nuc|, if (z∗c − z0i )uc < 0,

    (38)

    where pe1 =[pe11,· · ·, pe1N ]T is the corresponding left eigenvector

    for λ1 of Le.

    Lemma 3 points out the necessary and sufficient conditionfor indirect controllability from perspective of global connec-tivity. However, due to the partial observation of ra over V , theeigenvectors of the interaction matrix is hard to obtain, makingit impossible to verify the conditions. To overcome this issue,we leverage the idea of Lemma 3, and propose the followingtheorem to illustrate how to guarantee the direct controllabilityin a feasible way.

    Theorem 4. Given the goal state z∗c and the initial state z0,ri is indirectly controllable by rj when ue, uc satisfy{

    ueuc > 0, if (z∗c − z0i )uc > 0,|aijue| > |āijuc|, if (z∗c − z0i )uc < 0,

    (39)

    where āij =∑

    j′∈{N ini \j}aij′ .

    Without knowing the global interaction topology of V ,Theorem 4 provides an indirect controllability criterion thatfinds the feasible robots from the in-neighbors to indirectlycontrol the target robot, and it supports the following sneakingdesign.

    B. Control Strategy for Sneaking Attack: Evaluate-Cut-Restore

    Next, we present the detailed strategy design for ra to sneakinto the MRN efficiently.

    To successfully implement the sneak attack, a feasible rvneeds to be selected based on its significance evaluation in

    the formation first, and then ra is controlled to replace rv .Besides, different attack procedures are needed to meet ΩAgiven different geometric patterns of rv and its neighbors. Thekey point is to ensure the existence of Vpv . Accordingly, wepropose the following “Evaluate-Cut-Restore” procedures aswell as the strategy details.• Step 1: Evaluate the robots under partial observation

    based on interaction topology information, and select themost valuable one as rv .

    • Step 2: Validate the 1-hop convex pattern for rv . If thispattern is satisfied, adopt a Cut strategy based on directcontrollability. If not, go to step 3.

    • Step 3: If the indirect controllability is available for rv ,adopt a Pull strategy to make a 1-hop convex patternform. If not, re-select a sub-optimal target and go to step2.

    In this stage, the estimated state of ri under attack and thestate of ra are respectively given by

    ẑk+1i (uka) = z̃

    ki + ĝ(u

    ka), (40)

    zk+1a = zka + u

    ka, (41)

    where ĝ(uka) is the simplification of ĝ(zka − z̃ki , zki∗− z̃ki , vka −

    vki ).Evaluate phase. With the inferred partial topology, we need

    to choose an attack target to achieve the maximum effect.After the partial internal interaction WHF is obtained, a targetrobot is selected as the attack interface for ra. We proposethe following criteria to evaluate the significance of differentrobots, given by

    maxi

    (|N outi |+ ‖W[:,i]

    HF ‖1 − |N ini | − ‖W[i,:]HF ‖1) (42a)

    s.t. i ∈ VH , N outi | ≥ 1, |N ini | ≤ α1, (42b)

    where W [:,i]HF , W[i,:]HF represent the i-th column and i-th row

    of WHF , respectively, and α1 a preset possible integer. Thesolution of (42) is considered as rv . (42) shows that two factorsare taken into account for attack target selection. A robot witha larger out-degree means it has a higher impact on others, andvice versa. Due to the small scale of the observed formation,(42) can be easily solved by a common traverse search methodwith algorithm complexity of O(‖H‖‖F‖).

    Cut phase. In this core phase, ra aims to make the connec-tions between rv and its in-neighbors break, i.e., avj(t) 6= a0vj .If Vpi exists and rv is readily to be excited directly (i.e.,not many other robots are distributed around), the followingstrategy is adopted to break the connections,

    maxuka

    α2‖ẑk+1v (uka)−ẑk+1v∗ ‖2 + α3∑j∈N inv

    ‖ẑk+1j −ẑk+1v −h̃jv‖2,

    (43a)s.t. (32), (40) for rv, and (41), (43b)

    ‖uka‖2 < b. (43c)

    where α2, α3 > 0 are the weight parameters to balance theoffsets of rv with its ideal position and in-neighbors. Anefficient way to solve this cut-edge problem is to adopt agreedy heuristic by sampling from the feasible solution space.

  • 9

    If Vpi does not exist or the rv is not readily to attack,we utilize the indirect controllability and first attack an in-neighbor of rv , such that the formation pattern Vpi is reachedand rv is easily to approach. Let rl1 and rl2 be ri’s two out-neighbors that are the two closest to rj . Denote −−−→zl1zl2 as thevector from zl1 to zl2 , then the following strategy is adopted

    uka = argua

    max{∥∥ẑk+1j (ua)− ẑk+1j∗ ∥∥2 : ua ∈ UF ,j ∈ N ini and rj satisfies (39)}, (44)

    where UF ={ua : ‖ua‖2 < b, ua = λ−−−→zl1zl2⊥,∀λ > 0} and ⊥

    means the orthogonal vector of a vector. Iteratively, ra updatesthe control inputs until Vpi is formed, then the control modeis turned to (43).

    Restore phase. After the Vpi is formed, ra only needs tobe in a position that is close to all the out-neighbors of ri,therefore ra is controlled by

    uka = argua

    max{∥∥ẑk+1v (ua)− ẑk+1v∗ ∥∥2 : zk+1a ∈ Zfv } . (45)

    By continuously implementing the procedure, ra takes theideal position that belongs to rv , and will be regarded as thereal rv by rv’s out-neighbors at certain time by rule ΩA. Afterthat, the original rv becomes an outlier and ra is controlled tomove towards ẑv∗ and za ∈ P(zj , Rc)(∀j ∈ N inv ). Based onthe acceptance rule, ra is recognized as a formation memberand obtains partial control over the formation. The sneakingattack is completed.

    During this attack-implementing stage, ra also collectsdataset Da = {Qkin, Qkout} and uses it to update ĝ online.

    C. Further Inference and AttackAfter sneaking into the MRN, ra is able to implement

    further inference about its out-neighbors that may be in VH ,and attack the MRN by intentionally destroy the formationshape.

    Recall that the connections among robots in VH′ and con-nections from VH to VH′ are all unknown. Since ra becomesa member of the formation, if it intentionally deviates fromits supposed-to-go position ( taking this as excitation inputue), the deviation effect will spread to its out-neighbors, andto their out-neighbors and so on. Also, if ra further interfereswith the robots in VH′ , their neighbors are also inferable. Herewe provide the following theorem to illustrate how ra is ableto infer the out-neighbors of a robot.

    Theorem 5. Given β in (1) and instant k0, if ra is a member ofthe formation, then ∃m > 0,N outv are identifiable by arbitraryexcitation input |ue| > 0 at instant (k0 +m).

    Theorem 5 shows that the out-neighbors of a blocked robotcan be inferred. In practice, the observation period m is noneed to go infinity, as long as the the position fluctuationcaused by ra has exceeded the normal range that ra assumes.Therefore, in finite steps, the connections among robots inV\VF are discriminated if the condition holds. Finally, theinteraction structure WFF within VF is obtained. Additionally,Theorem 5 also provides a way for ra to destroy the forma-tion shape, e.g., by leading its out-neighbors to an arbitraryposition.

    Algorithm 1 Sneaking Attack DesignInput: Partial topology WHF .Output: ra sneaks into the formation.1: Observe the dynamic process of F converging to the steady

    formation geometric pattern.2: Judge the time point k∗ when the steady pattern is reached by

    (25).3: Construct datasets Dc, Ds by (26).4: Estimate the parameters of steady pattern based on Ds by (26).5: Infer the internal interaction topology WHF based on Dc by

    (31).6: while rv still moves towards the ideal position do7: Leveraging direct controllability, make tentative excitation on

    rj to approximate Rc, Ro by (34).8: Optimize the estimation of WHF utilizing approximated Rc

    by (35).9: Construct dataset De =

    {∪{Qkin, Qkout}

    }, and learn the

    external interaction mechanism g by (37).10: end while11: while rv and its neighbors are within R̂c do12: Leveraging indirect controllability, adopt ECR (Evaluate-Cut-

    Restore) attack strategy.13: Construct dataset Da =

    {∪{Qkin, Qkout}

    }during this process,

    and update g online by (37).14: end while15: Adopt further excitation on j ∈ VF\VH , and complete the

    estimation of WFF .16: Obtain partial control over VF and destroy the formation.

    (a) (b)

    Fig. 4. A MRN of 17 robots with two different interaction structure. Theformation shape is set the same, while the intersection structure is different.The detailed interaction weights are denoted in red font.

    To sum up, the whole procedure of the proposed sneakingattack against formation control is shown in Algorithm 1.

    VI. SIMULATION

    In this section, we conduct extensive simulations to demon-strate the feasibility of revealing the interaction rules in a MRNfrom external observations and excitation, and also validate thetheoretical analysis and the proposed sneak attack strategy.

    A. Simulation Setting

    The detailed simulation setting is given blew.MRN specification. A MRN consisting of 17 robots is con-

    sidered, with two kinds of interaction structure and commonpreset formation shape, as shown in Fig. 4. Specifically, therobot 1 is set as the leader, and the moving speed in stablestage is set as 0.2m/s. The maximum speed of ra is 1m/s, andthe radii are setting as Rc = 7m, Ro = 2m and Rs = 0.5m.

    Comparisons under different model and parameters. Forthe interaction structure ŴHF , we present the approximationresults from three perspectives. First, we adopt the linermodel ż(t) = −Lz(t) + Lh + u0 and the nonlinear model

  • 10

    (a) (b) (c)

    Fig. 5. The approximation results based calculated k∗. (a). Formation velocity estimation. (b) and (c) are the approximation results with increasing dataamount of Dc structure case 1 and 2, respectively.

    ż(t) = −Lz(t)+Lh+us(t), respectively. Second, we considerthe influence of different number of samples and differentobservation noise variance. Additionally, the performance ofŴHF without and with the feedback of R̂c are provides.For the external interaction mechanism ĝ, we present theapproximation results under different number of samples anddifferent observation noise variance.

    Metric of evaluation. To evaluate ŴHF , we use the fol-lowing two indexes

    ε1 =

    ∥∥∥sign(ŴHF )− sign(WHF )∥∥∥0

    |H||F|, ε2 =

    ∥∥∥ŴHF −WHF∥∥∥Frob

    ‖WHF‖Frob,

    (46)

    where ε1 evaluates the structure correctness in terms ofwhether two robots have an interaction connection, and ε2represent the magnitude correctness in terms of how larger isthe connection weight between two robots. To evaluate ĝ, weadopt mean directional accuracy (MDA), root mean squareerror (RMSE) and mean absolute error (MAE), which arerespectively calculated by

    MDA =1

    m

    m∑i=1

    sign(yi − y′i), (47)

    RMSE =

    √√√√ 1m

    m∑i=1

    (yi − y′i)2, (48)

    MAE =

    m∑i=1

    |yi − y′i|m

    , (49)

    where yi, y′i are the actual and predicted values, m is thesample number and sign(·) is the indicator function in thissection.

    B. Results and Analysis

    We mainly illustrate the results from the perspectives of fourstages described in Section III-E.

    Stage 1. Recall that the time instant k∗ is computedby (25) to identify when the MRN has reached the steadypattern. Using different threshold � to compute k∗ will resultin different sample number of Dc. As shown in Fig. 5(a),when the MRN reaches steady state, the velocity estimationremains stable. Fig. 5(b)-5(c) illustrate the sample numbermainly affects the accuracy of ε2, and when the amount

    (a) (b)

    Fig. 6. The approximation result comparison of ŴHF with and withoutthe feedback of R̂c. (a) under linear formation control. (b) under nonlinearformation control

    becomes larger, both ε1 and ε2 become stable. This is becausethat the latter observations in Dc is observed from the stagethat is close to the steady process, only providing redundantinformation to approximate WHF . In fact, this demonstratesthat what exactly is the calculate k∗ is not important underboth linear and nonlinear model, especially for the structureerror. Nevertheless, more data are still helpful to obtain WHFwith smaller magnitude errors.

    Stage 2. Hereafter, we present the results based on theinteraction structure given in Fig. 4(a). Fig. 6 shows theapproximation errors of the internal interaction structure underdifferent variances of observation noises. The variance ofobservation noise varies from 0 to 0.5. Fig. 6(a) presents theapproximation evaluation of the MRN under linear formationlaw. As it shows, the structure error ε1 is smaller than ε2, andwhen noises are involved, the approximation error is generallystable under different noise variances. Specifically, when theinteraction radius feedback is used to further optimize WHF ,i.e., solving (35), the approximation errors are reduced sig-nificantly. The details in Fig. 6(b) is likewise, and under thesame conditions other than the nonlinear formation control, theapproximation errors are basically larger than that under linearformation control, which is not hard to understand. However,we observe that even though the approximation errors innonlinear case is higher, the difference is not significant. Theerrors remain small and are sufficient to get a feasible WHFto support the following attack, especially in terms of theconnection structure. These results verify that the accuracyof our proposed linear approximation for WHF .

    Stage 3. To verify the approximation performance of theobstacle-avoidance mechanism (here we use classical artificialpotential method [49]), SVM with a Gaussian kernel function

  • 11

    Fig. 7. Obstacle-avoidance mechanism regression results based on SVR method. Under different numbers of the samples, 3 groups of experiments areconducted for each sample number.

    is used for regression, and the parameters are optimized by10-fold grid search cross-validation. We conduct 4 groups ofregression with 3 repeated procedure based on De collectedfrom the tentative excitation process, and the σ = 0.1 here.We randomly select 25, 50, 100, 200 samples to train the SVRmodel in each group, respectively. A fixed set of 120 samplesare used for testing. The testing results are shown in Fig. 7,with the statistic results given in Table II. By comparing theresults of the test samples in each group, it is obvious thatthe fitting performance is improved with more training data.Meanwhile, the index does not vary much with more samples,which is illustrated by comparing the results of repeated tests.MDA of training, and testing show upward trends from 25-sample-training to 200-sample-training, which indicates thatmore training samples lead to high accuracy. Nevertheless, weobserve that even though a larger number of samples bringsmore accurate results, the improvement of the accuracy is notsignificant. Even with a small number of samples such as the25-samples, the method is still feasible especially in terms ofMDA (above 90%). It is worth mentioning that RMSE andMAE fluctuate due to observation noises. Specifically, RMSEis no larger than 0.6, corresponds the famous 3σ-principle [50].Overall, these shows the effectiveness of the proposed methodto learn the external interaction rule.

    Stage 4. First, at the Evaluate phase, robot 5 is selectedas rv by the evaluation criteria (42), which has only onein-neighbor and three out-neighbors, and its corresponding 1hop convex pattern Vp5 exists. Then, during the Cut phase,ra utilizes the learned ĝ and makes consequent excitationson rv by solving (43), such that rv is gradually pulled outof the interaction range of its in-neighbors, shown as thephase before the break time in Fig. 8(a). After the connectionsof rv with its in-neighbors break, ra in the Restore phase

    TABLE IISTATISTIC OBSTACLE-AVOIDANCE MECHANISM REGRESSION RESULTS

    BASED ON SVR METHOD.

    25 samples 50 samplesIndex MDA RMSE MAE MDA RMSE MAE

    Training 0.880 0.253 0.154 0.913 0.217 0.113Testing 0.933 0.601 0.404 0.933 0.581 0.300

    100 samples 200 samplesIndex MDA RMSE MAE MDA RMSE MAE

    Training 0.910 0.333 0.146 0.923 0.426 0.206Testing 0.956 0.541 0.291 0.967 0.496 0.264

    is controlled by solving (45) dynamically to move withinZfi =

    ⋂j∈Nouti

    P(zj , dij) (i = 5 here), such that ‖za − zj‖2 <

    ‖zi − zj‖2, ∀j ∈ Nouti (i.e., j = 6, 7, 8 here). This phase is

    illustrated as the process between the break time and sneakingstart time in Fig. 8(a), with detailed presentation of distancedeviations (‖za − z̃j‖2 − ‖z̃i − z̃j‖2) in Fig. 8(b), where thedeviations are all negative. Based on the neighbor-replacementauthentication rule ΩA, the robotsN out5 will take ra as the newneighbor replace robot 5. Finally, ra here just moves towardsz5∗ and its out-neighbors begin following it, and the formationwill reach steady state again. Note that the position fluctuationofN out5 before beginning sneaking in Fig. 8(a) also verifies theindirect controllability in Theorem 4, and the identifiability inTheorem 5. Likewise, this can be used to destroy the formationshape, and the results are omitted here due to space limit.

    VII. CONCLUSIONIn this paper, we investigated the security of MRNs under

    formation control by designing a sneaking attack. We demon-strated that the interaction rules in MRNs are learnable eventhe attacker is only with partial observation over the MRN andwithout any explicit prior knowledge. First, we proposed an

  • 12

    (a)

    (b)

    Fig. 8. The illustration of the sneak process. (a). The position errors betweenthe real and the desired positions of the robots, and ra takes the z∗5 as itsdesired position. (b) The distance deviations (‖za − z̃j‖2 − ‖z̃i − z̃j‖2),here i = 5, j = 6, 7, 8.

    excitation-based method to approximate the internal interac-tion structure and the obstacle-avoidance mechanism with theenvironment. Then, we designed the ECR attack strategies,which make the attacker replace a formation robot and gainpartial control over the MRN. The scheme also providedfeedback to optimize the accuracy of approximated interactionrules further. Comprehensive theoretical analysis and numeri-cal results illustrated the effectiveness of the proposed attack.This work reveals the possibility of learning the interactionrules of an MRN from physical observations and excitationsand utilizing them to design an intelligent sneak attack, eventhough the interaction rules are model agnostic to the attacker.

    In a broader sense, our work broadens the horizons ofthe security of CPSs by exploiting the system vulnerabilitiesfrom physical observations and interactions. Furthermore, weemphasize that the system interaction nature can be leveragedby malicious attackers to cause disastrous consequences on thesystem. Concerning the CPS security from this perspective, alot of issues remain to be further addressed, including i) howto design efficient detection methods to identify the potentialthreats; ii) how to reliably secure the interaction protocolsthat only leak confusing states on the external observer; andiii) how to construct robust and dynamic interaction defensivestrategies with the physical world.

    REFERENCES

    [1] F. Bullo, J. Cortes, and S. Martinez, Distributed control of roboticnetworks: A mathematical approach to motion coordination algorithms.Princeton University Press, 2009, vol. 27.

    [2] R. Olfati-Saber and R. M. Murray, “Consensus problems in networksof agents with switching topology and time-delays,” IEEE Transactionson Automatic Control, vol. 49, no. 9, pp. 1520–1533, 2004.

    [3] K.-K. Oh, M.-C. Park, and H.-S. Ahn, “A survey of multi-agentformation control,” Automatica, vol. 53, pp. 424–440, 2015.

    [4] P. London, S. Vardi, and A. Wierman, “Logarithmic communicationfor distributed optimization in multi-agent systems,” Proc. ACM Meas.Anal. Comput. Syst., vol. 3, no. 3, Dec. 2019. [Online]. Available:https://doi.org/10.1145/3366696

    [5] A. Sankararaman, A. Ganesh, and S. Shakkottai, “Social learningin multi agent multi armed bandits,” Proc. ACM Meas. Anal.Comput. Syst., vol. 3, no. 3, Dec. 2019. [Online]. Available:https://doi.org/10.1145/3366701

    [6] M. A. Kamel, X. Yu, and Y. Zhang, “Formation control and coordinationof multiple unmanned ground vehicles in normal and faulty situations:A review,” Annual Reviews in Control, 2020.

    [7] H. Choi, W.-C. Lee, Y. Aafer, F. Fei, Z. Tu, X. Zhang, D. Xu, andX. Xinyan, “Detecting attacks against robotic vehicles: A control invari-ant approach,” in Proceedings of the 2018 ACM SIGSAC Conference onComputer and Communications Security. ACM, 2018, pp. 801–816.

    [8] X. Fu and E. Modiano, “Fundamental limits of volume-based networkdos attacks,” Proc. ACM Meas. Anal. Comput. Syst., vol. 3, no. 3, Dec.2019. [Online]. Available: https://doi.org/10.1145/3366698

    [9] Y. Yuan, X. Tang, W. Zhou, W. Pan, X. Li, H.-T. Zhang, H. Ding, andJ. Goncalves, “Data driven discovery of cyber physical systems,” NatureCommunications, vol. 10, no. 1, pp. 1–9, 2019.

    [10] R. Olfati-Saber, J. A. Fax, and R. M. Murray, “Consensus and coop-eration in networked multi-agent systems,” Proceedings of the IEEE,vol. 95, no. 1, pp. 215–233, 2007.

    [11] A. Pandey, S. Pandey, and D. R. Parhi, “Mobile robot navigation andobstacle avoidance techniques: A review,” Int Rob Auto J, vol. 2, no. 3,p. 00022, 2017.

    [12] C. W. Reynolds, “Flocks, herds and schools: A distributed behavioralmodel,” in Proceedings of the 14th Annual Conference on ComputerGraphics and Interactive Techniques, 1987, pp. 25–34.

    [13] X. Sun and C. G. Cassandras, “Optimal dynamic formation control ofmulti-agent systems in constrained environments,” Automatica, vol. 73,pp. 169–179, 2016.

    [14] S. Zhao, “Affine formation maneuver control of multiagent systems,”IEEE Transactions on Automatic Control, vol. 63, no. 12, pp. 4140–4155, 2018.

    [15] J. Alonso-Mora, E. Montijano, T. Nägeli, O. Hilliges, M. Schwager,and D. Rus, “Distributed multi-robot formation control in dynamicenvironments,” Autonomous Robots, vol. 43, no. 5, pp. 1079–1100, 2019.

    [16] Y. Xu, S. Zhao, D. Luo, and Y. You, “Affine formation maneuver controlof high-order multi-agent systems over directed networks,” Automatica,vol. 118, p. 109004, 2020.

    [17] M. Deghat, I. Shames, B. D. O. Anderson, and C. Yu, “Localizationand circumnavigation of a slowly moving target using bearing measure-ments,” IEEE Transactions on Automatic Control, vol. 59, no. 8, pp.2182–2188, 2014.

    [18] T.-H. Cheng, Z. Kan, J. R. Klotz, J. M. Shea, and W. E. Dixon, “Event-triggered control of multiagent systems for fixed and time-varyingnetwork topologies,” IEEE Transactions on Automatic Control, vol. 62,no. 10, pp. 5365–5371, 2017.

    [19] M. H. Trinh, S. Zhao, Z. Sun, D. Zelazo, B. D. O. Anderson, andH.-S. Ahn, “Bearing-based formation control of a group of agents withleader-first follower structure,” IEEE Transactions on Automatic Control,vol. 64, no. 2, pp. 598–613, 2018.

    [20] Z. Kan, A. P. Dani, J. M. Shea, and W. E. Dixon, “Network connectivitypreserving formation stabilization and obstacle avoidance via a decen-tralized controller,” IEEE Transactions on Automatic Control, vol. 57,no. 7, pp. 1827–1832, 2012.

    [21] S. Zhao, Z. Li, and Z. Ding, “Bearing-only formation tracking control ofmultiagent systems,” IEEE Transactions on Automatic Control, vol. 64,no. 11, pp. 4541–4554, 2019.

    [22] Z. Feng, C. Sun, and G. Hu, “Robust connectivity preserving rendezvousof multirobot systems under unknown dynamics and disturbances,” IEEETransactions on Control of Network Systems, vol. 4, no. 4, pp. 725–735,2016.

  • 13

    [23] M. Khalili, X. Zhang, M. M. Polycarpou, T. Parisini, and Y. Cao,“Distributed adaptive fault-tolerant control of uncertain multi-agentsystems,” Automatica, vol. 87, pp. 142–151, 2018.

    [24] C. Liu, B. Jiang, R. J. Patton, and K. Zhang, “Hierarchical-structure-based fault estimation and fault-tolerant control for multiagent systems,”IEEE Transactions on Control of Network Systems, vol. 6, no. 2, pp.586–597, 2019.

    [25] H. Zhu, J. Juhl, L. Ferranti, and J. Alonso-Mora, “Distributed multi-robot formation splitting and merging in dynamic environments,” in 2019International Conference on Robotics and Automation (ICRA). IEEE,2019, pp. 9080–9086.

    [26] Y. Li, H. Wang, J. He, and X. Guan, “Optimal topology recovery schemefor multi-robot formation control,” in 2019 IEEE 28th InternationalSymposium on Industrial Electronics (ISIE). IEEE, 2019, pp. 1847–1852.

    [27] L. Zhe, J. Ju, W. Chen, X. Fu, and H. Wang, “A gradient-based self-healing algorithm for mobile robot formation,” in IEEE/RSJ Interna-tional Conference on Intelligent Robots and Systems, 2015.

    [28] Z. Liu, W. Chen, H. Wang, Y.-H. Liu, Y. Shen, and X. Fu, “A self-repairing algorithm with optimal repair path for maintaining motion syn-chronization of mobile robot network,” IEEE Transactions on Systems,Man, and Cybernetics: Systems, 2017.

    [29] Z. Liu, H. Wang, L. Xu, Y.-H. Liu, J. Lu, and W. Chen, “A failure-tolerant approach to synchronous formation control of mobile robotsunder communication delays,” in 2018 IEEE International Conferenceon Robotics and Automation (ICRA). IEEE, 2018, pp. 1661–1666.

    [30] X. Dong and G. Hu, “Time-varying formation tracking for linear multi-agent systems with multiple leaders,” IEEE Transactions on AutomaticControl, vol. 62, no. 7, pp. 3658–3664, 2017.

    [31] H. Yang, Q.-L. Han, X. Ge, L. Ding, Y. Xu, B. Jiang, and D. Zhou,“Fault tolerant cooperative control of multi-agent systems: A survey oftrends and methodologies,” IEEE Transactions on Industrial Informatics,vol. 16, no. 1, pp. 4–17, 2020.

    [32] F. Pasqualetti, F. Dörfler, and F. Bullo, “Attack detection and identi-fication in cyber-physical systems,” IEEE Transactions on AutomaticControl, vol. 58, no. 11, pp. 2715–2729, 2013.

    [33] Y. Mo and B. Sinopoli, “On the performance degradation of cyber-physical systems under stealthy integrity attacks,” IEEE Transactionson Automatic Control, vol. 61, no. 9, pp. 2618–2624, 2015.

    [34] H. Sandberg, S. Amin, and K. H. Johansson, “Cyberphysical security innetworked control systems: An introduction to the issue,” IEEE ControlSystems Magazine, vol. 35, no. 1, pp. 20–23, 2015.

    [35] H. S. Sánchez, D. Rotondo, T. Escobet, V. Puig, and J. Quevedo, “Bibli-ographical review on cyber attacks from a control oriented perspective,”Annual Reviews in Control, vol. 48, pp. 103–128, 2019.

    [36] F. Pasqualetti, A. Bicchi, and F. Bullo, “Consensus computation inunreliable networks: A system theoretic approach,” IEEE Transactionson Automatic Control, vol. 57, no. 1, pp. 90–104, 2012.

    [37] D. Silvestre, P. Rosa, J. P. Hespanha, and C. Silvestre, “Stochasticand deterministic fault detection for randomized gossip algorithms,”Automatica, vol. 78, pp. 46–60, 2017.

    [38] Y. Son, H. Shin, D. Kim, Y. Park, J. Noh, K. Choi, J. Choi, and Y. Kim,“Rocking drones with intentional sound noise on gyroscopic sensors,”in 24th {USENIX} Security Symposium ({USENIX} Security 15), 2015,pp. 881–896.

    [39] N. O. Tippenhauer, C. Pöpper, K. B. Rasmussen, and S. Capkun, “Onthe requirements for successful gps spoofing attacks,” in Proceedings ofthe 18th ACM Conference on Computer and Communications Security.ACM, 2011, pp. 75–86.

    [40] G. Bianchin, Y.-C. Liu, and F. Pasqualetti, “Secure navigation of robotsin adversarial environments,” IEEE Control Systems Letters, vol. 4, no. 1,pp. 1–6, 2019.

    [41] Y. Li, J. He, C. Chen, and X. Guan, “Learning-based intelligent attackagainst formation control with obstacle-avoidance,” in Proceedings ofthe 2019 American Control Conference. IEEE, 2019, pp. 2690–2695.

    [42] F. L. Lewis, H. Zhang, K. Hengster-Movric, and A. Das, Cooperativecontrol of multi-agent systems: Optimal and adaptive design approaches.Springer Science & Business Media, 2013.

    [43] M. M. Zavlanos, H. G. Tanner, A. Jadbabaie, and G. J. Pappas, “Hybridcontrol for connectivity preserving flocking,” IEEE Transactions onAutomatic Control, vol. 54, no. 12, pp. 2869–2875, 2009.

    [44] Z. Kan, E. A. Doucette, and W. E. Dixon, “Distributed connectivitypreserving target tracking with random sensing,” IEEE Transactions onAutomatic Control, vol. 64, no. 5, pp. 2166–2173, 2018.

    [45] M. Aranda, G. López-Nicolás, C. Sagüés, and M. M. Zavlanos,“Coordinate-free formation stabilization based on relative position mea-surements,” Automatica, vol. 57, pp. 11–20, 2015.

    [46] A. J. Dobson and A. Barnett, An introduction to generalized linearmodels. CRC press, 2008.

    [47] N. Cristianini, J. Shawe-Taylor et al., An introduction to supportvector machines and other kernel-based learning methods. CambridgeUniversity Press, 2000.

    [48] S. Boyd, S. P. Boyd, and L. Vandenberghe, Convex optimization.Cambridge university press, 2004.

    [49] O. Khatib, “Real-time obstacle avoidance for manipulators and mobilerobots,” in Autonomous Robot Vehicles. Springer, 1986, pp. 396–404.

    [50] F. Pukelsheim, “The three sigma rule,” The American Statistician,vol. 48, no. 2, pp. 88–91, 1994.

    APPENDIXA. Proof of Lemma 1

    Proof. Based on (8) and e−Lt = Γe−ΛtΓ−1, its modeldecomposition form is given by

    z(t) =

    N∑i=1

    qipTi e−λitz(0) +

    ∫ t0

    N∑i=1

    qipTi e−λi(t−τ)dτ · u

    =q1pT1 z(0) +

    N∑i=2

    qipTi e−λitz(0)

    +

    [q1p

    T1 t+

    N∑i=2

    1

    λiqip

    Ti (1− e−λit)

    ]· u. (50)

    When t → ∞, the damping terms decrease to zero, and weobtain

    zss = q1pT1u · t+ q1pT1 z(0) +

    N∑i=2

    1

    λiqip

    Ti u. (51)

    Note u = Lh+ u0 = (N∑i=1

    qipTi λi)h+ u0, and by utilizing (1)

    and λ1 = 0, we haveq1p

    T1 (

    N∑i=1

    qipTi λi) = 0,

    N∑i=2

    qipTi (

    N∑i=1

    qipTi ) =

    N∑i=2

    qipTi = (I−q1pT1 ).

    (52)

    Taking (52) into (51), we obtain

    zss = q1pT1 u0·t+q1pT1 z(0)+(I−q1pT1 )+

    N∑i=2

    1

    λiqip

    Ti u0. (53)

    Since rN is a unique leader and does not use informationfrom others, meaning the last row of matrix A and L are zeros,then the following equations hold{

    pT1L = [p11, · · · , p1N ]TL = [0, · · · , 0],pT1 q1 = [p11, · · · , p1N ]T1 = 1.

    (54)

    By solving (54), we obtain{p1N = 1

    p1k = 0, k = 1, 2, · · · , N−1.(55)

    Finally, taking (55) into (53) and it follows that

    zss = q1pT1 z(0) + (I − q1pT1 )h+

    N∑i=2

    cpiNλi

    qi + ct·1 = s+ ct·1

    (56)The proof is completed.

  • 14

    B. Proof of Theorem 1

    Proof. First, under Assumption 1, the formation pattern is notin steady before ks. Thus ∀k′ < ks, ∆S2:k

    i > 0 holds for allrobots except for the leader r with ∆S2:k

    i = 0. Therefore, theuniqueness of rN guarantees its identifiability. Second, after∀k > ks, the observations of the state evolution in [k, k + l]obeys (20). Considering the Gaussian distributed characteristicof the observation noise and ĉ is a least square solution, whenl→∞, lim

    l→∞‖ĉ(k, l)− c‖ = 0. The proof is completed.

    C. Proof of Theorem 2

    Proof. Here we adopt the limit analysis method. First, if Rf =Rc (i.e., the circle center locates on a robot i), then N ini ⊆ VF ,which yields

    z̃k+1i = W[i,:]FF z̃

    kF + εT û

    ki + ξ

    ki , (57)

    where W [i,:]FF represents the corresponding row for zi. Bymoving terms, it follows that

    W[i,:]FF z̃

    kF = z̃

    k+1i − εT û

    ki − ξki = yk+1i − ξ

    ki . (58)

    Supposing the observations are noise-free and no other priorinformation is available, then |VF | groups of observationequations are sufficient to solve W [i,:]HF . This case shows roboti is only influenced by its in-neighbors that are within itscommunication range.

    Next, if Rf > Rc, then ∃i ⊆ VH ⊆ VF , {i ∪ N ini } ⊆ VF .Therefore, (57) is generalized as

    z̃k+1H = WHH z̃kH +WHH′ z̃

    kH′ + εT û

    kH + ξ

    kH

    = WHF z̃kF + εT û

    kH + ξ

    kH . (59)

    Take ûH = ĥH + ĉIH into (59) and it follows that

    z̃k+1H = WHF z̃kF + εT{ĉIH +

    I −WHHεT

    ĥH}

    = WHH(z̃kH − ĥH) +WHH′ z̃kH′ + εT ĉIH + ĥH + ξkH .

    (60)

    By the definition of yH and yF , we obtain

    yk+1H = z̃k+1H − εT ĉIH − ĥH

    = WHH(z̃kH − ĥH) +WHH′ z̃kH′ + ξkH

    = WHFykF + ξ

    kH . (61)

    Ignoring the noise, the mapping relation φ(Dc) is obtained.Finally, note that a single group yk+1H = WHFy

    kF is based

    on two consecutive observations over VF , and contains |H|groups of equations in terms of WHF , which consist of (|H|·|VF |) parameters. Like (58), for i ∈ H, yk+1i = W

    [i,:]HF y

    kF , after

    transposing, we have

    yk+1Hi = (ykF )

    T(W[i,:]HF )

    T, (62)

    and further stacking the states of continuous moments yieldsy2Hi...

    ylHi

    =

    (y1F )T

    ...

    (yl−1F )T

    (W [i,:]HF )T. (63)

    Then, the vector (W [i,:]HF )T is directly obtained by least squaremethods, and it follows that

    Y TH =

    (y2H)

    T

    ...

    (ylH)T

    =

    (y1F )T

    ...

    (yl−1F )T

    WTHF = Y TF WTHF . (64)If no other prior information is available, to avoid a under-determined solution of WHF , at least (|H| + 1) consecutiveobservations over VF are needed to obtain the least squaresolution of WHF , which is calculated by (31). The proof iscompleted.

    D. Proof of Lemma 2

    Proof. Due to ra /∈ VF , when ‖za − zi‖2 < Ro, ra isregarded as an external obstacle by ri. Note that in cases where‖za − zi‖2 → Rs, the influence of (zi∗−zi) is neglected sinceri aims to move as far from ra as it could. By Definition 1,given the goal state z∗c , the proof turns to find a groups ofinputs ua = {uka, k = 1 · · · kH} to satisfy

    z∗c−z0i =kH∑k=1

    g(zka(u

    ka)−zki , zki∗−zki , vka , vki

    ). (65)

    With g known and (zi∗− zi), (za− zi) and vi measurable, anavailable choice of ua and kH can always be found such that∣∣g (zka(uka)−zki , zki∗−zki , vka−vki )∣∣ = ∣∣∣∣z∗c−z0ikH

    ∣∣∣∣≤b, (66)where Rs ≤

    ∥∥zka − zki ∥∥2 ≤ Ro. The proof is completed.E. Proof of Theorem 3

    Proof. We prove this theorem by contradiction.For i), suppose there are two distinct convex polygons Vpi,a

    and Vpi,b and the cardinal number |Vpi,a| ≥ |V

    pi,b|, then we

    obtainVpi,∆ = V

    pi,a\(V

    pi,a ∩ V

    pi,b) 6= ∅. (67)

    Then, ∀i′ ∈ Vpi,∆, i′ is not covered by Vpi,b, which renders a

    contradiction by Definition 2. Therefore, Vpi is unique.For ii), the conclusion obviously holds when |Vpi | = 2,

    thus we consider nontrivial cases where |Vpi | ≥ 3. First, when|Vpi | = 3, there are only two vertex adjacent to i, here denotedas j1 and j2, and let dijk =‖zjk−zi‖2, k = 1, 2. In this case,P(zj1 , dij1)∩P(zj2 , dij2) = ∅ if and only if the states zi, zj2and zj2 are linear dependent, which contradicts with convexvertex condition of i. Therefore, it follows that

    Pi(dij1 , dij2) = P(zj1 , dij1) ∩ P(zj2 , dij2) 6= ∅. (68)

    When |Vpi | > 3, ∀j3 ∈ {Vpi \{i ∪ j1 ∪ j2}}, utilizing (68),

    we easily obtain

    Pi(dij1 , dij3) 6= ∅, Pi(dij2 , dij3) 6= ∅. (69)

    Then, what we need to do is to prove

    Pi(dij1 , dij3) ∩ Pi(dij2 , dij3) 6=∅. (70)

  • 15

    Similarly, suppose Pi(dij1 , dij3)∩Pi(dij2 , dij3) 6= ∅. This caseis equivalent to the states of four vertex satisfying

    zj3 − zi = α1(zj1 − zi) + α2(zj2 − zi), (71)

    where α1 ≤ 0 and α2 ≤ 0. Note α1 = 0 (or α2 = 0) indicatesj1 (or j2), j3 and i are on the same line, and α1, α2 are notzero at the time. Consequently, we have

    zi =zj3 − α1zj1 − α2zj2

    1− α1 − α2, (72)

    which means zi is a convex combination of zjk (k = 1, 2, 3)and thus contradicts with convex vertex condition of i. There-fore, it follows that if Pi(dij1 , dij3) 6=∅ and Pi(dij1 , dij3) 6=∅,then Pi(dij2 , dij3) 6=∅. Utilizing this transitivity property, weobtain

    Zf0i =⋂

    j∈{Vpi \i}

    P(zj , dij) 6= ∅. (73)

    If there exists j ∈ {N outi \Vpi }, they are covered by V

    pi .

    Likewise, by the convex properties, P(zj , dij) ∩ Zf0i = ∅.To sum up, define the feasible position set as

    Zfi =⋂

    j∈Nouti

    P(zj , dij), (74)

    and by the definition of P(z,R), we have ∀z ∈Zfi , ‖z − zj‖2 < ‖zi − zj‖2. The proof is completed.

    F. Proof of Lemma 3

    Proof. Based on Lemma 1, let t→∞, one obtains the globalstate evolution as

    zss = qe1(p

    e1)

    Tu · t+ qe1(pe1)Tz(0) +N∑i=2

    1

    λiqei (p

    ei )

    Tu. (75)

    Note that the desired state deviation vector h contained in uonly incurs a constant offset in zss, therefore, here we ignoreh and let u = [0, · · · , 0, ue, 0, · · · , 0, uc]T. Recalling qe1 = 1,we have

    żss = (pe1jue + p

    e1Nuc)1. (76)

    When ueuc > 0, it means that the excitation of ra aims tostrengthen the movement in the direction of the original leader.In this case, arbitrary ue satisfying ueuc > 0 is available tomeet the indirect controllability. When ueuc < 0, it meansthat the excitation of ra aims to strengthen the movement inthe direction of ue. Thus, one infers that to counteract theinfluence of the original leader, the following condition musthold, given by

    |pe1jue| > |pe1Nuc|. (77)

    The proof is completed.

    G. Proof of Theorem 4

    Proof. First, we suppose that ∀j′ ∈ {N ini \j}, aj′j > 0.Recalling the dynamics of ri given by (6), we have

    żi =aij(zj − zi − hj + hi) +∑

    j′∈{N ini \j}

    aij′(zj′ − zi − hj′ + hi)

    =aij(uet+ z0j − zi − hj + hi)

    +∑

    j′∈{N ini \j}

    aij′(uct+ z0j′ − zi − hj′ + hi). (78)

    Let bij = aij(z0j′−hj′+hi), b̄ij =∑

    j′∈{N ini \j}aij′(z

    0j′−hj′+hi),

    and āij =∑

    j′∈{N ini \j}aij′ , then (78) is rewritten as

    żi = aijuet− aijzi + bij + āijuct− āijzi + b̄ij= (aijue + āijuc)t− (aij + āij)zi + (bij + b̄ij)= b1t− b2zi + b3. (79)

    Note that (79) is a first-order constant coefficient non-homogeneous linear equation, and leveraging constant vari-ation method, the solution is given as

    zi(t) =b1b2t+ (

    b1b22− b3b2

    )(e−b2t − 1). (80)

    Then, we obtain

    żi(t) =b1b2

    =aijue + āijucaij + āij

    . (81)

    The next step is the same as the proof of Lemma 3. It followsthat whether ri is indirectly controllable is determined by (z∗c−z0i )uc and aijue + āijuc. Accordingly, one infers that ri isindirect controllable if and only if (39)is satisfied.

    Finally, consider ∃j′ ∈ {N ini \j}, aj′j > 0, then (78) isrepresented as

    żi=aij(zj−zi−hj+hi)+∑

    j′∈{N ini \j}

    aij′(zj′(zj)−zi−hj′+hi),

    (82)

    where zj′ is no longer linearly increasing. However, aj′j > 0yields |zj′(t)− zj(t)| < |uct+ z0j′ − zj(t)|, which incurs zi(t)more closer to z∗c compared with that when aj′j = 0 at thesame time. Therefore, by (39), it is sufficient to guarantee theindirect controllability of ri. The proof is completed.

    H. Proof of Theorem 5

    Proof. Suppose ri ∈ N outv and the attack excitation causedby ra is ue. Ideally, the state of ri is updated by

    zk+1i =∑

    rj∈Nini

    wij(zkj − zki − hji). (83)

    When ra launches the attack excitation against rv , ri’s stateis observed by

    z̃k+1i =∑

    rj∈Nini

    wij(zkj − zki − hji) + wivuke + ξk. (84)

    Then, we have

    z̃k+1i,∆ = z̃k+1i − z

    k+1i = wivu

    ke + ξ

    k. (85)

  • 16

    z̃k+1i,∆ represents the state deviation between the ri’s ideal andattacked states.

    Based on the famous 3σ-principle [50], for a Gaussian dis-tribution with zero expectation value, the variable is basicallyregarded to locate in the interval [−3σ,+3σ]. Therefore, todifferentiate the influence of the measurement noise and theattack excitation, a sufficient condition is given by

    |ue| ≥ 3σ/wiv. (86)

    However, since ue is upper-bounded and 0 < wiv ≤ 1, (86) ishard to guarantee. To alleviate this effect, consider the attackexcitation continues in the following m instants. Then, at (k+m)-th instant, we obtain

    z̃k+mi,∆ =

    m∑j=1

    w(j)iv u

    k+m−je +

    m∑j=1

    w(j−1)iv ξ

    k+m−j , (87)

    where w(j)iv means the j-power of wiv . Given a constant ue isconstant, the accumulated state deviation of ri is given bym∑l=1

    z̃k+li,∆ =

    m∑l=1

    l∑j=1

    w(j)iv u

    k+l−je +

    m∑l=1

    l∑j=1

    w(j−1)iv ξ

    k+l−j

    = ue

    m∑j=1

    m∑l=j

    w(j)iv +

    m∑j=1

    m∑l=j

    w(j−1)iv ξ

    k+l−j

    = ue

    m∑j=1

    (m−j+1)w(j)iv +m∑j=1

    m∑l=j

    w(j−1)iv ξ

    k+l−j .

    (88)

    Note that in the first term,m∑j=1

    (m− j + 1)w(j)iv is a typical

    arithmetico-geometric sequence and is calculated asm∑j=1

    (m−j+1)w(j)iv =wiv(w

    (m)iv −1)

    (1−wiv)2+

    (m+1)wiv+w(m)iv + 1

    1− wiv.

    (89)For the second term, since the expectation of ξ satisfies

    E(ξ) = 0, there exists 0≤bξm∑j=1

    m∑l=j

    w(j−1)iv ξ

    k+l−j , which

    indicates the position fluctuation of ri will exceed the thresh-old β. At this instant, the connection between the two robotsis identified. The proof is completed.