
Mantel, B., Hoppenot, P., & Colle, E. (2012). Perceiving for Acting With Teleoperated Robots: Ecological Principles to Human–Robot Interaction Design. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 42(6), 1460-1475.

doi: 10.1109/TSMCA.2012.2190400

This is the accepted version of the article. The published version of the article is available at:

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6185687

Copyright notice: © 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.

BibTeX reference:

@article{Mantel12:SMC,
  author  = {Mantel, B. and Hoppenot, P. and Colle, E.},
  title   = {Perceiving for Acting With Teleoperated Robots: Ecological Principles to Human-Robot Interaction Design},
  journal = {IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans},
  year    = {2012},
  volume  = {42},
  number  = {6},
  pages   = {1460-1475},
  doi     = {10.1109/TSMCA.2012.2190400},
}

Keywords: human-robot interaction; telerobotics; ecological principles; functional perspective; human-robot interaction design; man-machine interfaces; remote perception problem; robot teleoperation; teleoperated robots; haptic interfaces; robots; teleoperators; affordance; ecological approach; human–robot interaction (HRI); interface design; perception–action; teleoperation


Perceiving for Acting With Teleoperated Robots: Ecological Principles to Human–Robot Interaction Design

Bruno Mantel, Philippe Hoppenot, and Etienne Colle

Abstract—By primarily focusing on the perceptual information available to an organism and by adopting a functional perspective, the ecological approach to perception and action provides a unique theoretical basis for addressing the remote perception problem raised by telerobotics. After clarifying some necessary concepts of this approach, we first detail some of the major implications of an ecological perspective for robot teleoperation. Based on these, we then propose a framework for coping with the alteration of the information available to the operator. While our proposal shares much with previous works that applied ecological principles to the design of man–machine interfaces (e.g., ecological interface design), it puts a special emphasis on the control of action (instead of process), which is central to teleoperation but has been seldom addressed in the literature.

Index Terms—Affordance, ecological approach, human–robot interaction (HRI), interface design, perception–action, teleoperation.

I. INTRODUCTION

A ROBOTIC teleoperation platform is a powerful tool. It enables humans to act on and explore the environment from a distance. This permits, for example, an operator to reach and use remote objects and to scout remote environments that, otherwise, may be inaccessible (as in assistance to disabled persons or search and rescue) or hostile (e.g., underwater, space, rubble, and radioactive areas). It also enables us to manipulate and examine objects at a different scale than that of the operator. For example, it can provide an operator with increased precision or strength (as in telesurgery or with an industrial manipulator). However, all of these new opportunities come at a cost, and allowing the operator to control these actions raises important human factor issues [1], [2]. A particular concern of teleoperation is the alteration of the information available to the operator and its detrimental influence on task performance [3], [4]. In this paper, we consider how the ecological approach to perception and action (e.g., [5]–[7]) might contribute to addressing these issues.

Manuscript received May 3, 2011; accepted November 22, 2011. This work was supported in part by the European Community's Seventh Framework Program [FP7/2007-2013] under Grant agreement 216487 (http://www.companionable.net). This paper was recommended by Associate Editor R. van Paassen.

The authors are with the Computer Science, Integrative Biology and Complex Systems Laboratory (IBISC), University of Evry-Val d'Essonne, 91020 Evry Cedex, France (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMCA.2012.2190400

The ecological approach proposes that an agent's behavior is not triggered by a stimulus or by the brain but emerges from the coupling between the agent and his/her environment. As a consequence, it is argued that perception, action, and cognition cannot be understood by focusing on the sole agent but require us to consider the agent–environment (A-E) system as a whole. Within this perspective, perception and action are not considered as separate issues but rather as mutually constraining each other [8], [9]. The agent uses available information about his/her relation to the environment to control his/her actions, and by doing so, he/she modifies the A-E layout and, thus, the information to which he/she has access (and which he/she uses to control his/her actions, and so on). This circular causality has two major implications for perception. First, to act successfully and efficiently, the agent needs functional information, i.e., information which is adapted to the intended action and to its context. Second, the information available to the agent depends on his/her own activity.

Several ecological concepts found an echo in telerobotics-related fields such as computer vision, e.g., [10] and [11], and autonomous robots and agents, e.g., [12]–[14]. In parallel, building on shared concerns (e.g., A-E fit and functional perspective) and historical roots [15], [16], an ecological approach to human factors was developed over the last 20 years, e.g., [17]–[22]. In the field of HCI, the ecological perspective made a breakthrough with Norman's popularization of the affordance concept [23] (but see [24] and [25]) and with the advent of the so-called ecological interface design (EID) framework, which was developed in the context of complex work domains such as plants, e.g., [26]. The EID framework (or its first stage, work domain analysis) has seldom been applied to telerobotics. When it was used, it provided thorough analyses of the cognitive aspects (e.g., tactical) of the work domain (e.g., unmanned aerial vehicles for military use [27] or search and rescue [28]). However, we know of few works (in or outside the field of teleoperation) that used EID to address the specific issues pertaining to the control of action—as distinguished from the control of process—which is at the core of most telerobotics applications. One notable exception is the work conducted by van Paassen, Mulder, and their colleagues on interface design for flight control in manned aircraft [29]–[31]. Besides the EID framework, the ecological approach to human factors and ergonomics also led to other notable developments of displays for aircraft pilots, e.g., [32].



In telerobotics, the potential benefits of using the ecological approach to address the issues raised by the remote control of action have been acknowledged [1], but ecological principles have rarely been explicitly applied to the design of interfaces for human–robot interaction (HRI),1 e.g., [33] and [34].

In the present paper, we focus precisely on how the ecological approach to perception and action might be used to address the prospective and online control of action in the context of teleoperation. After clarifying essential (but sometimes misused) ecological concepts (Part II) and analyzing how a teleoperation platform alters the agent–environment coupling (Part III), we propose a framework for interface design which aims at providing the operator with the relevant functional information (or at compensating for not providing him/her with that information; Part IV).

II. ECOLOGICAL APPROACH TO PERCEPTION AND ACTION

Compared to other psychological theories, the ecological approach is unique in taking as its entry point what information is available for perception, rather than how this information can be detected by an organism [35, p. 7]. On one hand, the "what" question points toward the purpose of perception. It is proposed that, when considering the perceptual support for action, we should look first for information which is functional (as opposed to general purpose, such as "3-D space"). On the other hand, the "what" question also points to the form in which information is available to the agent. Available information is assumed to lie in the structure of ambient energies which stimulate the perceptual systems of the agent.

In the following sections, we first address the available information about opportunities for action, both at the level of the A-E system which constrains action (Section II-A, affordances) and at the level of the invariant properties of stimulation which specify these A-E regularities (Section II-B, invariants). We then address the available information for controlling actions (online) while they are being executed (Section II-C, control laws). We conclude with an attempt at synthesizing the whole framework (Section II-D).

A. Affordances of the Agent–Environment System

At the first level of description, the ecological approach proposes that what an actor needs to perceive are not the meaningless physical characteristics of the environment (e.g., this object is 43 cm away, has a diameter of 6 cm, weighs 2 kg, etc.) but rather what he/she can or cannot do in a given situation (e.g., is this object reachable, graspable, liftable, "throwable," etc.). Such opportunities for action, which have been termed affordances, do not depend on mere properties of objects

1HRI covers a wide range of interaction schemes, which may differ according to the degree of autonomy of the machine (e.g., continuous human control, supervisory control, and cooperation), according to whether the machine's interface is the robot itself (permitting direct physical interaction) or a distinct console (as in teleoperation), according to whether the operator and the robot are colocated or distant, etc. Our present concern is with the teleoperation of robots of varying autonomy, primarily distant.

but on relations between properties of the environment and properties of the actor [5, p. 129], [36], [37]. For example, a surface which is sufficiently large relative to one's feet and sufficiently strong relative to his/her weight constitutes a platform on which he/she can stand. An affordance can thus be defined as a relational property of the agent–environment system [38].2 Therefore, an affordance is not a mental construct. It exists whether or not it is perceived. Being grounded in the physical description of the A-E system, an affordance is as real as the objects and agent it characterizes.

An affordance provides information about some environmental properties scaled in terms of the perceiver's own action system. It does not only define whether a given action is (not) possible but also whether it is efficient, comfortable, and so on. Formally, an affordance can be characterized with a dimensionless ratio (i.e., the so-called pi number). For example, Warren [40] noted that whether a stairway is climbable by an individual can be captured by the ratio R/L of riser height to leg length. If this ratio exceeds a certain value (0.88 in the case of bipedal climbing), called the critical point (or absolute critical boundary [41]; see the following discussion), the stairs cannot be climbed anymore, and a new mode of action must be used (e.g., quadrupedal climbing). The R/L ratio also has an optimal point (0.25) at which the energy expenditure associated with climbing the stairs is minimal. When experimental participants were shown stairs with different riser heights and were asked to judge whether they could climb the stairs and to rate how comfortable this would be, their responses were found to match the two previously identified values of the R/L ratio [40].
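As a rough illustration of how such a pi number can be used, the following Python sketch (our own, not code from the paper) classifies a stair-climbing affordance with Warren's R/L ratio and the critical (0.88) and optimal (0.25) points reported above; the function name and return format are illustrative assumptions.

```python
# Illustrative sketch: characterizing stair climbability with the
# riser-height/leg-length pi number from Warren [40].
CRITICAL_RL = 0.88   # absolute critical boundary for bipedal climbing
OPTIMAL_RL = 0.25    # energetically optimal riser-height/leg-length ratio

def climbability(riser_height_m: float, leg_length_m: float) -> dict:
    """Characterize the stair-climbing affordance with the pi number R/L."""
    pi = riser_height_m / leg_length_m
    return {
        "pi": pi,
        "bipedal_climbable": pi <= CRITICAL_RL,
        "distance_from_optimum": abs(pi - OPTIMAL_RL),
    }

# Example: a 0.17 m riser for an agent with a 0.80 m leg length.
print(climbability(0.17, 0.80))   # pi ~ 0.21: climbable, near the optimum
```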

Warren's study [40] focused on a single action mode (i.e., bipedal climbing). Mark et al. [41] extended this work by examining the perception of action boundaries in the context of two different modes. They investigated the perception of whether an object is within reach and, specifically, the transition from reaches performed with the arm only versus with the arm plus leaning the torso forward and/or twisting the shoulders (respectively termed one degree of freedom reaching, or 1-dof reaching, and multiple-dof reaching). The authors observed that, when the distance to the object is increased, participants switch from the first mode of action (1-dof) to the second (multiple-dof) before the absolute critical boundary is reached. They called this anticipated transition the preferred critical boundary and showed that it coincided with a transition in felt comfort, suggesting that it could correspond to the region where the first mode of action (1-dof reaching) becomes less comfortable than the second one (multiple-dof reaching). Besides energy expenditure and comfort, preferred and optimal points also depend on other situational/task constraints such as accuracy [42] or quickness and safety [43]. The gap between preferred and absolute critical boundaries can thus be considered as a margin of safety retained by the actor, e.g., [41] and [43]. Fig. 1 summarizes all of these notions with a schematic diagram.

2While the principle of complementary A-E dispositions is largely agreed upon, there is an ongoing debate on the exact ontological status of affordances. The interested reader may refer, for example, to [39] and two special issues of Ecological Psychology, #12(1) and #15(2).


Fig. 1. Action boundaries and optimal points characterizing affordances. Given a particular action context (e.g., stair climbing or reaching), a variable capturing a relevant situational/task constraint (e.g., energy expenditure, discomfort, and danger) is plotted for two different action modes (e.g., bipedal/quadrupedal climbing or 1-dof/multiple-dof reaching) as a function of a relational property of the A-E system (e.g., riser height/leg length or object distance/arm length). See text for details; the figure is abstracted from the affordance literature, particularly [40] and [41].

Of course, the ability of an agent to perform an action does not depend on the sole geometrical characteristics of his/her body relative to the geometrical properties of the environment. Trivially, an agent wearing a heavy backpack may not be able to climb stairs of the same riser height as when not wearing the backpack (or not with the same efficiency, etc.). An affordance can be characterized in different ways whose relevance depends on context. Formally, this can be achieved by using different types of pi numbers which capture the influence of different types of constraints on action [44]: geometrical (length combinations), kinematic (length and time combinations), and kinetic (length, time, and force combinations).

B. Invariants in Ambient Energy Arrays

As Gibson underlined, "the central question for the theory of affordances is not whether they exist and are real, but whether information is available in ambient light for perceiving them" [5, p. 140]. Thus, at a second level of description, the ecological approach proposes that the structures of ambient energies which stimulate an agent's perceptual systems specify the regularities of the A-E system. These structures, which extend within and across optics, acoustics, haptics, etc., are called invariants. For example, light propagation is constrained by the constitutive substance of objects and the layout of their surfaces. As a consequence, the structure of light converging at a point of observation, or optic array, contains information about the nature of the objects, their shape, and their spatial layout relative to the perceiver. For example, when an object's texture is approximately regularly distributed, the relative density of texture elements in the optic array (the so-called texture gradient) specifies the slant of the surface relative to the point of observation [45, p. 77]. Similarly, the specific manner in which sound waves propagating from a source are altered by the listener's head and external ear contains information about the direction of the source relative to the head [46, p. 97].

1) Persistence Over Change: Among the most important informational structures in ambient energies, many result from changes in the agent–environment system. First, change is ubiquitous in the A-E system, principally in the form of movements, both on the side of the perceiver and on the side of external objects. In particular, an agent is never completely at rest (i.e., stationary). Even the most elementary aspects of its behavior, such as breathing and controlling posture, involve movements which, in turn, generate information. Second, change is an extremely rich source of information because it reveals the underlying structure: "what is invariant does not emerge unequivocally except with a flux. The essentials become evident in the context of changing nonessentials" [5, p. 73].

The dynamic structure of light variations that converge onto a point of observation, or optic flow [8], [47], [48], offers many obvious examples of these movement-induced patterns. For example, the global flow rate is directly related to the velocity of the point of observation relative to the illuminated environment [16], [49, p. 805], while the relative rate of optic flow elements (i.e., the so-called motion parallax) specifies the (dis-)continuity and slant of surfaces relative to the point of observation [50]. If the point of observation translates, the flow is radial in the direction of motion and laminar in the orthogonal direction. The center of the radial pattern, called the focus of expansion, corresponds to the direction of displacement [8]. Thus, the flow structure specifies the direction toward which one is moving. Among other well-documented structures, the relative expansion rate of an object's projection in optic flow, the so-called τ variable, specifies the first order time-to-contact, i.e., the time remaining before contact if the relative speed of the point of observation and object were to be maintained [51], [52]. Of course, similar informative structures exist in all ambient energies stimulating the senses. For example, when a source is emitting a constant sound, the relative rate of change of the sound pressure or intensity at the ears (the so-called acoustic τ) conveys information about the source's time-to-contact and/or distance [53], [54]. In haptic stimulation, the inertial moments of an object held or wielded convey information about how much the tool extends beyond where it is grasped [55].
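The τ variable lends itself to a simple numerical illustration. The sketch below (our own simplification, not code from the paper) estimates the first order time-to-contact from the relative expansion rate of an object's optical projection, using a finite-difference approximation over two successive image frames.

```python
# Illustrative sketch: first-order time-to-contact from the relative expansion
# rate of an object's projection (the tau variable), estimated here by finite
# differences between two frames. The discretization is our own simplification.

def time_to_contact(theta_prev: float, theta_curr: float, dt: float) -> float:
    """tau = theta / (d theta / dt), with theta the optical angle (in radians)
    subtended by the object."""
    theta_dot = (theta_curr - theta_prev) / dt
    if theta_dot <= 0.0:
        return float("inf")   # projection not expanding: no impending contact
    return theta_curr / theta_dot

# An object whose image grows from 0.050 rad to 0.055 rad in 0.1 s is roughly
# 1.1 s from contact if the closing speed were maintained.
print(time_to_contact(0.050, 0.055, 0.1))
```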

The structures of ambient energies do not only convey spatial information. For example, by simply seeing the motion of a few patches attached to the articulations of a human body (the so-called point-light display), humans can perceive information as diverse as the action being performed, the morphology, age, or gender of the actor, or even his/her intention [56], [57].

2) Specification of Affordances: One important claim of the ecological approach is that affordances are specified, i.e., that the structures of ambient energies map one to one to the relational properties of the A-E system which define what the agent can and cannot do. Given that the agent's own body and actions also participate in the structuring of energies, it is argued that there exist higher order invariants in ambient energy arrays which provide body-scaled and action-scaled information.

For example, the height of the point of observation relative to the ground is an important constraint on optic flow and thus could play a major role in providing body-scaled information [5, p. 164], [47], [58].


Fig. 2. Eye-height-based optical specification of "passability" (i.e., whether an aperture can be passed through). For an agent facing the aperture, the optic array specifies the width of the aperture in units of his/her own eye-height and, thus, up to a scaling constant, in units of his/her own shoulder width (adapted from [43]).

As an illustration, consider the information available about passing-ability. Fig. 2 shows an agent standing on flat horizontal ground and facing the middle of an aperture. In that case, the relation between β, the angular declination below the horizon of the doorstep, and α, the angular horizontal width of the object, specifies the object's width W in units of eye-height H [43]:

W/H = 2 tan(α/2) / tan(β).    (1)

If we assume that standing eye-height bears a stable relation to other anthropometrical characteristics such as shoulder width S (e.g., H = c·S), it follows that the optic array specifies, up to a scaling constant, the width of an aperture in units of shoulder width. In a similar fashion, the optic array under the constraint of eye-height specifies the distance of an object lying on the ground or the height of a riser in units of eye-height. Empirical studies have shown that participants exploit the constraint of eye-height on optic flow to judge whether they can pass through an aperture [43], pass under a barrier [59], or sit on a stool [60].
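For illustration, the eye-height relation in (1) can be turned into a small passability check. The sketch below is our own; the body-scaling constant c (eye-height divided by shoulder width) and the safety margin are assumed example values.

```python
# Illustrative sketch of (1): recovering aperture width in eye-height units
# from the declination of the doorstep below the horizon (beta) and the
# aperture's horizontal angular width (alpha), then comparing it with shoulder
# width. The scaling constant c and the margin are example assumptions.
import math

def aperture_width_in_eye_heights(beta: float, alpha: float) -> float:
    """W / H = 2 * tan(alpha / 2) / tan(beta), angles in radians (see (1))."""
    return 2.0 * math.tan(alpha / 2.0) / math.tan(beta)

def affords_passage(beta: float, alpha: float, c: float = 3.5,
                    margin: float = 1.1) -> bool:
    """Passable if aperture width exceeds shoulder width (H = c * S)
    by a safety margin."""
    w_over_h = aperture_width_in_eye_heights(beta, alpha)
    w_over_s = w_over_h * c          # width in shoulder-width units
    return w_over_s >= margin

print(affords_passage(beta=math.radians(20), alpha=math.radians(9)))
```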

It is worth noting that informative structures are not only available within individual energies but also extend across energies, either of the same type (e.g., binocular or binaural invariants) or of different types (intermodal invariants) [61]. For example, the interaural time difference and level difference contain information about the direction of a sound source relative to the listener [46, p. 137]. Given that the relational properties of the A-E system which constrain actions often affect multiple ambient energies simultaneously, intermodal invariants could play a particularly important role in the specification of affordances. For example, humans accurately judge whether they can reach a visible object with a rod, even when they can wield the rod but not see it, suggesting that they are sensitive to information about "reachability" available across optic and haptic stimulation [55], [62].

C. Laws of Control of Movement

The specification of affordances provides an informational basis for perceiving which actions can be performed, at which cost for the organism, and so on. Besides this first, prospective dimension of control [39], the agent also needs information for regulating his/her actions while they are being executed. The proponents of the ecological approach argue that guiding actions with respect to environmental goals does not necessarily require constructing a 3-D (internal) representation of the world and planning movements in detail. Instead, it is proposed that actions can be regulated "online" by exploiting (task-)specific mappings between control variables3 and informational variables, termed control laws [8], [37], [64].

In a seminal article [8], Gibson described a complete set of such laws that could be used to control locomotion (starting, stopping and backing up, pursuing and fleeing, etc.). He proposed, for example, that steering toward a visible goal can be achieved by continuously adjusting one's direction of motion so as to maintain the optical focus of expansion aligned with the goal's projection in the optic array—while avoidance would consist in keeping the focus of expansion outside of the obstacle's projection. Similarly, to steer toward a sound source, an agent can move so as to simultaneously minimize the difference in sound intensities at his/her ears and increase the overall sound intensity [35, p. 83]. Among popular control laws, the so-called constant bearing and constant absolute direction strategies have been used for years by ship navigators to move out of a collision course but also by bats to ensure a successful catch of their prey [65]. Perhaps the most striking examples of control laws which obviate the need for mental maps and action plans are the so-called balance strategy and avoid-closest strategy [66]. They show that, without recovering any depth information, a small flying robot is capable of avoiding obstacles by simply: 1) equating the average optic flow rate on the left and right halves of the camera image; or 2) turning away from the direction where the optically specified time-to-contact is the lowest.
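A minimal sketch of the balance strategy mentioned above is given below (our own illustration, not the controller of [66]): the yaw command is made proportional to the difference between the average optic flow magnitudes in the left and right halves of the image, so the robot turns away from the side with the larger flow.

```python
# Illustrative sketch of the "balance strategy": equate the average optic flow
# magnitude in the left and right image halves by steering away from the side
# with the larger flow (the nearer surfaces). Names and gain are our own.
import numpy as np

def balance_steering(flow_magnitude: np.ndarray, gain: float = 1.0) -> float:
    """Return a yaw-rate command (rad/s) from a 2-D map of optic flow
    magnitudes (rows x cols). Positive = turn right."""
    cols = flow_magnitude.shape[1]
    left = flow_magnitude[:, : cols // 2].mean()
    right = flow_magnitude[:, cols // 2 :].mean()
    # More flow on the left means obstacles are closer on the left:
    # steer right (positive command), and vice versa.
    return gain * (left - right)

# Example: a flow field with stronger motion on the left half.
flow = np.hstack([np.full((10, 10), 2.0), np.full((10, 10), 0.5)])
print(balance_steering(flow))   # positive: turn away from the left obstacle
```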

Fajen and Warren [67] (see also [64] and [68]) formalized a control law for steering toward goals and avoiding obstacles in which the turning rate (φ̇) is regulated as a function of the object's distance d and of the object-relative heading specified in optic flow (φ_O − ψ) and across haptic/gravitoinertial and optic flows (φ_HGO − ψ). The version presented in the following corresponds to a situation with one goal (g) and one obstacle (o). In principle, it could be extended to more obstacles by duplicating the obstacle term (i.e., the second line):

φ̇ = −k_g [(φ_HGO − ψ_g) + wv(φ_O − ψ_g)] (e^(−c_1 d_g) + c_2)
    + k_o [(φ_HGO − ψ_o) + wv(φ_O − ψ_o)] (e^(−c_3 |φ − ψ_o|)) (e^(−c_4 d_o))    (2)

where k_g and k_o represent, respectively, the attractiveness of the goal and the repulsiveness of the obstacle,

3In accordance with dynamic systems theory, it is assumed that the action system self-organizes under task constraints such that its various neuromusculoskeletal degrees of freedom reduce to a few free parameters, or control variables, that need to be regulated [9], [63].


Fig. 3. Perception for action and action for perception in the agent–environment system (see text for details).

v is the agent's speed, w accounts both for the amount of optical structure and for the size of the field of view, and the constants c_i define the exponential rate of decay and minimal value of the distance's influence on steering.
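For concreteness, the following sketch numerically evaluates a simplified, purely visual version of (2) (our own simplification: the haptic/gravitoinertial and optical heading terms are collapsed into a single heading error, and the weighting factor is absorbed into the gains); the parameter values are examples, not those fitted by Fajen and Warren.

```python
# Illustrative numerical sketch of the steering dynamics in (2), reduced to a
# single heading error per object. All constants are example values.
import math

def turning_rate(phi, psi_g, d_g, psi_o, d_o,
                 k_g=7.5, k_o=200.0, c1=0.4, c2=0.4, c3=6.6, c4=1.6):
    """Turning rate (rad/s) toward a goal at bearing psi_g (distance d_g)
    while avoiding an obstacle at bearing psi_o (distance d_o)."""
    goal_term = -k_g * (phi - psi_g) * (math.exp(-c1 * d_g) + c2)
    obstacle_term = (k_o * (phi - psi_o)
                     * math.exp(-c3 * abs(phi - psi_o))
                     * math.exp(-c4 * d_o))
    return goal_term + obstacle_term

# Heading 0.2 rad to the right of a goal 5 m away, obstacle slightly to the
# left and 2 m away: both terms push the turning rate in the same direction.
print(turning_rate(phi=0.0, psi_g=0.2, d_g=5.0, psi_o=-0.1, d_o=2.0))
```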

A great number of studies have also investigated how the optical variable τ could contribute to the control of actions such as braking or intercepting a mobile object. For example, Lee [52] showed that an agent can achieve a smooth contact with an object by regulating its braking in such a way that the first temporal derivative of τ is kept at a certain value (−0.5). Later, Yilmaz and Warren [107] observed that, instead of maintaining a steady state, experimental participants appeared to be adjusting the brake position x as a function of the deviation of τ̇ from a reference value (−0.52):

Δx = b(−0.52 − τ̇) + ε    (3)

where b is a scaling constant and ε is a noise term.
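A minimal sketch of the brake-adjustment rule in (3) follows; the gain b, the noise model, and the function name are our own illustrative assumptions.

```python
# Illustrative sketch of (3): nudge the brake position in proportion to the
# deviation of tau-dot from the reference value -0.52. Gain and noise model
# are assumptions; b would be fitted to the data in [107].
import random

def brake_adjustment(tau_dot: float, b: float = 0.5,
                     noise_sd: float = 0.01) -> float:
    """Delta x = b * (-0.52 - tau_dot) + epsilon."""
    return b * (-0.52 - tau_dot) + random.gauss(0.0, noise_sd)

# tau_dot = -0.70 (between -1, no braking, and the reference -0.52) means the
# current deceleration is insufficient: the rule returns a positive adjustment,
# i.e., press the brake harder.
print(brake_adjustment(-0.70))
```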

Control Rules: We propose to complement this formalism by adding a new level of description, called control rules. In control rules, the control variables are expressed as a function of the relational properties of the A-E system which constrain action, in place of the invariants of ambient stimulation which specify these constraints (as is the case for control laws).

Within this formalism, φ̇ in (2) would be expressed as a function of distance and object-relative heading, and Δx in (3) as a function of the rate of change of the first order time-to-contact, regardless of how these properties are specified in ambient stimulation (e.g., in optics, in acoustics, or across different ambient energies).

The reason for adding this new level of description is twofold. First, it provides a unified (two-level) framework for characterizing the available information about action opportunities (i.e., affordances and specifying invariants) and the information for regulating actions while they are executed (i.e., control rules and control laws). Second, it could prove particularly useful in the context of HCI, and of telerobotics in particular, because these induce a decoupling between the properties constraining actions and the ambient energy structures specifying these constraints (a point to which we will return later).

D. Synthesis

Fig. 3 summarizes the information available to an agent for controlling his/her action. At the right side of the figure, the two aspects of control are represented in two different columns. Specifically, it is proposed (see also [37]) that, for each action mode, there exists information about the opportunity to perform the action, its efficiency, etc. (column 1), and information for regulating the free parameters of the action system while the action is executed (column 2). For example, the action mode "walking through" requires information about whether the aperture can be passed through (1) and information for controlling locomotion online relative to the (edges of the) opening (2). Then, it is proposed that, for both aspects of control, the characterization of available information involves two distinct levels of description, represented with the two horizontal boxes in the upper part of the figure: first, the level of the relational properties of the A-E system at which action is constrained (box a) and, second, the level of the invariant properties of ambient stimulation which specify the control-relevant A-E properties of the first level (box b).

One central claim of the ecological approach is that the relation between the two levels is one of specification. Specification refers to a lawful 1:1 mapping between invariants of ambient stimulation and the properties of the A-E system that give rise to them. Accordingly, information available to an agent is assumed to be unambiguous. However, this does not mean that all affordances and control rules are specified all the time. For example, some objects might occlude others, a room might be dark, and a place might be filled with fog.


More generally, some properties of an object will not affect the optic, acoustic, or haptic flows stimulating the agent's perceptual systems unless he/she rotates the object, knocks it, wields it, or scrapes its surface. It follows that perception shall not be conceived as a linear process (going from the top to the bottom box of Fig. 3) but as a circular process in which the agent adjusts his/her actions until the invariants of ambient stimulation specify the information sought: in Fig. 3, perception is spread over the whole loop which goes from box (b) down to the bottom and then follows the dotted arrows leftward to the A-E system and then rightward, back to box (b). The two rightward dotted arrows emphasize the dual roles played by the agent's action: performatory and exploratory [7, p. 80]. Altogether, the notions of affordance, invariant, control rule, and control law emphasize the mutual dependence of perception and action, or the so-called perception–action cycle [9, p. 88].

III. TELEOPERATION AS TOOL-MEDIATED INTERACTION

Teleoperation falls into the category of instrumented interactions. A teleoperation platform is a tool. Tools vary greatly in complexity, ranging from projectiles, hammers, or magnifying glasses to cars, sonars, or computers, but they all share the same function. As noted by Shaw et al. [44, p. 306], "tools enhance, extend, or restore the action or perception capabilities of humans and animals."

By mediating the relation of an agent to his/her environment, a tool simultaneously alters the A-E properties constraining actions (Fig. 3, box a) and the invariants in ambient stimulation specifying these properties (Fig. 3, box b). In Sections III-A and B, we first consider the implications of each of these levels for teleoperation. Then, we emphasize the two different perspectives that they represent for design (Section III-C).

A. Alteration of Affordances and Control Rules

A tool changes the layout of affordances. It provides new opportunities for action and withdraws other opportunities that were previously afforded. An agent holding a hammer gains new opportunities for hitting things, for reaching farther objects, etc. At the same time, he/she may not be able to grasp a glass of water anymore or to explore a surface with his/her hand in search of a light switch. Similarly, by teleoperating a robotic arm, a technician gains the opportunity to assemble pipes underwater, or a disabled person gains the opportunity to reach and grasp distant objects. However, while doing so, these operators may not be able to simultaneously sign a register, drive a car or a wheelchair, read a book, etc.

When considering the affordances of an actor–tool–environment system (A-T-E system), a first distinction can be made as a function of whether the tool is manipulated or not. The opportunities for action granted by a tool considered as a detached object of the environment, A-(T-E), are not the same as the opportunities for action granted by the same tool considered as a functional extension of the actor's action system, (A-T)-E [44]. For example, a hammer lying on a table can afford picking up, while when firmly held in the hand, it can afford hammering. This dual function of a tool can be reformulated within the more general framework of action mode nesting, e.g., [22] and [37]. We propose to distinguish three different kinds of nesting: sequence, concomitance, and hierarchy.

Fig. 4. Different types of action mode nesting illustrated in the context of opening a door (top) and revealing window content on a GUI (bottom). Nested action modes can be sequential (a) or concomitant (b), or they can embrace each other and form a hierarchy (c) (see text for details).

They are shown in Fig. 4 with the now classic examples of a door with a pivoting handle and of a scroll-bar on a GUI [23], [69]. Sequence refers to the nesting of action modes in time (Fig. 4, boxes a) [69]. For example, a horizontal door handle first affords grasping to the actor; then, once grasped, it affords pressing downward; and, once pressed, it affords pulling. Similarly, in a scroll-bar, the moveable box first affords grabbing to the user, and when grasped, it affords dragging.4

Concomitance designates the relation between action modes which are simultaneous but nevertheless dependent on each other (Fig. 4, boxes b). For example, if a door is not locked, a door handle affords both pulling the handle and pulling the door. Similarly, if appropriately programmed, a grasped scroll-bar box affords both dragging the box and dragging the content of the linked window. Lastly, hierarchy refers to the relation by which an action mode embraces other modes (Fig. 4, boxes c) [22]. For example, the action mode "picking up a hammer" can be decomposed into the lower order modes of reaching, grasping, and lifting the hammer. Similarly, grasping a door handle, pressing it downward, and pulling it are components of the higher order action mode of opening a door, while grabbing and dragging a scroll-bar are part of the higher order action mode of revealing new content [69].
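One way to keep track of these three kinds of nesting during analysis is to record them explicitly in a small data structure. The sketch below (our own, not a structure proposed in the paper) encodes the door-handle example with sequence, concomitance, and hierarchy links.

```python
# Illustrative data structure for recording the three kinds of action-mode
# nesting (sequence, concomitance, hierarchy), using the door example.
from dataclasses import dataclass, field

@dataclass
class ActionMode:
    name: str
    components: list = field(default_factory=list)        # hierarchy: sub-modes
    followed_by: list = field(default_factory=list)       # sequence
    concomitant_with: list = field(default_factory=list)  # concomitance

grasp = ActionMode("grasp handle")
press = ActionMode("press handle downward")
pull_handle = ActionMode("pull handle")
pull_door = ActionMode("pull door")

grasp.followed_by.append(press)                  # sequence: grasp, then press
press.followed_by.append(pull_handle)            # sequence: press, then pull
pull_handle.concomitant_with.append(pull_door)   # concomitance: both at once

open_door = ActionMode("open door",              # hierarchy: embraces the others
                       components=[grasp, press, pull_handle, pull_door])
```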

Importantly, a door handle can be pressed only if it can be reached and grasped first, and at the same time, the handle will not be grasped in the same way if it is grasped to press it downward (to open the door) or to hold it up (to prevent someone from opening the door). These two examples underline that sequential action modes do not merely follow each other in time: within a sequence, each mode is contingent on both its predecessor and its successor. The transition from one mode to the other does not occur at a predetermined moment in time

4Recall that we are discussing affordances (i.e., what the agent can or cannot do) and not the availability of information about these affordances. Hence, the scroll box refers here to a functional area (i.e., an area programmed such that it can be clicked, grabbed, and dragged) and not to something displayed on a screen that would make this area visible. The opportunity to drag a box does not depend on whether there is information about that opportunity. The box within a scroll-bar can be dragged even when the computer screen is off.


Fig. 5. Redefining the two levels for describing the information available to the agent in the context of robot teleoperation (see text for details).

but (if and) when the second mode becomes afforded or more efficient than the first, etc. (see Section II-A; [37]). Similar causal dependencies exist between concomitant modes and between modes within a hierarchy, and they also have important consequences on both affordances and control rules. For example, when revealing content on a GUI, the action mode "moving the mouse" is constrained by the concomitant mode "dragging content" in such a way that hand movement can be regulated online on the basis of the induced scrolling of the content on the screen.

In telerobotics, it is the robot which ultimately acts in and on the remote environment (e.g., manipulating objects and revealing their characteristics). Thus, it may seem at first that what the operator needs to know are the affordances of the robot–environment system (R-E system). Can the mobile robot pass through that opening? Is that object within reach of the robot's arm? Such R-E functional fits can definitely not be ignored. However, they are not sufficient for teleoperation.5 If the operator is to control the robot's actions, he/she will not only need to know what the robot can do but what he/she can make the robot do [70]. Can I drive the mobile robot through that opening? Can I reach that object with the robotic arm?

The nesting of action modes is further increased in telerobotics because the tool is distributed among a command console and a robot. Fig. 5 takes up the top left part of Fig. 3 and applies it to the case of teleoperation. As illustrated in box (a), the operator's opportunities to act on the remote environment (through the platform) are constrained by the properties of the agent–console–robot–environment system (A-C-R-E system). An affordance (or a control rule) of the A-C-R-E system depends simultaneously on the relational properties of the robot–environment subsystem (R-E), on the relational properties of the agent–console subsystem (A-C),6 and on the existence of an adequate mapping between the actions granted by these two sets of constraints. For example, the opportunity to drive a robot uphill is constrained by:

5This could be enough if the operator were to team up with an autonomous robot in order to achieve a collaborative task or if the operator were to delegate some actions to the robot, a point to which we will return later.

6By "console," we refer here to both the console itself and the operator's surroundings.

1) the relation between the power of the robot and the slant of the path; 2) the size and stiffness of the console's joystick relative to the operator's size and strength; and 3) the characteristics of the mapping (if any) between the operator's opportunities for manipulating the joystick and the robot's opportunities for climbing the hill. Each type of constraint does not only affect which actions can or cannot be performed but also how efficient and comfortable these actions are (for the operator; cf. Section II-A).
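The following sketch (our own, with assumed field names and thresholds) illustrates how such an affordance of the A-C-R-E system can be expressed as the conjunction of the three kinds of constraints just listed: an R-E constraint, an A-C constraint, and the existence of an adequate mapping between the two.

```python
# Illustrative sketch: the "drive the robot uphill" affordance of the A-C-R-E
# system as a conjunction of R-E, A-C, and mapping constraints.
from dataclasses import dataclass

@dataclass
class DriveUphillContext:
    slope_deg: float                 # R-E: slant of the path
    robot_max_slope_deg: float       # R-E: steepest slope the robot can climb
    joystick_force_n: float          # A-C: force needed to deflect the joystick
    operator_max_force_n: float      # A-C: force the operator can comfortably exert
    joystick_mapped_to_drive: bool   # A-C <-> R-E mapping exists and is active

def affords_driving_uphill(ctx: DriveUphillContext) -> bool:
    robot_can_climb = ctx.slope_deg <= ctx.robot_max_slope_deg
    operator_can_command = ctx.joystick_force_n <= ctx.operator_max_force_n
    return robot_can_climb and operator_can_command and ctx.joystick_mapped_to_drive
```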

An important concern with such a complex tool is that the mapping between the actions performed by the operator and their consequences in the remote environment can change across teleoperation sessions or even within a session. This will be the case, for example, when an obstacle avoidance system (or any other assistance) is turned on, when the console's inputs (keyboard, joystick) are changed, when transmission delay changes, when a tire is underinflated, etc. In some cases (e.g., when the batteries are low), these alterations can be informative, to the extent that they are specified (see Section III-B). However, in all cases, these alterations also change the control rules and affordances of the A-C-R-E system, forcing the operator to rediscover them.

B. Alteration of Specifying Invariants in Ambient Stimulation

In addition to changing the opportunities for action, a tool also alters the structure of the ambient energies which stimulate the perceptual systems of the agent and, thus, the information available to the agent. On one hand, this alteration is partly functional, as it allows the new (in-)opportunities for action and the new control rules to be specified. On the other hand, this alteration also restrains the available perceptual information.

In the case of teleoperated robotics, the alteration of available perceptual information is all the more important because—again—the tool is split up into a command console and a robot. As shown in Fig. 5 (box b), the availability of information specifying what the operator can (make the robot) do depends on the availability of patterns, within ambient stimulation, which specify the relevant properties of the A-C-R-E system. Within the A-C subsystem, the physical characteristics of both the console and the operator participate in the structuring of the ambient energies which stimulate the operator's perceptual systems.


It follows that the information about what the operator can do with the console's inputs (reaching, grabbing, and moving a mouse or a joystick, reaching a screen, etc.) can, in principle, be directly available in the structure of ambient stimulation. As concerns the R-E subsystem, the physical characteristics of the robot and its environment also participate in the structuring of ambient energies. However, as the operator and the robot are often not physically located in the same place, these structures may not reach the operator's perceptual systems (e.g., if the robot is not in direct line of sight, not within hearing distance, etc.). In that case, the availability of information about what the robot can do depends on the ability of the robot (or any other entity) to gather the relevant information (e.g., by sampling the specifying energy structures), on the capability of the communication system to transmit this information, and on the capability of the console to render it appropriately to the operator. Similarly, whether and how the mapping between A-C's and R-E's functions will influence the structure of ambient energies stimulating the operator is also contingent on the robot's sensors (camera, laser, sonar, etc.), the console's displays (screen, speaker, force-feedback joystick, etc.), and the communication between the two.

Exploratory Activity, Multimodality, and Specification: Besides the aforementioned aspects, there are at least three transverse aspects that are worth mentioning. First, the information available to the operator also depends on the opportunities for action granted by the teleoperation platform. As we argued earlier (see Sections II-B and D), movement is central to perception, and thus, any limitation on the operator's ability to control the robot's body and sensors is also a restriction on his/her ability to create information [62], [71]–[73]. For example, a displayed video appears all the more flat and devoid of information about spatial layout when it is impervious to the operator's scrutiny. Typically, moving the head will not reveal the objects that are occluded; looking successively at objects lying at different depths will not change the distance of convergence of the lines of sight, nor will it sharpen the contour of the focused object (or blur the contour of other objects). Second, the teleoperation platform does not only affect the structure of individual ambient energies (optics, haptics, acoustics, etc.) but also alters the information available across energies. Even if an operator is provided with coordinated patterns in optic and in haptic/gravitoinertial flows (e.g., by wearing an HMD and driving the robot camera with his/her head), the availability of information about the scale of the scene will depend on the gain between the two patterns of stimulation [62]. Third, the teleoperation platform also questions the principle of specification, i.e., that there exists a lawful 1:1 mapping between A-C-R-E properties and the structure of stimulation. In complex tools such as teleoperation platforms, the mapping is mediated by complex mechanical and electronic devices.7

7Interestingly, several GUI design choices can be viewed as attempts to artificially restore a seemingly unmediated mapping between ambient energy structure and afforded actions. For example, in a scroll-bar, the empty areas extending above and below the cursor, but not on its left and right, indicate that the box can be dragged upward and downward but not side to side [69].

As a consequence, the mapping between A-C-R-E properties and the resulting structure in ambient energies can change from one platform to another or even from one session to the next (e.g., if the camera orientation has changed, if one sensor is off or not calibrated, etc.). This is a major concern because, in the absence of specification (i.e., if the mapping is 1:many, many:1, or many:many), the structure of ambient stimulation is ambiguous relative to the opportunities for action and to how actions can be regulated.

C. Different Entry Points for Designers

The designer of a teleoperation platform can step in at different levels that shall not be confused. At a first level, the level of affordances and control rules, he/she can work on laying out an appropriate functional fit within the agent–console–robot–environment system. As noted by McGrenere and Ho [25], a designer working at this first level addresses both usefulness and usability. He/she first addresses usefulness because the relevance of having the system afford a particular action depends on the appropriateness of the action relative to the goals of the platform. He/she also addresses usability because the relevance of affording an action also depends on the cost (e.g., energy expenditure) of performing this action for the operator.

At a second level, the level of invariants of ambient stimulation, the designer can work on the information available to the operator for controlling the robot. In other words, he/she can work on the means by which the structure of ambient stimulation can be made to specify the control-relevant properties of the A-C-R-E system of the first level. When working on the specification of affordances and control rules, a designer thus addresses another aspect of usability, the availability and salience of the information which is necessary for controlling the robot [25]. He/she further addresses usability through the cost associated with the exploratory activity which generates the information.

The two levels are, of course, not independent of one another. However, as we noted at the end of Section III-B, in the case of telerobotics, there is no guarantee that the relation between control-relevant A-C-R-E properties and the invariants in ambient stimulation will be one of specification (i.e., a 1:1 mapping). The existence and consistency of this mapping, on which its tractability and the operator's ability to exploit it depend, are in large part up to the designer (hence the importance of standards [74]).

IV. DESIGNING INTERFACES FOR TELEOPERATED ROBOTICS

A. Data Versus Information

An important concern in research on HRI for robot teleoperation is the severe alteration of the information available to the operator and its detrimental consequences on task performance and felt comfort [1]–[3]. As a consequence, HRI designers are faced with the question of determining what types of sensors and displays are needed and how the data should be presented.

Traditional interfaces for robot teleoperation have classically used a different display to represent the data provided by each sensor (termed a conventional 2-D interface in [33]).


As researchers worked on improving teleoperation performance, this so-called philosophy of one-measurement-one-display (or single-sensor-single-indicator) led to a proliferation of displays and alarms (e.g., video, laser, sonar, robot pose, energy gauge, compass, etc. [33]; the argument was initially made in the context of process control; see [20] and [22]). Recently, however, there has been an important trend in the HRI design community toward combining data from several sensors within a single display (e.g., overlaying heat and sound data on video [75], fusing laser, sonar, and video data [76], integrating video with a 3-D reconstruction of laser data and/or a known map [33], [77]–[80], and integrating images from different cameras [34]). Still, in most cases, the content of the interface did not change (i.e., the same set of data was displayed); only the form in which data were presented was modified (typically, by representing them within a common reference frame—or at least within close frames).8

1) Functional Perspective: We propose to shift the primary focus from how data should be presented to what data should be presented [73] (see also [33]) and hence to move from designing displays of data to designing displays of information [20], [22]. Given our ecological perspective, information has here to be understood as behaviorally relevant (i.e., specific to the actor, task, and environment). We argue that the primary objective of console displays is to provide the operator with the information that is relevant to achieve the actions for which the teleoperation platform is intended. We further argue that this objective is the cornerstone of the whole interface design process, the point on which all other decisions depend. As Flach et al. [81, p. 297] noted, "the limitations of technologies, humans, and control systems are important considerations—but the significance of these limitations can only be appreciated relative to the functional demands of a work domain." To underline this central aspect and to contrast it with other approaches to design whose primary focus is on structure rather than on function (e.g., technology-centered, user-centered, and control-centered), these authors labeled the ecological perspective use-centered [81].

2) Behavioral Fidelity, Not Presence: In mediated perceptual experiences, the concept of presence (or telepresence) refers to the impression or subjective feeling of being in the remote environment, to an illusion of reality and of non-mediation [82]–[84]. An important corollary of the functional perspective promoted here is that presence does not appear to be a relevant criterion for designing teleoperation interfaces (see also [1], [85], and [86]). Building on the literature on virtual reality, we argue that there are at least three reasons for this. First, there exists no evidence that the subjective feeling of presence is a prerequisite for behavioral performance [86]. On the contrary, it has been shown, for example, that using wireframe versus photorealistic computer graphics (CG) had no significant influence on humans' ability to judge distance in virtual environments [87].

8In [33], [75], [79], and [80], the video and laser/map data on the interface still belong to two different frames, and thus, their contents remain incoherent.

Similarly, humans can accurately judge whether a virtual object is within reach or not, even though they are perfectly aware that this object is immaterial and thus cannot, properly speaking, be "reached" [62]. Therefore, if one is interested in behavioral performance (task success, efficiency, comfort, etc.), achieving a subjective feeling of presence does not appear to be necessary. Second, it is very unlikely that operators will ever be fooled into taking the interfaced world as being their real (uninterfaced) surroundings [86]. The reason is that any alteration of the structure of ambient stimulation caused by the teleoperation platform does not only potentially alter the information about the interfaced world: it also specifies the interface itself (i.e., the teleoperation platform). Thus, eliciting a sensation of presence is not a realistic goal to pursue, at least within the next decades. Third, any research on interface design that restricts itself to trying to improve stimulation correspondence (instead of information correspondence) will, in all likelihood, also restrict itself to providing the operator with the same means for controlling actions that he/she could have used in unaided situations [88]. Thus, achieving a subjective feeling of presence restrains the repertoire of solutions and thereby potentially underutilizes technology.

It follows that, from an ecological point of view, the challenge is not to design an interface that would be invisible but rather to create an interface that would be functionally transparent, i.e., transparent to the informative patterns that are relevant to achieving the task at hand [22], [86].

B. Proposed Framework

Determining the information the operator needs to control the robot is a complex task, but only once this is done can we devise how this information can be sampled and rendered to the operator—or any other means to allow the operator, aided by the teleoperation platform, to achieve the intended actions [20], [22], [88]. We propose that the overall process can be divided into three steps: a first step of identifying all the action modes that the A-C-R-E system should afford, a second step of characterizing these modes with the control-relevant properties of the A-C-R-E system, and a third step of implementing these action modes, which will principally consist in modifying the platform such that the ambient energies stimulating the operator specify these A-C-R-E properties. In the terminology of Vicente and Rasmussen [22], the first two steps aim at identifying the content and structure of information, while the third step is concerned with the form in which this information is given to the operator. In essence, steps 1 and 2 are similar to the work domain analysis used to identify constraints on process control in cognitive systems engineering (for example, as part of the EID framework, e.g., [22], [26], and [27]). However, in the present case, our concern is with the control of action, not process. Notably, this implies taking into account the constraints arising from the operator (Section III-A). In the following, we detail each of these steps.

1) Identifying Action Modes: The first step deals with identifying all the affordances of the A-C-R-E system that need to be realized (i.e., the actions that need to be performed) to fulfill the goals for which the teleoperation platform is intended. This first step thus addresses the following questions: what is the purpose of the teleoperation platform? What is it designed for?


What should it allow the operator to (make the robot) do? Given the nested structure of action modes and its consequences on control (see Section III-A), the purpose of this first step is not just to build a list of all the action modes that the system should afford but also to identify and keep track of the nesting relations between actions [22]. An abstraction hierarchy (AH) could be used to model the hierarchical relations between actions [22], [89]. However, the two other types of nesting (sequence and concomitance) also need to be modeled (Section III-A). In addition, such a use of the AH, although it would comply with the principles of the AH (e.g., [22]), would differ from the way it is traditionally used in at least two respects. First, there is no reason why the number of levels in a hierarchy of action modes should be known in advance: in all likelihood, this number will depend on the purpose of the platform being studied (see also [37]). Second, this AH would only capture the relations between action modes. As an alternative, an AH could also be used to model how (individual) actions relate to the underlying physical laws and properties of the system which constrain them. This second perspective, which seems to be the way the AH is primarily used in the literature (e.g., in EID for flight control [29]–[31]), addresses the second step of our framework: the characterization of action modes (Section IV-B2). The identification of action modes can be handled as a recursive process in which higher order modes (e.g., searching for victims in rubble) are decomposed into lower order modes (e.g., locomoting and scanning), which are themselves decomposed into lower modes (e.g., steering toward a goal, avoiding obstacles, passing a doorway, scrutinizing, unoccluding, etc.), and so on.
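A minimal sketch of such a recursive decomposition is given below (our own illustration, using the search-and-rescue example above); the mode names and the flat dictionary representation are assumptions.

```python
# Illustrative sketch: recursive decomposition of higher order action modes
# into lower order ones, following the search-and-rescue example.
action_modes = {
    "search victims in rubble": ["locomote", "scan"],
    "locomote": ["steer toward goal", "avoid obstacles", "pass doorway"],
    "scan": ["scrutinize", "unocclude"],
}

def enumerate_modes(mode: str, depth: int = 0) -> None:
    """Print the nesting hierarchy rooted at a given action mode."""
    print("  " * depth + mode)
    for sub_mode in action_modes.get(mode, []):
        enumerate_modes(sub_mode, depth + 1)

enumerate_modes("search victims in rubble")
```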

2) Characterizing Action Modes: Using the action modes and the nesting structure previously identified (Section IV-B1), the second step then consists in characterizing the associated affordances and control rules. In other words, this step is concerned with identifying the relational properties of the A-C-R-E system which define: 1) whether the actions can be performed or not (and at which cost, etc.); and 2) how the corresponding action variables can be regulated when actions are executed (see Section II).

Being relational properties of the A-C-R-E system, the affordances are not only action-specific but also contingent on the characteristics of the actor (operator and robot) and of the environment within which the action takes place (command console and robot environment). Affordance characterization thus requires taking into account constraints such as those arising from the special populations that will operate the system (operators with particular disabilities, children, elderly people, engineers, medical staff, etc.) or from the particular ecological niche in which the robot is to operate (home, underwater, forest, rubble, etc.). Typically, an affordance will be characterized with one or several dimensionless ratios (see Section II-A). The relevance of using a particular type of ratio (e.g., anthropometrical, biomechanical, and bioenergetical) and of using a given ratio in place of or in conjunction with others depends on context. For example, whether a given path affords locomotion or not depends principally on its width (relative to the robot's) in a flat home environment, whereas in rubble or outdoors, it may further depend on whether the floor is sufficiently strong (relative to the robot's weight), on the slant and material of the surface (relative to the robot's power, grip, and mass distribution), and so on. Importantly, finding the description that best captures a functional fit does not necessarily require reaching completeness or the highest precision but rather fulfilling the precision requirements of the action considered.
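As a minimal illustration of such a dimensionless ratio, the sketch below scores the passability of an aperture from the aperture-to-robot width ratio. The 10% safety margin is a made-up placeholder, not a value reported here; an actual critical ratio would have to be established empirically for the robot and niche at hand.

def affords_passage(aperture_width_m: float, robot_width_m: float,
                    safety_margin: float = 0.10) -> bool:
    # True when the aperture/robot width ratio exceeds a (hypothetical) critical ratio.
    critical_ratio = 1.0 + safety_margin
    return aperture_width_m / robot_width_m > critical_ratio

print(affords_passage(0.80, 0.60))  # ratio ~1.33 -> True
print(affords_passage(0.62, 0.60))  # ratio ~1.03 -> False, within the margin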

A similar analysis also applies to control rules. Like affordances, control rules are defined across properties of the A-C-R-E system, and thus, the degrees of freedom that have to be actively regulated when an action is being performed also depend on the interplay of task, actor, and environmental constraints (see Sections II-C and III-A). For example, steering toward a goal requires adjusting the direction of self-motion (i.e., heading) so as to minimize the difference between heading and the direction of the goal. However, if the goal is the entrance of a tunnel further ahead on a road, steering can also be defined as adjusting heading so as to stay in the lane (or, for certain combinations of vehicle and road, as adjusting heading and speed, etc.).
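A control rule of the first kind can be written down very compactly. The sketch below nulls the difference between heading and the direction of the goal with a simple proportional law; the gain and update scheme are illustrative assumptions, not the behavioral dynamics model of [67].

import math

def steering_rate(heading_rad: float, robot_xy, goal_xy, gain: float = 1.5) -> float:
    # Angular velocity command (rad/s) that reduces the heading-to-goal difference.
    goal_dir = math.atan2(goal_xy[1] - robot_xy[1], goal_xy[0] - robot_xy[0])
    error = math.atan2(math.sin(goal_dir - heading_rad),
                       math.cos(goal_dir - heading_rad))  # wrap to (-pi, pi]
    return gain * error

print(steering_rate(0.0, (0.0, 0.0), (2.0, 1.0)))  # positive: turn left toward the goal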

3) Implementing Action Modes: Once the functional information that the operator needs to control the robot has been identified, the final step consists in adapting the teleoperation platform in such a way that the operator aided by the console–robot interface can successfully perform all the actions identified during step 1 (Section IV-B1). We propose that the means for adapting the interface can be divided into four categories: anthropomorphism, information augmentation, action augmentation, and delegation. The first three approaches aim at granting the operator access to the functional information identified during step 2 (Section IV-B2). The fourth approach is concerned with compensating for not granting him/her this access. The four approaches are not mutually exclusive, and in fact, it is likely that they will be used in combination rather than in isolation. In the next sections, we describe each of these techniques and illustrate them with concrete realizations.

a) Anthropomorphism: This first approach aims at taking advantage of two facts. First, the operator has considerable prior experience with interacting through his/her own body. Second, even when operating the robot, the operator continues to co-perceive his/her own body (and presumably in a far richer way than any display could ever permit). Anthropomorphism proceeds by improving the robot's resemblance to its human operator (cf., also the bionic approach [90]). It rests upon the idea that, by mapping certain properties of the robot to those of the operator, the operator will perceive (or be attuned to) these robot characteristics through the perception of his/her own body (and/or his/her pre-attunement to these).

The properties matched can be of two different types. First, a mapping can be established between properties of the operator and of the robot which constrain what they can do (e.g., arm length for reaching, grasp size for grasping, and body width for passing through). More generally, a relational property of the R-ER subsystem which constrains action can be made to reflect an equivalent property of the A-EA subsystem (where ER and EA denote the environments in which the robot and the agent evolve, respectively). For example, the ratio of the robot's frontal width to the size of remote apertures can be mapped to the ratio of the operator's shoulder width to the size of his/her everyday doors.

Second, a mapping can also be established between properties of the operator and of the robot (or relational properties of R-ER and A-EA) which constrain how opportunities for action and control rules are specified within the structure of ambient stimulation. A typical example is that of eye-height. For example, under the constraint of an agent's eye-height, the optic flow can specify, up to a scaling factor, the width of an aperture in units of the agent's shoulder width (cf., Fig. 2). In mobile telerobotics, it has been shown that raising the robot's camera height toward the operator's own eye-height had a beneficial influence on his/her ability to judge whether the robot can pass through an aperture [91]. Eye-height is also known to influence the way other agent–environment properties map to particular features of optic flow, such as the relation between ground speed and global optic flow rate. With practice, humans become attuned to these relations, and subsequently, they may be fooled (at least for a certain period) when this relation is changed. For example, Owen and Warren [92] proposed that the greater cockpit height of the 747 (compared to the planes on which pilots had been trained) could explain the excessive taxiing speed of the first 747 pilots. Similarly, drivers switching between car and truck may undergo a recalibration phase [93]. This suggests that, in mobile telerobotics, placing the robot's camera at the operator's eye-height (e.g., his/her current, walking, or car-driving eye-height [1]) could prevent operators from misperceiving the robot's speed and allow for shorter adaptation times.
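The speed example can be summarized with a standard relation from the optic flow literature on global optic flow rate (the variable invoked by Owen and Warren [92]); the shorthand below is ours, not an equation reproduced from this paper. For translation parallel to the ground plane at ground speed v with camera (or eye) height h,

GOFR = v / h   (speed in eye-heights per second),

so the flow rate specifies speed only in eye-height units: raising the camera (larger h) at the same ground speed lowers the flow rate, and an operator calibrated to a lower eye-height will tend to underestimate the robot's speed.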

b) Information augmentation: Information augmentation is the most intuitive solution. Like anthropomorphism, it consists in providing the operator with the functional information he/she needs to control the actions identified during step 1 (Section IV-B1). However, this time, the idea is to proceed by adding new feedback to the console. Information augmentation thus addresses the question of how to equip the console's displays such that the resulting structure of ambient stimulation (optics, acoustics, etc.) specifies the functional A-C-R-E relationships that have been identified during step 2 (Section IV-B2).

Before this functional information can be rendered to the operator, it must first be available to the platform. Some A-C-R-E system properties characterizing affordances and control rules are known a priori, in whole or in part, and thus can be directly stored on the teleoperation platform. For example, this is often the case for the robot's size and shape (which, e.g., constrain its ability to pass through apertures). In all other situations, the relevant variables have to be captured by the platform, and thus, the designer has to determine how they might be picked up. Typically, this preliminary phase will involve: 1) identifying the ambient energy invariants that specify the affordances and control rules gathered during step 2; and 2) identifying the technologies with which these invariants could be sampled (i.e., the sensors, possibly supplemented with data processing techniques).

The core phase of information augmentation then consists in rendering this functional information to the operator. This is achieved by laying out the console's displays so as to establish new 1:1 mappings between the control-relevant variables from step 2 and the invariants in ambient energies that stimulate the operator's perceptual systems (see Fig. 5 and Section III-B).

This formulation underlines that the information available to the operator does not merely depend on what is rendered on the console. Rather, it is also contingent on the characteristics of the A-C system (e.g., spatial layout). As a consequence, the A-C relationship must be taken into account (and possibly adapted) when augmenting information.

A possible means for providing the operator with robot-scaled information about the remote environment is to include parts of the robot's body in the depiction of its environment on the console. In the context of mobile robotics, for example, the camera position, orientation, or size of the field of view can be adapted in such a way that the visual field comprises the robot's body [91]. Similarly, when the display includes a CG reconstruction of the environment in 3-D (e.g., built from laser or map data), a scale robot avatar can be incorporated into it [33]. In both cases, if the robot stands sufficiently close to an aperture, the optical structure induced by the display will specify whether the robot can pass through the aperture. However, given the well-known scale ambiguity of video images [94] and virtual environments [95], the relation between the width of an aperture and the width of the robot may not be specified in the optic array for apertures farther away. For example, the robot scale will not propagate to visible distant paths and objects if the area in-between is not filled with enough objects and/or texture, grid, or random tiles on the ground that would materialize the spatial continuity between the robot and the distant elements [45, p. 77], [88]. One possibility is to display a perspective presentation of video feeds from several slanted cameras or the so-called perspective folding [34] (note, however, that the authors did not include the robot in the display). For ground vehicles, another solution would be to present a bird's-eye view of the scene (i.e., with the line of sight orthogonal to the ground).

In the approaches mentioned previously, the affordance was rendered to the operator by presenting on the console's display the constitutive elements of the A-C-R-E relationship which define the affordance (e.g., the robot and the aperture width). An alternative approach for providing body- and action-scaled information is to directly display the A-C-R-E relationship itself and/or the associated action boundaries and optimal point (see Section II-A). For example, as the robot's width is known and the width of paths can be retrieved (e.g., online from laser data), it should be possible to directly mark, on the virtual ground of a laser-based 3-D reconstruction, the paths that afford passage and those that do not. More generally, the virtual ground could be painted with gradients of colors (e.g., from green to red) to reflect the degree to which safe, easy, and efficient passage is afforded, given all that is known a priori and gathered by the sensors: the size of the passage relative to the robot, the strength of the surface of support relative to its weight, the ambient heat relative to what the robot can handle, the path's slope and material relative to the power, mass distribution, and grip of the robot, etc.
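A rough sketch of such a gradient is given below: several robot-scaled ratios are collapsed into a single margin (the tightest constraint dominating), which is then mapped onto a green-to-red color. The constraints retained, the combination rule, and every threshold are hypothetical placeholders rather than values from this paper.

def passage_margin(path_width_m, robot_width_m,
                   surface_strength_kg, robot_weight_kg,
                   slope_deg, max_slope_deg):
    # Each term is positive when the corresponding constraint is satisfied with room to spare.
    margins = [
        path_width_m / robot_width_m - 1.0,           # geometric fit
        surface_strength_kg / robot_weight_kg - 1.0,  # support strength
        1.0 - slope_deg / max_slope_deg,              # climbable slope
    ]
    return min(margins)  # the tightest constraint dominates

def margin_to_rgb(margin, saturation_at=0.5):
    # margin <= 0 -> red, margin >= saturation_at -> green, linear in between.
    t = max(0.0, min(1.0, margin / saturation_at))
    return (int(255 * (1.0 - t)), int(255 * t), 0)

print(margin_to_rgb(passage_margin(0.9, 0.6, 400.0, 250.0, 5.0, 20.0)))  # (0, 255, 0)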

Predictive displays (e.g., [80], [96], and [97]) could also constitute an interesting tool for augmenting information. Indeed, by readily providing the operator with information about the consequences of his/her actions, this technique could help preserve the perception–action loop [98]. However, to qualify as augmentation of information in the ecological sense, the overlaid computer simulation must be designed in such a way that it, and more generally the induced structures within stimulation, provides the operator with the relevant functional information (e.g., the particular informational variables by which he/she can control the action at stake). This brings us back to the three-step procedure that we are describing.

During the last decades, the related research field of interface design for aircraft control has offered several benchmark examples of how displays can directly provide the pilot with informational variables that are relevant for flight control. A first example is the WrightCAD [32]. Unlike traditional primary flight displays, which show state variables (altitude, compass, attitude, airspeed, vertical speed, etc.), the WrightCAD presents the pilot with functional information. For example, instead of materializing the plane's airspeed, it shows horizontal "depression lines" whose rate of flow represents the difference between current speed and stall speed. A second example is the so-called perspective flight path display or tunnel-in-the-sky display, e.g., [99]–[101]. It presents the pilot with the flight trajectory to be followed by overlaying a perspective view of a wire-frame tunnel on the display monitor. A central characteristic of the resulting display is that the tunnel's geometrical shape (and the way it changes) specifies the error in the aircraft's position and orientation relative to the flight path to be followed (together with the error dynamics). As a consequence, it provides the pilot with functional information that he/she can directly use to regulate his/her action on the aircraft's controls. Another example is the energy-augmented version of the tunnel-in-the-sky display [29]. It provides the pilot with information about the aircraft's energy states and energy rates of change (e.g., the deviation of current total energy from the reference profile), which thus map to the precise aspects over which the pilot has control (i.e., total energy through engine thrust and the exchange between kinetic and potential energy through the elevator). Continuing this line of work, new interface elements providing functional information about the spatiotemporal separation from other aircraft [31] and from the ground [30] have also been developed.
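The energy-augmented display builds on elementary point-mass flight mechanics. The relations below are our simplified shorthand for why thrust and elevator map to separate display elements, not the display laws of [29]:

E = (1/2) m V^2 + m g h,        dE/dt = (T - D) V   (thrust assumed along the flight path),

so thrust T (minus drag D) changes the total energy E, whereas the elevator mainly redistributes E between its kinetic and potential terms without changing it. Showing the deviation of E and dE/dt from a reference profile therefore addresses the throttle and the elevator as separate means of control.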

c) Action augmentation: As we underlined in Sections II-B1 and II-D, the availability of information in the structure of ambient energies strongly depends on the exploratory actions in which the agent might engage to create information. It follows that, in some respects, an action can play the same role as a new display. An action can change the mapping between the properties of the A-E system and the structure of ambient stimulation, such that properties that were previously not specified become specified (and vice versa). Thus, an action can reveal agent–environment properties that would otherwise remain hidden to the perceiver.

As for anthropomorphism and information augmentation, the objective of action augmentation is to establish new mappings between functional variables identified during step 2 and patterns in the ambient energies stimulating the operator (see Fig. 5 and Section III-B). However, unlike the two former approaches, it proceeds by granting the operator the opportunity to perform information-generating actions. Action augmentation is thus concerned with "enhancing the operator's resources for action" [102, p. 16].

Of course, action augmentation will fail to provide functional information to the operator if there is no display on the console: action augmentation requires some information augmentation. Nonetheless, the approach is worth using because its original perspective can lead to considering different solutions. A simple example is the depth of objects, which can be rendered either through stereo images or by providing the operator with the opportunity to generate motion parallax (e.g., [103]; note that depth per se is not information in the ecological sense). To count as action augmentation, the camera motion or any other action has to be actively controlled by the operator as opposed to being passively experienced (e.g., driven by a computer program). Besides a difference in terms of stimulation content, the two versions also differ as a function of whether the informative patterns are obtained by the operator or imposed on him/her. This latter aspect is a central feature of action augmentation. This technique gives the operator the opportunity to generate information when he/she needs it and about the properties he/she is interested in. For example, instead of continuously overlaying a specific color on all the objects of a video display which are within reach of the robot, the mouse pointer could simply take a specific color (e.g., green) when passing over a reachable object.
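The pointer example boils down to an on-demand query rather than a permanent overlay. The sketch below is one hypothetical rendering of it; the scene representation, the reach radius, and the color names are all placeholders.

REACH_RADIUS_M = 0.75  # placeholder for the arm's reach envelope

def cursor_color(hovered_object, arm_base_xyz):
    # Recolor the cursor only while it hovers over an object within the robot's reach.
    if hovered_object is None:
        return "default"
    dx, dy, dz = (hovered_object["xyz"][i] - arm_base_xyz[i] for i in range(3))
    within_reach = (dx * dx + dy * dy + dz * dz) ** 0.5 <= REACH_RADIUS_M
    return "green" if within_reach else "default"

print(cursor_color({"xyz": (0.4, 0.2, 0.3)}, (0.0, 0.0, 0.0)))  # green
print(cursor_color({"xyz": (1.5, 0.0, 0.0)}, (0.0, 0.0, 0.0)))  # default

Because the color appears only when the operator moves the pointer, the information is obtained rather than imposed, which is precisely the distinction drawn above.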

By mapping control-relevant A-C-R-E properties onto invariants of ambient stimulation, information augmentation provides the operator with an external model of task dynamics and thus obviates the need for the operator to build an internal model of these dynamics [26]. In a similar fashion, by granting the operator the opportunity to reveal hidden A-C-R-E constraints on action, action augmentation allows the operator to create his/her own external model of task dynamics [102].9

d) Delegation: The last approach consists in delegating to the robot the action modes for which the information available to the operator is insufficient. Instead of trying to augment the operator in one way or another, the robot is left with the responsibility of achieving the actions whose control requires information that the operator is missing. As a result, the operator remains in charge of only those actions for which he/she has access to the necessary functional information.

Even if this technique does not involve rendering information to the operator, it nevertheless requires that this information be available to the robot. Accordingly, delegation requires a preliminary phase of finding the means by which to gather the control-relevant A-C-R-E properties identified during step 2 (see Section IV-B3b).

The delegation of control is a technique widely used in telerobotics (e.g., obstacle avoidance and safeguarding) and in the related field of manned vehicle control (e.g., in driving: ABS, power steering, cruise control, turn suggestion, etc.). An important concern of delegation is that, unless the control of actions is completely entrusted to the robot, it necessarily implies that control be shared between the operator and the robot.

9In his paper, Kirlik illustrates this point very neatly with the example of a cook preparing steaks on a grill in a restaurant. Even if the domain is not directly related to robotics, the example is well worth reading.

Fig. 6. Synthesis of the proposed framework for compensating for the alteration of available perceptual information in teleoperation.

This cohabitation of human and autonomous control raises important issues, e.g., [104] and [105]. From an ecological point of view, the presence of closed control loops (e.g., obstacle avoidance) changes the mapping between the operator's actions and their remote consequences (cf., Section III-A) and thus can impair the operator's ability to discover the invariants of stimulation which map to the control-relevant A-C-R-E properties (cf., Section III-B). For example, when an obstacle avoidance system is activated, the same command issued by the operator can result in completely different robot behaviors depending on the spatial surroundings of the robot.
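The point can be illustrated with a toy potential-field-style blend (our own sketch, not any particular safeguarding system): once the robot is near an obstacle, the executed velocity diverges from the operator's command, so the command-to-consequence mapping that the operator experiences is no longer fixed.

def executed_velocity(operator_cmd, obstacle_distance_m,
                      influence_m=1.0, gain=0.5):
    # Hypothetical shared-control blend: a repulsive term is added whenever an
    # obstacle (assumed straight ahead) is within the influence distance.
    vx, vy = operator_cmd
    if obstacle_distance_m < influence_m:
        vx -= gain * (influence_m - obstacle_distance_m)
    return (vx, vy)

print(executed_velocity((0.3, 0.0), 2.0))  # far from obstacles: the command passes through
print(executed_velocity((0.3, 0.0), 0.4))  # near an obstacle: forward speed is cut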

The nesting structure of action modes identified during step 1 (Section IV-B1) could provide a useful framework for sharing out the work between the operator and the robot on a functional basis. In particular, the hierarchical relations between modes could be used to delegate the control of subsidiary actions to the robot while the operator remains in charge of controlling actions that exist at a higher level of abstraction. For example, to (make the robot) reach an object, the operator can be responsible for controlling the position of the robotic arm's end point (e.g., the robot's grasp) relative to the targeted environmental object, while the subsequent control of the individual joints of the robotic arm and of the mobile base is delegated to the robot [106]. In complement, the distinction between the prospective versus online control of action (which require knowledge of affordances and control rules, respectively) could also provide a useful delimitation for splitting responsibilities between the operator and the robot. For example, experimental participants seem to be more precise in judging whether a robot can pass through an opening than in actually being able to drive it through [70].
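One generic way of realizing this split is resolved-rate control: the operator issues a task-space velocity for the end point, and the robot resolves it into joint velocities. The sketch below uses a damped Jacobian pseudoinverse for that purpose; it illustrates the general idea under our own assumptions and is not the specific strategy of [106], and all numbers are made up.

import numpy as np

def delegated_joint_velocities(jacobian: np.ndarray,
                               operator_task_velocity: np.ndarray,
                               damping: float = 0.01) -> np.ndarray:
    # Joint-level coordination delegated to the robot: damped least-squares inverse kinematics.
    J, v = jacobian, operator_task_velocity
    JJt = J @ J.T + (damping ** 2) * np.eye(J.shape[0])
    return J.T @ np.linalg.solve(JJt, v)

# Hypothetical 2-DOF planar arm: the operator asks for 5 cm/s of end-point motion along x.
J = np.array([[-0.3, -0.1],
              [ 0.5,  0.2]])
print(delegated_joint_velocities(J, np.array([0.05, 0.0])))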

C. Concluding Remarks

The proposed framework for coping with the alteration of available information is summarized in Fig. 6. One important characteristic of this framework (which is also common to EID and WDA) is that it requires knowledge of the domain being considered (steps 1 and 2). As a corollary, it will not be suited to designing interfaces for controlling all-purpose robots—in the sense of platforms with no defined purpose—because this would require identifying and characterizing every single possible action. As we already noted, the four techniques proposed for implementation (step 3) should not be considered as independent solutions but rather as complementary facets of a solution. These four approaches have different scopes and come with different advantages and disadvantages. For example, action augmentation could prove particularly useful for revealing the dynamical properties of the A-C-R-E system which constrain actions but are not specified in the absence of movement (e.g., maximum turning rate or maximum deceleration in mobile robotics). For their part, anthropomorphism and delegation could facilitate and accelerate the operator's mastery of the actions he/she has to perform with the teleoperation platform. However, the scope of action modes that can be handled with these three methods (e.g., the A-C-R-E properties that can be made available through action augmentation and anthropomorphism) is much narrower than that for information augmentation. On the other hand, information augmentation proceeds by adding new feedback to the console and thus comes at the risk of overloading the display and of drowning out the information that is relevant at the moment, whereas the three techniques of action augmentation, anthropomorphism, and delegation offer alternatives to imposing more information on the console's display.

From a methodological point of view, the work to be undertaken in each step (identification, characterization, and implementation) requires both analytical work and empirical investigation. For example, analysis can be used to determine what needs to be displayed to make the operator's ambient stimulation specify control-relevant A-C-R-E properties (what Flach called inverse ecological optics in the case of a visual display [20]). However, the objective is not merely to make the structure of ambient stimulation specific but also to ensure that these structures can be picked up by the operator and successfully exploited to achieve the task at hand. Thus, experimental tests remain of great importance. Although this may appear to be a colossal undertaking, it can capitalize on a large existing literature on human perception–action (but also on animals and autonomous agents) in which themes such as locomotion (walking, driving, flying, etc.) and manipulation (reaching, grasping, intercepting moving objects, etc.) have been extensively studied.

V. CONCLUSION

In this paper, we have attempted to illustrate how the ecological approach to perception and action could provide a useful basis for addressing the issues raised by the perceptual control of remote actions. It is worth noting that, in return, teleoperated robotics provides a unique opportunity for psychologists to put their models and hypotheses to the test. In particular, by lowering the number of degrees of freedom involved in the agent–environment interaction, instruments such as teleoperation platforms should allow for better tractability of the processes at work [44]. Our approach differs from the EID framework in that it focuses on the control of action, not process. Throughout this paper, we have pointed out some implications of this difference in scope for domain analysis (particularly that constraints on action spread over the entire A-C-R-E system). Our analysis of the means by which the interface can be made to support action modes also provides a complementary perspective on the problem of the so-called semantic mapping [26]. Besides the four techniques that we described, the proposal that, ultimately, it is the structure of the operator's stimulation—and not the structure of the display per se—that should map to the domain constraints could provide a common ground for addressing the use of 2-D visual displays, of other displays (e.g., stereo, head-driven, see-through, auditory, haptic, and multimodal), and thus of alternative interaction paradigms (enaction and mixed reality).

After clarifying important ecological concepts and analyzing the impact of a teleoperation platform on perception and action from an ecological perspective, we have proposed a framework for interface design which aims at providing the operator with the relevant functional information (or at compensating for not providing him/her with that information). Our goal was not to provide a fully operational toolbox but rather to give some guidelines and general principles for an ecological approach to designing HRI for teleoperated robotics. We hope that this contribution will stimulate debate and applications and that it will be extended and improved. In a forthcoming paper, we will present a concrete example of how we used these guidelines to develop an interface for navigating a robot in a home environment.

REFERENCES

[1] D. D. Woods, J. Tittle, M. Feil, and A. Roesler, “Envisioning human–robot coordination in future operations,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 34, no. 2, pp. 210–218, May 2004.

[2] R. R. Murphy, “Human–robot interaction in rescue robotics,” IEEETrans. Syst., Man, Cybern. C, Appl. Rev., vol. 34, no. 2, pp. 138–153,May 2004.

[3] J. Y. C. Chen, E. C. Haas, and M. J. Barnes, “Human performance issuesand user interface design for teleoperated robots,” IEEE Trans. Syst.,Man, Cybern. C, Appl. Rev., vol. 37, no. 6, pp. 1231–1245, Nov. 2007.

[4] J. S. Tittle, A. Roesler, and D. D. Woods, “The remote perception problem,” in Proc. Hum. Factors Ergonom. Soc. 46th Annu. Meeting, 2002, pp. 260–264.

[5] J. J. Gibson, The Ecological Approach to Visual Perception. Boston,MA: Houghton-Mifflin, 1979.

[6] C. F. Michaels and C. Carello, Direct Perception. Englewood Cliffs,NJ: Prentice-Hall, 1981.

[7] E. S. Reed, Encountering the World: Toward an Ecological Psychology.New York: Oxford Univ. Press, 1996.

[8] J. J. Gibson, “Visually controlled locomotion and visual orientation inanimals,” Brit. J. Psychol., vol. 49, no. 3, pp. 182–194, Aug. 1958.

[9] P. N. Kugler and M. T. Turvey, Information, Natural Law, and the Self-Assembly of Rhythmic Movement. Mahwah, NJ: LEA, 1987.

[10] R. Bajcsy, “Active perception,” Proc. IEEE, vol. 76, no. 8, pp. 966–1005,Aug. 1988.

[11] J. Y. Aloimonos and A. Badyopadhyay, “Active vision,” in Proc. FirstIEEE Int. Conf. Comput. Vis., 1987, pp. 35–54.

[12] A. P. Duchon, L. P. Kaelbling, and W. H. Warren, “Ecological robotics,”Adapt. Behav., vol. 6, no. 3–4, pp. 473–507, Jan. 1998.

[13] R. R. Murphy, “Case studies of applying Gibson’s ecological approachto mobile robots,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans,vol. 29, no. 1, pp. 105–111, Jan. 1999.

[14] A. C. Slocum, D. C. Downey, and R. D. Beer, “Further experiments in the evolution of minimally cognitive behavior: From perceiving affordances to selective attention,” in Proc. 6th Int. Conf. Simul. Adapt. Behav.—From Animals to Animats, 2000, pp. 430–439.

[15] J. J. Gibson, “A theoretical field-analysis of automobile-driving,” Amer.J. Psychol., vol. 51, no. 3, pp. 453–471, Jul. 1938.

[16] J. J. Gibson, P. Olum, and F. Rosenblatt, “Parallax and perspective during aircraft landings,” Amer. J. Psychol., vol. 68, no. 3, pp. 372–385, Sep. 1955.

[17] J. M. Flach, P. A. Hancock, J. Caird, and K. J. Vicente, Eds., GlobalPerspectives on the Ecology of Human–Machine Systems. Hillsdale,NJ: LEA, 1995.

[18] P. A. Hancock, J. M. Flach, J. Caird, and K. J. Vicente, Eds., LocalApplications of the Ecological Approach to Human–Machine Systems.Hillsdale, NJ: LEA, 1995.

[19] G. Smets, “Industrial design engineering and the theory of direct perception and action,” Ecol. Psychol., vol. 7, no. 4, pp. 329–374, 1995.

[20] J. M. Flach, “The ecology of human–machine systems I: Introduction,”Ecol. Psychol., vol. 2, no. 3, pp. 191–205, 1990.

[21] J. M. Flach and P. A. Hancock, “An ecological approach to human–machine systems,” in Proc. Hum. Factor Ergonom. Soc. 36th Annu.Meeting, Chicago, IL, 1992, pp. 1056–1058.

[22] K. J. Vicente and J. Rasmussen, “The ecology of human–machine systems II: Mediating “direct perception” in complex work domains,” Ecol. Psychol., vol. 2, no. 3, pp. 207–250, 1990.

[23] D. A. Norman, The Psychology of Everyday Things. New York: BasicBooks, 1988.

[24] D. A. Norman, “Affordance, conventions, and design,” Interactions, vol. 6, no. 3, pp. 38–43, 1999.

[25] J. McGrenere and W. Ho, “Affordances: Clarifying and evolving aconcept,” in Proc. Graph. Interface, Montreal, QC, Canada, 2000,pp. 179–186.

[26] J. Rasmussen and K. J. Vicente, “Coping with human errors throughsystem design: Implications for ecological interface design,” Int. J. ManMach. Stud., vol. 31, no. 5, pp. 517–534, Nov. 1989.

[27] J. Rasmussen, “Ecological interface design for complex systems: Anexample: SEAD-UAV systems,” HURECON, Smorum, Denmark, 1998,Tech. Rep.

[28] J. A. Adams, C. M. Humphrey, M. A. Goodrich, J. L. Cooper,B. S. Morse, C. Engh, and N. Rasmussen, “Cognitive task analysis fordeveloping unmanned aerial vehicle wilderness search support,” J. Cogn.Eng. Decis. Making, vol. 3, no. 1, pp. 1–26, 2009.

[29] M. H. J. Amelink, M. Mulder, M. M. van Paassen, and J. M. Flach,“Theoretical foundations for a total energy-based perspective flight-pathdisplay,” Int. J. Aviat. Psychol., vol. 15, no. 3, pp. 205–231, 2005.

[30] C. Borst, “Ecological approach to pilot terrain awareness,” Ph.D. dissertation, Fac. Aerosp. Eng., Delft Univ. Technol., Delft, The Netherlands, 2009.

[31] S. B. van Dam, M. Mulder, and M. M. van Paassen, “Ecological interfacedesign of a tactical airborne separation assistance tool,” IEEE Trans.Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 6, pp. 1221–1233,Nov. 2008.

[32] J. M. Flach, “Ready, fire, aim: A “meaning-processing” approach todisplay design,” in Attention Perform., vol. XVII, D. Gopher andA. Koriat, Eds. Cambridge, MA: MIT Press, 1999, ch. 7, pp. 197–222.

[33] C. W. Nielsen, M. A. Goodrich, and R. W. Ricks, “Ecological interfacesfor improving mobile robot teleoperation,” IEEE Trans. Robot., vol. 23,no. 5, pp. 927–941, Oct. 2007.

[34] M. Voshell, D. D. Woods, and F. Phillips, “Overcoming the keyhole in human–robot interaction: Simulation and evaluation,” in Proc. Hum. Factors Ergonom. Soc. 49th Annu. Meeting, Orlando, FL, 2005, pp. 442–446.

[35] J. J. Gibson, The Senses Considered as Perceptual Systems. Boston,MA: Houghton-Mifflin, 1966.

[36] J. J. Gibson, “The theory of affordances,” in Perceiving, Acting,and Knowing: Toward an Ecological Psychology, R. E. Shaw andJ. Bransford, Eds. Hillsdale, NJ: LEA, 1977, pp. 67–82.

[37] W. H. Warren, “Action modes and laws of control for the visual guidance of action,” in Complex Movement Behaviour: “The” Motor-Action Controversy, O. G. Meijer and K. Roth, Eds. Amsterdam, The Netherlands: Elsevier, 1988, ch. 14, pp. 339–380.

[38] T. A. Stoffregen, “Affordances as properties of the animal–environmentsystem,” Ecol. Psychol., vol. 15, no. 2, pp. 115–134, 2003.

[39] M. T. Turvey, “Affordances and prospective control: An outline of theontology,” Ecol. Psychol., vol. 4, no. 3, pp. 173–187, 1992.

[40] W. H. Warren, “Perceiving affordances: Visual guidance of stair climbing,” J. Exp. Psychol. Human, vol. 10, no. 5, pp. 683–703, 1984.

[41] L. S. Mark, K. Nemeth, D. Gardner, M. J. Dainoff, J. Paasche, M. Duffy,and K. Grandt, “Postural dynamics and the preferred critical boundaryfor visually guided reaching,” J. Exp. Psychol. Human, vol. 23, no. 5,pp. 1365–1379, Oct. 1997.

[42] D. L. Gardner, L. S. Mark, J. A. Ward, and H. Edkins, “How do task characteristics affect the transitions between seated and standing reaches?” Ecol. Psychol., vol. 13, no. 4, pp. 245–274, 2001.

[43] W. H. Warren and S. Whang, “Visual guidance of walking throughapertures: Body-scaled information for affordances,” J. Exp. Psychol.Human, vol. 13, no. 3, pp. 371–383, Aug. 1987.

[44] R. E. Shaw, O. M. Flascher, and E. E. Kadar, “Dimensionless invariants for intentional systems: Measuring the fit of vehicular activities to environmental layout,” in Global Perspectives on the Ecology of Human–Machine Systems, J. M. Flach, P. A. Hancock, J. Caird, and K. J. Vicente, Eds. Hillsdale, NJ: LEA, 1995, pp. 293–358.

[45] J. J. Gibson, The Perception of the Visual World. Boston, MA:Houghton-Mifflin, 1950.

[46] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge, MA: MIT Press, 1997.

[47] D. N. Lee, “The optic flow field: The foundation of vision,” Philos.Trans. Roy. Soc. Lond. B, Biol. Sci., vol. 290, no. 1038, pp. 169–179,Jul. 1980.

[48] J. J. Koenderink, “Optic flow,” Vis. Res., vol. 26, no. 1, pp. 161–179,1986.

[49] H. von Helmholtz, Optique Physiologique, vol. 1/2. Paris, France:Masson, 1867.

[50] E. J. Gibson, J. J. Gibson, O. W. Smith, and H. Flock, “Motion parallaxas a determinant of perceived depth,” J. Exp. Psychol., vol. 58, no. 1,pp. 40–51, Jul. 1959.

[51] D. N. Lee, “Visual information during locomotion,” in Perception: Essays in Honour of J. J. Gibson, R. B. MacLeod, J. Herbert, and L. Pick, Eds. Ithaca, NY: Cornell Univ. Press, 1974, ch. 14, pp. 250–267.

[52] D. N. Lee, “A theory of visual control of braking based on information about time-to-collision,” Perception, vol. 5, no. 4, pp. 437–459, 1976.

[53] B. K. Shaw, R. S. McGowan, and M. T. Turvey, “An acoustic variablespecifying time-to-contact,” Ecol. Psychol., vol. 3, no. 3, pp. 253–261,1991.

[54] D. H. Ashmead, D. L. Davis, and A. Northington, “Contribution oflisteners’ approaching motion to auditory distance perception,” J. Exp.Psychol. Human, vol. 21, no. 2, pp. 239–256, Apr. 1995.

[55] H. Y. Solomon and M. T. Turvey, “Haptically perceiving the distancesreachable with hand-held objects,” J. Exp. Psychol. Human, vol. 14,no. 3, pp. 404–427, Aug. 1988.

[56] G. Johansson, “Visual perception of biological motion and a model for itsanalysis,” Attention, Percept. Psychophys., vol. 14, no. 2, pp. 201–211,Jun. 1973.

[57] S. Runeson and G. Frykholm, “Kinematic specification of dynamics asan informational basis for person-and-action perception: Expectation,gender recognition, and deceptive intention,” J. Exp. Psychol. Gen.,vol. 112, no. 4, pp. 585–615, 1983.

[58] H. A. Sedgwick, “Space perception,” in Handbook of Perception andHuman Performance, Volume I: Sensory Processes and Perception,J. P. Thomas, Ed. New York: Wiley, 1986, ch. 21, pp. 21.1–21.57.

[59] R. Marcilly and M. Luyat, “The role of eye height in judgment of anaffordance of passage under a barrier,” Current Psychol. Lett., vol. 24,no. 1, pp. 12–24, 2008.

[60] L. S. Mark, “Eyeheight-scaled information about affordances: A studyof sitting and stair climbing,” J. Exp. Psychol. Human, vol. 13, no. 3,pp. 361–370, 1987.

[61] T. A. Stoffregen and B. G. Bardy, “On specification and the senses,”Behav. Brain Sci., vol. 24, no. 2, pp. 195–261, Apr. 2001.

[62] B. Mantel, B. G. Bardy, and T. A. Stoffregen, “Multimodal perceptionof reachability expressed through locomotion,” Ecol. Psychol., vol. 22,no. 3, pp. 192–211, 2010.

[63] J. A. S. Kelso, Dynamic Patterns: The Self-Organization of Brain andBehavior. Cambridge, MA: MIT Press, 1995.

[64] W. H. Warren, “The dynamics of perception and action,” Psychol. Rev.,vol. 113, no. 2, pp. 358–389, Apr. 2006.

[65] K. Ghose, T. K. Horiuchi, P. S. Krishnaprasad, and C. F. Moss, “Echolocating bats use a nearly time-optimal strategy to intercept prey,” PLoS Biol., vol. 4, no. 5, p. e108, 2006.

[66] A. P. Duchon and W. H. Warren, “Robot navigation from a Gibsonianviewpoint,” in Proc. IEEE Int. Conf. Syst., Man, Cybern., San Antonio,TX, 1994, pp. 2272–2277.

[67] B. R. Fajen and W. H. Warren, “Behavioral dynamics of steering, obstacle avoidance, and route selection,” J. Exp. Psychol. Human, vol. 29, no. 2, pp. 343–362, Apr. 2003.

[68] G. Schöner, M. Dose, and C. Engels, “Dynamics of behavior: Theory andapplications for autonomous robot architectures,” Robot. Auton. Syst.,vol. 16, no. 2–4, pp. 213–245, Dec. 1995.

[69] W. W. Gaver, “Technology affordances,” in Proc. CHI, New Orleans,LA, 1991, pp. 79–84.

[70] K. S. Jones, E. A. Schmidlin, and L. B. R. Johnson, “Teleoperators’ judgments of their ability to drive through apertures,” in Proc. 4th ACM/IEEE Int. Conf. HRI, La Jolla, CA, 2009, pp. 249–250.

[71] C. Lenay, “Enaction, externalisme et suppléance perceptive,” Intellectica, vol. 43, pp. 27–52, 2006.

[72] B. G. Bardy and B. Mantel, “Ask not what’s inside your head, but whatyour head is inside of (Mace, 1977), a commentary on Lenay (2006) andJacob (2006),” Intellectica, vol. 43, pp. 53–58, 2006.

[73] T. A. Stoffregen, B. G. Bardy, and B. Mantel, “Affordances in thedesign of enactive systems,” Virtual Reality, vol. 10, no. 1, pp. 4–10,2006.

[74] D. A. Norman and J. Nielsen, “Gestural interfaces: A step backward inusability,” Interactions, vol. 17, no. 5, pp. 46–49, Sep./Oct. 2010.

[75] M. Baker, R. Casey, B. Keyes, and H. A. Yanco, “Improved interfacesfor human–robot interaction in urban search and rescue,” in Proc. IEEEInt. Conf. Syst., Man, Cybern., 2004, vol. 3, pp. 2960–2965.

[76] G. Terrien, T. Fong, C. Thorpe, and C. Baur, “Remote driving with amultisensor user interface,” in Proc. 30th SAE Int. Conf. Environ. Syst.,Toulouse, France, Jul. 2000.

[77] F. Ferland, F. Pomerleau, C. T. L. Dinh, and F. Michaud, “Egocentric andexocentric teleoperation interface using real-time, 3-D video projection,”in Proc. 4th ACM/IEEE Int. Conf. HRI, La Jolla, CA, 2009, pp. 37–44.

[78] H. K. Keskinpala and J. A. Adams, “Objective data analysis for a PDA-based human robotic interface,” in Proc. IEEE Int. Conf. Syst., Man,Cybern., 2004, pp. 2809–2814.

[79] B. Keyes, M. Micire, J. L. Drury, and H. A. Yanco, “Improving human–robot interaction through interface evolution,” in Human–Robot Interaction, D. Chugo, Ed. Rijeka, Croatia: INTECH, 2010, pp. 183–202.

[80] R. W. Ricks, C. W. Nielsen, and M. A. Goodrich, “Ecological displaysfor robot interaction: A new perspective,” in Proc. IEEE/RSJ Int. Conf.IROS, Sendai, Japan, 2004, vol. 3, pp. 2855–2860.

[81] J. M. Flach, K. J. Vicente, F. Tanabe, K. Monta, and J. Rasmussen, “Anecological approach to interface design,” in Proc. Hum. Factor Ergonom.Soc. 42nd Annu. Meeting, Chicago, IL, 1998, pp. 295–299.

[82] E. Pasquinelli, “Presence (theories of),” in Enaction and Enactive Interfaces: A Handbook of Terms, A. Luciani and C. Cadoz, Eds. Grenoble, France: Enactive System Books, 2007.

[83] T. B. Sheridan, “Teleoperation, telerobotics and telepresence: A progressreport,” Control Eng. Pract., vol. 3, no. 2, pp. 205–214, 1995.

[84] J. M. Loomis, “Distal attribution and presence,” Presence-Teleop. VirtualEnviron., vol. 1, no. 1, pp. 113–119, Winter 1992.

[85] R. Casati and E. Pasquinelli, “Is the subjective feel of “presence” anuninteresting goal?” J. Vis. Lang. Comput., vol. 16, no. 5, pp. 428–441,2005.

[86] T. A. Stoffregen, B. G. Bardy, L. J. Smart, and R. Pagulayan, “On thenature and evaluation of fidelity in virtual environments,” in Virtual andAdaptive Environments, L. J. Hettinger and M. Haas, Eds. Mahwah,NJ: LEA, 2003, pp. 111–128.

[87] W. B. Thompson, P. Willemsen, A. A. Gooch, S. H. Creem-Regehr, J. M. Loomis, and A. C. Beall, “Does the quality of the computer graphics matter when judging distances in visually immersive environments?” Presence-Teleop. Virt., vol. 13, no. 5, pp. 560–571, Oct. 2004.

[88] P. J. Stappers, W. W. Gaver, and C. J. Overbeeke, “Beyond the limits of real-time realism: Moving from stimulation correspondence to information correspondence,” in Virtual and Adaptive Environments, L. J. Hettinger and M. Haas, Eds. Mahwah, NJ: LEA, 2003, pp. 91–110.

[89] T. A. Stoffregen and T. Nonaka, “Nested affordances,” presented at the14th Int. Conf. Perception Action, Yokohama, Japan, Jul. 2007.

[90] Y. Rybarczyk, D. Mestre, P. Hoppenot, and E. Colle, “Implementation télérobotique de l’anticipation sensorimotrice pour optimiser la coopération homme-machine,” Travail Humain, vol. 67, no. 3, pp. 209–233, 2004.

[91] K. S. Moore, J. A. Gomer, C. C. Pagano, and D. W. Moore, “Perceptionof robot passability with direct line of sight and teleoperation,” Hum.Factors, vol. 51, no. 4, pp. 557–570, 2009.

[92] D. H. Owen and R. Warren, “Perceptually relevant metrics for the marginof safety: A consideration of global optical flow and density variables,”Ohio State Univ. Res. Foundation, Columbus, OH, 1982, Tech. Rep.

[93] W. Schiff and W. Arnone, “Perceiving and driving: Where parallelroads meet,” in Local Applications of the Ecological Approach toHuman–Machine Systems, vol. 2, P. A. Hancock, J. M. Flach, J. Caird,and K. J. Vicente, Eds. Hillsdale, NJ: LEA, 1995, ch. 1, pp. 1–35.

[94] P. J. Stappers, “Scaling the visual consequences of active head movements: A study of active perceivers and spatial technology,” Ph.D. dissertation, Delft Univ., Delft, The Netherlands, 1992.

[95] J. M. Loomis and J. M. Knapp, “Visual perception of egocentric distancein real and virtual environments,” in Virtual and Adaptive Environments,L. J. Hettinger and M. Haas, Eds. Mahwah, NJ: LEA, 2003, pp. 21–46.

[96] M. V. Noyes, “Superposition of graphics on low bit rate video as an aidin teleoperation,” M.S. thesis, MIT, Cambridge, MA, Jun. 1984.

[97] T. B. Sheridan, “Space teleoperation through time delay: Review andprognosis,” IEEE J. Robot. Autom., vol. 9, no. 5, pp. 592–606, Oct. 1993.

[98] H. G. Stassen and G. J. F. Smets, “Telemanipulation and telepresence,”Control Eng. Pract., vol. 5, no. 3, pp. 363–374, Mar. 1997.

[99] A. J. Grunwald, J. B. Robertson, and J. J. Hatfield, “Evaluation of a computer-generated perspective tunnel display for flight-path following,” NASA, Washington, D.C., Tech. Rep. 1736, 1980.

[100] M. Mulder, “Cybernetics of a tunnel-in-the-sky display,” Ph.D. dissertation, Fac. Aerosp. Eng., Delft Univ. Technol., Delft, The Netherlands, 1999.

[101] E. Theunissen, “Integration of human factors aspects in the design of spatial navigation displays,” in Virtual and Adaptive Environments, L. J. Hettinger and M. Haas, Eds. Mahwah, NJ: LEA, 2003, pp. 369–389.

[102] A. Kirlik, “The ecological expert: Acting to create information to guideaction,” in Proc. 4th Symp. Human Interact. Complex Syst., 1998,pp. 15–27.

[103] G. J. F. Smets, C. J. Overbeeke, and M. H. Stratmann, “Depth on aflat screen,” Perception Motor Skill, vol. 64, no. 3c, pp. 1023–1034,Jun. 1987.

[104] J.-M. Hoc, F. Mars, I. Milleville-Pennel, E. Jolly, M. Netto, andJ.-M. Blosseville, “Evaluation of human–machine cooperation modesin car driving for safe lateral control in bends: Function delegation andmutual control modes,” Travail Humain, vol. 69, pp. 153–182, 2006.

[105] D. J. Bruemmer, D. A. Few, R. L. Boring, J. L. Marble, M. C. Walton, andC. W. Nielsen, “Shared understanding for collaborative control,” IEEETrans. Syst., Man, Cybern. A, vol. 35, no. 4, pp. 494–504, Jul. 2005.

[106] K. Nait-Chabane, S. Delarue, P. Hoppenot, and E. Colle, “Strategy ofapproach for seizure of an assistive mobile manipulator,” Robot. Auton.Syst., vol. 57, no. 2, pp. 222–235, Feb. 2009.

[107] E. H. Yilmaz and W. H. Warren, “Visual control of braking: A test of the τ̇ hypothesis,” J. Exp. Psychol. Hum. Percept. Perform., vol. 21, no. 5, pp. 996–1014, Oct. 1995.

Bruno Mantel received the Dipl.Ing. degree in computer science (specialisation cognitive science) from the Graduate School of Computer Science and Advanced Technologies (EPITA), Kremlin-Bicêtre, France, in 2002, the D.E.A. degree in human motor skills from Université Paris-Sud, Orsay, France, in 2004, and the Ph.D. degree in human movement sciences from Montpellier 1 University, Montpellier, France, in 2009.

He is currently a Postdoctoral Fellow in robotics with the Computer Science, Integrative Biology and Complex Systems (IBISC) Laboratory, University of Evry-Val d'Essonne, Evry, France. His research focuses on the intrinsically multimodal, active, and functional nature of perception and on its implications for the design and evaluation of man–machine interfaces.

Dr. Mantel is a member of the International Society for Ecological Psychology, the French Association for Cognitive Research (ARCo), and the Association for Computing Machinery (SIGCHI).

Philippe Hoppenot received the Dipl.Ing. degree from the National Institute of Telecommunication (INT), Evry, France, in 1991, the M.S. degree in applied physics from Paris XII University, Créteil, France, in 1992, and the Ph.D. degree in robotics from the University of Evry, Evry, in 1997.

He is currently a Professor with the University of Evry-Val d'Essonne, Evry. His research interests include man–machine cooperation, rehabilitation engineering, robotics, and remote control. He has worked on assistance for disabled people by giving them the possibility of remotely controlling a mobile arm to perform daily life activities. He is currently involved in European and French projects on helping elderly people, especially those with cognitive impairments, to stay at home.

Dr. Hoppenot is a member of the Federative Institute for Research on Technical Aids to Disabled People (IFRATH) and a member of the editorial committee of the Handicap conference, organized by IFRATH every two years.

Etienne Colle received the P.G.Dip. and Ph.D. degrees in assistive robotics from Paris XII University, Créteil, France, in 1979 and 1983, respectively.

He is a First-class Professor with the University of Evry-Val d'Essonne, Evry, France. He is the former director (until 2009) of the Complex Systems Laboratory (LSC) and IBISC Laboratory. He is currently the head of the HANDicap et Santé (HANDS) group. His field of expertise includes robotics and telerobotics for disabled and elderly people, and human–robot interaction (or cooperation). His current work focuses on coordinated control of a mobile arm, man–machine cooperation, control modes, user-centered design, and, more recently, ambient robotics.

Dr. Colle is a co-founder (and co-leader) of the Federative Institute for Research on Technical Aids to Disabled People (IFRATH), an institution created in 1996 which aims at federating rehabilitation research in France.