environmental adaptation of robot morphology and control … · 2020-03-31 · 1 environmental...


Environmental Adaptation of Robot Morphology and Control through Real-world Evolution

Tønnes F. Nygaard, Charles P. Martin, David Howard, Jim Torresen, Kyrre Glette

Abstract: Robots operating in the real world will experience a range of different environments and tasks. It is essential for the robot to have the ability to adapt to its surroundings to work efficiently in changing conditions. Evolutionary robotics aims to solve this by optimizing both the control and body (morphology) of a robot, allowing adaptation to internal, as well as external factors. Most work in this field has been done in physics simulators, which are relatively simple and not able to replicate the richness of interactions found in the real world. Solutions that rely on the complex interplay between control, body, and environment are therefore rarely found. In this paper, we rely solely on real-world evaluations and apply evolutionary search to yield combinations of morphology and control for our mechanically self-reconfiguring quadruped robot. We evolve solutions on two very different physical surfaces and analyze the results in terms of both control and morphology. We then transition to two previously unseen surfaces to demonstrate the generality of our method. We find that the evolutionary search adapts both control and body to the different physical environments, yielding significantly different morphology-controller configurations. Moreover, we observe that the solutions found by our method work well on previously unseen terrains.

I. INTRODUCTION

Evolutionary theory describes how animals exhibit behavioral, structural, and physiological adaptations to environmental changes across multiple generations, which increases the likelihood of survival and the preservation of their genes. In nature, this process depends on a large number of generations of animals breeding and raising their young, so adaptation takes many years. Natural organisms can adapt through learning reasonably quickly, but their morphological adaptation is a long process. In robotics, we can also learn quickly through, e.g., controller adaptation, but as opposed to nature, we can also perform morphological adaptation in real-time.

In the context of robotics, adaptability to dynamic environmental conditions and mission parameters is a key enabling technology that allows robots to perform increasingly complex tasks in challenging environments. Practically, improving the adaptability of a robot unlocks an ever-expanding repertoire of deployment scenarios such as disaster response, autonomous surveying, and others.

Legged robots are noted for their agility and ability to traverse a multitude of terrain types and, as such, hold a particular promise for completing such missions [1]. They also provide an intrinsic and straightforward route towards adaptability, in that their controllers, typically the rhythms produced by a gait engine or the arcs that describe the movement of the robot’s foot-tip positions, can easily be optimized through, e.g., reinforcement learning [2] and evolutionary techniques [3].

This work was partially supported by The Research Council of Norway under grant agreement 240862 and its Centres of Excellence scheme, project number 262762.

T. F. Nygaard is with the Department of Informatics, University of Oslo, Norway, e-mail: [email protected].

K. Glette and J. Torresen are with RITMO and the Department of Informatics, University of Oslo, Norway.

Fig. 1. The Dynamic Robot for Embodied Testing (DyRET) standing on the four different surfaces used in our experiments. The robot can change the length of its legs to adapt its body to the environment it operates in.

As a simple motivating case, consider a legged robot deployed in a forest or garden. The robot can reasonably be expected to have to scramble over some obstacles as well as squeeze between others. It follows that high in-mission performance requires the robot to be tall (at times) to step over obstacles, and small (at other times) to squeeze into tight spaces. In cases like these, behavioral adaptation through pure controller optimization, as described above, provides only limited adaptability, in that the morphology of the robot is static¹. Recent work has shown examples where morphological adaptation of a physical robot can serve as an effective alternative to classic adaptation of control [4], and even cases where adapting morphology is the only feasible option [5].

We describe an approach that bridges the fields of embodied intelligence [6] and evolvable hardware [7] by focusing on in-environment behaviors produced by the simultaneous evolution of controller and morphology. All experiments are conducted in hardware on a dynamically reconfigurable quadruped platform (shown in Fig. 1). Embodied AI tells us that intelligent in-environment behavior arises from a strong link between morphology, controller, and environment.

¹ Although posture may change through control, the actual geometry of the robot is fixed.

arXiv:2003.13254v1 [cs.RO] 30 Mar 2020


By co-evolving morphology and controller on a hardware-reconfigurable robot, we can expect to perform a broader range of missions in more challenging scenarios than if only controller tuning were considered [8]. The relatively few previous studies on the simultaneous evolution of legged robot morphology and control have mostly performed the evolution in simulated environments [9], [10], [11], [12]. The few examples performed on real-world systems are on simple robots or require too much human intervention or time to adapt to continuously changing environments [13], [14].

Our earlier experiments on real-world evolution of legged robot morphology and control include adaptation to different internal hardware states [15]. In this paper, we tackle the challenge of real-world adaptation to external states. Specifically, we conduct experiments on adapting robot morphology and control to different types of planar surfaces, see Fig. 1.

We propose two hypotheses to investigate robotic structural adaptation in the real world:

H1: Performing an evolutionary search in diverse physical environments will result in individuals with significantly different control and morphology.

This states that the evolutionary search adapts the individuals to the environments they are evolved in and can be disproven if we observe quantitatively similar individuals after evolving on characteristically different terrains.

H2: The performance of the evolved individuals will transfer better to qualitatively similar environments.

This states that the results found in evolution will generalize and that individuals adapted to one type of terrain will perform comparably in other qualitatively similar terrains. It can be disproven if the performance of individuals is shown to be very different in similar terrains.

Our results reveal that evolving on different surfaces has a significant effect on both the control and the morphology of the robot. This supports our first hypothesis on adaptation. Subsequently, we tested the resulting individuals from the evolutionary runs on previously unseen surfaces. The results show that individuals perform best on surfaces that are qualitatively close to the one they were trained on. This supports our second hypothesis on generalization to unseen environments.

Being able to handle unknown, dynamic environments is a compelling reason for adaptation, and our ultimate goal. In this paper, we contribute an important first step by testing adaptation on a variety of indoor planar surfaces, showing that morphological adaptation is a crucial factor in unlocking heightened performance, even in these relatively innocuous environmental scenarios. Moreover, in the vein of Auerbach and Bongard’s work [16], we show that evolution can adapt to different terrains, but this time using a real-world-only search. Adaptation to these kinds of environmental differences is highly dependent on real-world evaluations as the natural noise, uncertainty, and detailed interaction dynamics can be impossible to model with current physics simulators [17].

II. BACKGROUND

In this section, we provide a brief overview of the field of evolutionary robotics, with a particular focus on gait learning, morphological adaptation, and optimization on physical robots in the real world.

Most effort in robot adaptation has focused on improving the control of a robot. Typically, the gait pattern [18], foot-tip arcs [3], or more high-level gait parameters [19] are candidates for optimization. Approaches include heuristic terrain adaptation [20] to switch gait based on, e.g., energetics [21] and power consumption [22] as terrain changes, as well as more complex hierarchical optimization-based approaches [23]. There are many methods to optimize the control of a robot, including reinforcement learning [2], transfer learning [24], and deep reinforcement learning [1]. Another popular approach is to use methods from evolutionary computation [25]. Many different types of legged robots are used in research, but doing machine learning on physical robots puts high demands on reliability and maintainability, making some robots more suitable than others [26].

Evolutionary robotics is a field that uses techniques from evolutionary computation to optimize different aspects of robots. Most work in the field has traditionally been done on virtual robots in simplified physics simulations and is only concerned with optimizing the controller [27]. Working in simplified simulations alone often results in controllers and morphologies that are hard to transfer to the real world due to inaccuracies in the modeling, a problem referred to as the reality gap. Many solutions to reduce the reality gap have been proposed, such as adding noise [28], starting in simulation and finishing in the real world [29], modeling the reality gap itself [30], and treating simulators as just another environment that needs adaptation [31]. There have been several significant contributions to solve this problem, but they have not kept up with the increased complexity in the terrains and environments of current robots [27].

There are many examples of evolutionary robotics techniques being used to evolve gaits in the real world on different physical robots, including commercial off-the-shelf legged robots like the AIBO [12], [32], as well as purpose-built custom robots [33]. Most optimize a limited set of parameters that control all the legs identically. However, there are also examples that generate separate control arcs for each leg [3], which allows adaptation to the specifics of the hardware (e.g., an actuator slipping, or delivering reduced torque due to wear).

Optimizing control can be a very effective way of adapting a robot to new tasks and environments. However, there are several examples where optimizing the body of the robot, its morphology, can also be an effective method. Due to the inherent difficulties associated with making a variable-morphology robot, most research to date on adaptive morphology on physical robots does the optimization ahead of time in simulation, then transfers a select few individuals to the real world for testing. Examples of this include legged robots [9], soft robots [34], modular robots [10], and even flying robots [35].


Fig. 2. The directional terminology used for the robot (lateral, cranial, and dorsal), as well as the names of the three links of the legs (coxa, femur, and tibia).

Morphological adaptation offers more freedom to tailor in-environment behavior, and through the lens of embodied cognition, provides the means to tightly couple control and morphology with environmental performance. Morphological adaptation approaches in hardware are now becoming increasingly viable. This is largely thanks to ongoing improvements in the quality and availability of the prerequisite robotic components, and the rapid adoption of flexible fabrication techniques, e.g., 3D printing, into the robot design process. There are broadly two approaches to achieve morphological robot adaptation: (i) optimize 3D-printable components and attach them to a robotic base to provide bespoke terrain-specific performance properties [36]; or (ii) create a single robot with built-in adaptation abilities [37]. We focus on the latter approach as it allows morphology to be changed in-situ, e.g., as a real-time response to environmental stimuli. It also opens up the possibility to incrementally learn a controller on a simpler hardware configuration, which has previously been shown to increase robustness [38].

We note that the literature shows a distinct lack of morphological variation on real-world robots, making our focus on employing evolutionary methods for optimization of the morphology of a physical quadruped robot novel.

III. MATERIALS AND METHODS

This section introduces our robot, before describing the gait controller, evolutionary setup, and physical environments used in the experiments.

A. The DyRET Robot

We used the Dynamic Robot for Embodied Testing (DyRET), a mammal-inspired quadruped robot [37]. This custom robot was developed at the University of Oslo as a platform for evolutionary experiments, in particular for simultaneous optimization of morphology and control. The design allows the robot to independently and automatically change the length of its femurs and tibias, as seen in Fig. 2.

The DyRET project is certified by the Open Source Hardware Association (OSHWA) as fully open source. All software, hardware design files, documentation, and simulation models are available online².

Fig. 3. An example of a typical foot tip trajectory, including the five control points used to calculate the spline shape (ground front, ground back, air back, air top, and air front). The solid line shows the full step, while the dashed lines show the trajectory for the previous and next steps. This view shows the side of the robot, with the front of the robot to the left of the figure.

The robot body is mostly built with 3D-printed fiber-reinforced plastic and milled aluminum parts, along with Commercial-Off-The-Shelf (COTS) parts, including carbon fiber tubing and aluminum brackets. Being able to endure bad morphology and gait combinations can require a superabundance of motor power. For this reason, keeping weight down was a priority during the design phase, while maintaining high robustness and maintainability of the platform.

Each leg features three revolute joints, with a Dynamixel MX-64 in the coxa, and Dynamixel MX-106 servos in the femur and tibia. These are connected on a common bus, and each runs a separate PD position controller. Each femur and tibia consists of a custom linear actuator that allows the length of each leg segment to be changed. Achieving repeatable, physically strong morphological reconfiguration on a robot of this size requires the use of, e.g., screw-based linear actuators, the trade-off being that the mechanism is too slow to be used actively as actuation in a dynamic gait. The femur can extend by 50 mm, and the tibia by 100 mm. Each actuator is powered by a brushed DC motor, connected to a lead screw through a chain, resulting in a linear speed of approximately 1 mm/s. An absolute encoder gives the linear actuators an accuracy of around half a millimeter.

The robot features an MTi-30 Attitude and Heading Reference System (AHRS) in the center of its body, which gives absolute orientation, rotational velocity, and linear acceleration. Reflective markers for motion capture are mounted on the robot body for measuring absolute position and orientation. Each servo has sensors for temperature, voltage, current, and joint position.

The robot is tethered to a desktop computer over USB, which runs all software on the Robot Operating System (ROS) framework. The gait controller runs at a rate of 50 Hz, exchanging commands and data to and from the servos, while the sensors and motion capture system are sampled at 100 Hz.

B. Parameterizable Control and Morphology

² https://github.com/dyret-robot/dyret_documentation

The robot uses a high-level spline-based gait controller working in Cartesian space [39]. For generating the foot tip trajectory, a three-dimensional looping cubic Hermite spline is built using five control points. These define the path the leg takes through the air, as seen in Fig. 3. All legs follow the same trajectory but are offset with a static phase shift. The adjustable parameters and their ranges can be seen in Table I. Note that the controller only generates target positions and that the actual leg movement can vary depending on the surface it is walking on.

TABLE I
PARAMETERS AND RANGES DEFINING THE SPLINE SHAPE. THE AXES ARE SHOWN IN FIG. 2.

Control point   Lateral         Cranial        Dorsal
Ground front    0               [0, 100]       0
Ground back     0               [-150, -50]    0
Air front       [-12.5, 12.5]   [25, 125]      [19, 41]
Air top         [-12.5, 12.5]   [-30, 30]      [39, 61]
Air back        [-12.5, 12.5]   [-125, -25]    [19, 41]

TABLE II
RANGES FOR GAIT AND MORPHOLOGY PARAMETERS.

Parameter        Range
Wag phase        ≈[-0.394, 0.394]
Wag amplitudes   [0, 14.0]
Lift duration    [0.13, 0.20]
Frequency        [0.25, 1.0]
Femur length     [0, 50]
Tibia length     [0, 100]
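As an illustration of the looping spline described above, the following is a minimal sketch of a closed cubic Hermite spline through five control points. The tangent rule (central differences, as in a Catmull-Rom spline), the uniform spacing of control points in time, the function name `hermite_loop`, and the example control points (mid-range values from Table I) are all assumptions made for illustration; the paper does not specify how the controller in [39] chooses tangents or timing.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]  # (lateral, cranial, dorsal)

# Hypothetical control points: mid-range values from Table I, ordered along the step.
EXAMPLE_POINTS = [
    (0.0, 50.0, 0.0),    # ground front
    (0.0, -100.0, 0.0),  # ground back
    (0.0, -75.0, 30.0),  # air back
    (0.0, 0.0, 50.0),    # air top
    (0.0, 75.0, 30.0),   # air front
]

def hermite_loop(points: Sequence[Point], t: float) -> Point:
    """Evaluate a closed cubic Hermite spline at gait phase t (wraps at 1.0)."""
    n = len(points)
    s = (t % 1.0) * n          # position along the loop, in segments
    i = int(s) % n             # index of the segment's start point
    u = s - int(s)             # local parameter in [0, 1)
    p0, p1 = points[i], points[(i + 1) % n]
    # Assumed tangents: half the difference between the neighbouring points.
    m0 = [0.5 * (b - a) for a, b in zip(points[(i - 1) % n], p1)]
    m1 = [0.5 * (b - a) for a, b in zip(p0, points[(i + 2) % n])]
    # Standard cubic Hermite basis functions.
    h00 = 2 * u**3 - 3 * u**2 + 1
    h10 = u**3 - 2 * u**2 + u
    h01 = -2 * u**3 + 3 * u**2
    h11 = u**3 - u**2
    return tuple(
        h00 * a + h10 * ma + h01 * b + h11 * mb
        for a, ma, b, mb in zip(p0, m0, p1, m1)
    )
```

The spline interpolates every control point and wraps around, so the phase-shifted legs can all sample the same curve at different values of `t`.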

In addition to the parameters controlling the spline shape, the seven global parameters are shown in Table II. Two parameters define the gait timing: the frequency of the gait (steps per second for each leg), and lift duration (the fraction of the gait period spent moving the leg through the air). We have also added a balancing wag (a counter-balancing movement) to increase the stability of the robot by leaning to the opposite side of the leg it is currently lifting. This movement is needed since the weight of each leg is high compared to the overall body weight. It is especially useful at slow speeds, where the inertia from movement alone cannot be used to counteract the shifting weight of the legs. The phase of this movement can be changed, along with separate amplitudes for the lateral and cranial directions (see Fig. 2).

For controlling the morphology, the Femur length and Tibia length parameters are used. For the experiments in this paper, all legs share the same femur and tibia lengths. The legs will reconfigure fully to the target length before the evaluation of a new gait is performed.
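The genotype values reported later (Fig. 7) lie between 0 and 1, which suggests each gene is scaled to its parameter range. A minimal sketch of such a decoding for the global parameters of Table II follows. The linear mapping, the dictionary layout, and the names `RANGES` and `decode` are assumptions for illustration, not the authors' implementation.

```python
# Ranges copied from Table II; the wag phase bound is approximate in the source.
RANGES = {
    "wag_phase": (-0.394, 0.394),
    "wag_amplitude_lateral": (0.0, 14.0),
    "wag_amplitude_cranial": (0.0, 14.0),
    "lift_duration": (0.13, 0.20),
    "frequency": (0.25, 1.0),
    "femur_length": (0.0, 50.0),
    "tibia_length": (0.0, 100.0),
}

def decode(genotype):
    """Map normalized genes in [0, 1] to parameter values (assumed linear)."""
    phenotype = {}
    for name, gene in genotype.items():
        low, high = RANGES[name]
        phenotype[name] = low + gene * (high - low)
    return phenotype
```

Under this assumption, a tibia gene of 0.4 corresponds to an extension of about 40 of the available 100 mm.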

C. Evolutionary Setup

Our evolutionary search was configured to find combinations of control and morphology that achieve both high speed and stability, and expose the Pareto front of trade-offs between these two objectives. For this, we apply the commonly used NSGA-II [40] algorithm with two fitness objectives, $F_{speed}$ and $F_{stability}$.

Speed is calculated by using the position of the robot from motion capture equipment ($P$) and can be seen in Equation 1. Stability is found using the on-board Attitude and Heading Reference System (AHRS) sensor, calculated as a combination of the standard deviations of orientation ($Ang$) and linear acceleration ($Acc$), seen in Equation 2. For our sensor, $\alpha$ has experimentally been set to 150 to give approximately equal contributions from linear acceleration and orientation.

$$F_{speed} = \frac{\lVert P_{end} - P_{start} \rVert}{time_{end} - time_{start}} \tag{1}$$

$$F_{stability} = -\sum_{i}^{axes} \left( \alpha \cdot \mathrm{std}(Acc_i) + \mathrm{std}(Ang_i) \right) \tag{2}$$

The two fitness objectives are both needed. Evolving only for speed incentivizes solutions that are at the brink of falling, and evolving only for stability incentivizes solutions that move as little as possible. Combining the two objectives results in the ability to choose robust trade-offs over a range of different speed-stability combinations.
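The two objectives in Equations 1 and 2 can be sketched as follows. The data layout (one list of samples per axis), the use of the population standard deviation, and the function names are assumptions for illustration; the paper gives only the formulas and the value of α.

```python
import math
from statistics import pstdev

ALPHA = 150.0  # weighting from the text, set experimentally for the authors' sensor

def speed_fitness(p_start, p_end, t_start, t_end):
    """Equation 1: straight-line distance covered divided by elapsed time."""
    return math.dist(p_start, p_end) / (t_end - t_start)

def stability_fitness(acc, ang, alpha=ALPHA):
    """Equation 2: negated sum over axes of the weighted standard deviations.

    acc and ang map an axis name to its list of samples (hypothetical layout).
    Population standard deviation is an assumption.
    """
    return -sum(alpha * pstdev(acc[axis]) + pstdev(ang[axis]) for axis in acc)
```

Both values are maximized: a perfectly steady robot would score a stability of 0, and any variation in acceleration or orientation pushes the score below zero.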

Our genetic algorithm uses Gaussian mutation with a probability of 1 and a sigma of 1/6, with no recombination. To prevent values accumulating around the upper and lower bounds, we continue the mutation back towards the middle of the range when we hit a maximum or minimum value.
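One way to read the bound handling described above is as a reflection at the limits: any overshoot past a bound is folded back into the range. The sketch below assumes genes normalized to [0, 1] with sigma expressed in those units; the names `reflect` and `mutate` are hypothetical and this is not taken from the authors' code.

```python
import random

SIGMA = 1.0 / 6.0  # mutation step from the text, assumed relative to a [0, 1] gene range

def reflect(value, low=0.0, high=1.0):
    """Fold a value that overshoots a bound back into [low, high]."""
    while value < low or value > high:
        value = 2 * low - value if value < low else 2 * high - value
    return value

def mutate(genotype, rng=random, sigma=SIGMA):
    """Gaussian mutation applied to every gene (mutation probability 1)."""
    return [reflect(gene + rng.gauss(0.0, sigma)) for gene in genotype]
```

Compared with clamping, reflection avoids piling mutated values up exactly at 0 or 1, which matches the motivation given in the text.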

Before evaluating a new individual, the length of the legs is changed, and the feet repositioned one at a time to the start pose of the new gait, to remove any measurement contamination from previous individuals. Each evaluation was completed when either the robot had walked one meter forwards or 10 seconds had elapsed. Previous experiments have shown this to give a reasonable estimation of the actual performance of both fast and slow gaits, regardless of stability. The robot was then manually moved back to the starting position, before the next individual was evaluated. The operator had a remote control with the ability to pause, retry, or discard individuals if anything unexpected happened. Servo temperatures were monitored, and the robot was left to cool down when any servo exceeded 60 °C. The evaluation time for a single individual was approximately 30 seconds, including setup, cooldown and other manual intervention, as well as the actual walking.

Performing evolution on a real-world robot severely limits the number of evaluations attainable for each run of the algorithm. The larger the number of evaluations, the higher the probability of damage to the robot that could potentially skew the results. Based on previous successful experimentation [15], we run 256 evaluations, encompassing 32 generations of 8 individuals. These parameters promote convergence, without wasting an excessive number of evaluations at the end of the runs on minor improvements.

D. Experimental Environments

The purpose of our experimentation is to assess how different environments affect the evolution of different individuals, looking at both performance and behavior. Robots operate in an increasing number of terrains with varying roughness, slope, discontinuity, and hardness characteristics [41]. We select four surfaces that can be characterized by the two qualitative features, shown as different dimensions in Fig. 4: hardness (soft and hard) and roughness (coarse-textured and fine-textured). In our experiments, we consider hardness to be the primary feature (used to train two distinct populations) and roughness to be a secondary feature (used to test how these populations generalize to new environments). Surface A serves as a baseline and is a hard, fine-textured surface with high friction. Surface B is also fine-textured, but very soft, so the robot’s legs easily sink into it. Surface C is hard, with large knots that give it a coarse texture. Surface D is soft, with individual strands in multiple directions giving it a coarser, non-uniform texture. These surfaces reflect features found in natural terrains like concrete, sand, hard-packed soil, and grass, respectively.

Fig. 4. The four different carpets used to approximate real-world terrains with different characteristics. Columns show hardness and rows show roughness: Surface A (hard, fine), Surface B (soft, fine), Surface C (hard, coarse), and Surface D (soft, coarse).

IV. EXPERIMENTS AND RESULTS

We conducted two experiments. We first ran several full evolutionary runs on the fine-textured hard (A) and fine-textured soft (B) surfaces. We then randomly selected 12 individuals from the final Pareto fronts and re-evaluated them on all four surfaces to observe how the evolved individuals’ performance compares on the previously unseen coarse-textured hard (C) and coarse-textured soft (D) surfaces.

Experiment 1 - Evolutionary Runs

In this experiment, we ran five evolutionary runs on surface A and five on surface B, producing two Pareto fronts of individuals evolved for the different surface hardnesses. We alternated runs on the two surfaces to make sure gradual wear-and-tear of the robot would affect results from the two surfaces equally.

Fig. 5 shows the resulting Pareto fronts from all the evolutionary runs. The individuals achieved similar performance for speeds below about 10 m/min, and higher fitness on the softer carpet (surface B) for speeds between about 10 m/min and 15 m/min. Speeds above 15 m/min were only seen on the hard carpet (surface A). For each surface, four out of five evolutionary runs had individuals that were part of the final global Pareto front.
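Pooling the fronts of several runs into one global Pareto front amounts to a non-dominated filter over all their individuals. A minimal sketch follows, with both objectives maximized; the function names and the example points are hypothetical and do not reproduce any measured data.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Invented (speed, stability) pairs from two imaginary runs, pooled together.
run_1 = [(10.0, -0.12), (9.0, -0.13)]
run_2 = [(12.0, -0.15), (12.0, -0.20)]
global_front = pareto_front(run_1 + run_2)
```

A run contributes to the global front only if at least one of its individuals survives this filter, which is the property counted in the text.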

Fig. 6 shows the development of the mean hypervolume of the Pareto fronts as the search progresses. The search appears to have converged on the hard carpet (surface A) after fewer than 100 evaluations. The search on the soft carpet (surface B), however, shows slight improvements up to the end of the evolutionary runs.

Fig. 5. Performance of all individuals from the Pareto fronts of the evolutionary runs on two different surfaces (Surface A and Surface B), plotted as stability against speed (m/min). The lines are a result of locally weighted linear regression using a nonparametric lowess model with default Seaborn parameters.

Fig. 6. Mean hypervolume of the Pareto fronts from the evolutionary runs on each surface (Surface A and Surface B), plotted against the number of evaluations. The hypervolume is calculated as the size of the dominated area under the Pareto front, to a minimum speed of 0, and a minimum stability of -1. The shaded areas show the 95% confidence interval for the mean value.
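The hypervolume used in Fig. 6 is the area dominated by the front relative to a reference point at speed 0 and stability -1. For two maximized objectives it can be computed with a simple sweep, sketched below. The function name and the example points are hypothetical; only the reference point comes from the paper.

```python
def hypervolume_2d(front, ref=(0.0, -1.0)):
    """Area dominated by (speed, stability) points, both maximized, above ref."""
    # Ignore points that do not improve on the reference in both objectives.
    points = [p for p in front if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the fastest point down, adding one horizontal strip per improvement.
    points.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_stability = ref[1]
    for speed, stability in points:
        if stability > best_stability:
            area += (speed - ref[0]) * (stability - best_stability)
            best_stability = stability
    return area
```

With speed in m/min, a front reaching roughly 15 m/min at stabilities around -0.2 gives values of the same order as the 5 to 15 range on the vertical axis of Fig. 6.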

Fig. 7 shows the parameter values of the individuals in the Pareto front for all evolutionary runs on each surface. We ran a two-sided Mann-Whitney U test on each parameter (n1 = 65, n2 = 52, p < 0.01), with a Holm-Bonferroni p-value correction, to see if the individuals found with the evolutionary search had statistically significant differences in some of their parameters for the different surfaces. For morphology, the tibia length showed significant differences (U = 2733). For the controller, we found statistically significant differences for the cranial position of the ground front control point (U = 2478), and the cranial (U = 783) and dorsal position (U = 925) of the air back control point. Note that the parameter values seen are from the full resulting Pareto front, including a range of different trade-offs between speed and stability.
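The two ingredients of this test, the U statistic and the Holm-Bonferroni step-down correction, can be sketched in plain Python as below. This is a standard-library illustration, not the authors' analysis code: converting U to a p-value is left out, the function names are hypothetical, and the example inputs in the usage note are invented. With n1 = 65 and n2 = 52, U can range from 0 to 65 × 52 = 3380, so all four reported values lie well away from the midpoint of 1690.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: pairs with x_i > y_j count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def holm_bonferroni(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted
```

For example, `holm_bonferroni([0.01, 0.04, 0.03])` gives adjusted values of approximately 0.03, 0.06, and 0.06, and a parameter would be reported as significant when its adjusted value falls below the chosen threshold.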

Fig. 8 shows the details of two selected parameters from control and both parameters from morphology. We see little difference in the length of the femur (Fig. 8c), but the tibia has a clear difference between the two surfaces (Fig. 8d). On the hard carpet (surface A), we see a very uniform distribution across the whole range. However, for the soft carpet (surface B), we see only a single individual with a tibia length above 40% of the available length.

Fig. 7. The genotypic values (0 to 1) from all individuals on the final Pareto fronts from all evolutionary runs, grouped by the surface they were evolved on (Surface A and Surface B). The parameters shown are femur length, tibia length, lift duration, frequency, wag phase, wag amplitude (l, c), ground front (l, c), ground back (l, c), air front (l, c, d), air top (l, c, d), and air back (l, c, d). Control point parameters are denoted with the axis they belong to, with l for lateral, c for cranial, and d for dorsal directions. ***Statistically significant differences (tibia length, ground front c, air back c, and air back d).

Fig. 8. Rain cloud plots showing the distributions on Surface A and Surface B for two selected control parameters (a-b) and both morphology parameters (c-d): (a) back air cranial position (mm), (b) back air dorsal position (mm), (c) femur length (mm), and (d) tibia length (mm).

To better visualize the differences in the evolved controller parameters, Fig. 9 shows kernel density estimates for the control points, as well as the mean controller splines. The mean splines were found by averaging all positions for each time step in the gait period. The kernel density estimates are shown as shaded areas and were calculated using a Gaussian kernel with Scott’s rule to determine kernel size. The figure shows small differences in the two points on the ground, as well as the front and top points in the air, but there are more substantial differences in the back air point.
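To make the density estimate concrete, here is a one-dimensional sketch of a Gaussian kernel density estimate with Scott's rule for the bandwidth. The paper's estimates are over control point positions in the side view and are therefore presumably two-dimensional; the reduction to one dimension, the use of the sample standard deviation, the function names, and the example samples are all assumptions for illustration.

```python
import math
from statistics import stdev

def scott_bandwidth(samples):
    """Scott's rule in one dimension: sample standard deviation times n^(-1/5)."""
    return stdev(samples) * len(samples) ** (-1.0 / 5.0)

def gaussian_kde_1d(samples, x):
    """Density at x as the average of Gaussian kernels centred on the samples."""
    h = scott_bandwidth(samples)
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

# Invented cranial positions (mm) of an air back control point, for illustration only.
example = [-100.0, -95.0, -90.0, -85.0, -60.0]
density_near_cluster = gaussian_kde_1d(example, -90.0)
density_far_away = gaussian_kde_1d(example, -30.0)
```

The bandwidth shrinks slowly as more individuals are added, so fronts with few members produce broad, smooth shaded regions.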

From these results, we see statistically significant effects on both control and morphology due to the impact of evolving on different surfaces.

Fig. 9. The solid lines show the mean trajectory splines of all Pareto-optimal individuals from each evolutionary run, for Surface A and Surface B. The shaded areas show the kernel density estimates for the spline control points. The plot is seen from the side of the robot, with the front of the robot to the left of the figure.

Experiment 2 - Verification

In this experiment, we verify the results from the evolutionary runs by re-evaluating a subset of the individuals multiple times on the surface they were evolved on, as well as observing the performance on unseen surfaces. We selected six random individuals from the resulting Pareto fronts of the evolutionary runs for each surface. We then tested these twelve individuals 20 times each on all four surfaces, for a total of 80 new evaluations per individual.

Fig. 10 shows the performance of all the re-evaluated individuals. The performance on surfaces A and C (both in shades of blue), and B and D (both in shades of orange) is grouped in most of these plots. This suggests that the performance of individuals tends to be grouped when hardness, our primary feature, is similar.

Fig. 10. Performance measurements for all re-evaluated individuals on Surfaces A, B, C, and D, with the same axes as Fig. 5. Each plot contains the 80 evaluations, distributed across the four surfaces, of one individual. The individuals in the two top rows were initially evolved for surface A, while the bottom two were evolved for surface B. Note that the scaling is different for each plot to best show relative differences in performance.

To quantify these differences in performance between the different surfaces, we first did min-max normalization of speed and stability. We then measured the Euclidean distance between the mean performance on the different surfaces, for each individual. The mean results over all individuals are shown in Table III. This distance matrix shows us the differences in performance between the different surfaces; smaller values show a more similar performance. We see that the two combinations of surfaces with the smallest difference in performance are surfaces A and C, and B and D. We also see that performance on surface B is closer to both A and C than performance on surface D is.

TABLE III
DISTANCE MATRIX SHOWING THE MEAN NORMALIZED EUCLIDEAN DISTANCE BETWEEN EVALUATIONS ON DIFFERENT SURFACES.

            Surface A   Surface C   Surface B
Surface D   .108        .106        .048
Surface B   .073        .082
Surface C   .053
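The distance computation behind Table III can be sketched for a single individual and a single pair of surfaces as follows. The paper does not state over which set of evaluations the min-max bounds are taken, so the sketch takes them as an explicit argument; the function names, the bounds, and the example evaluations are hypothetical.

```python
import math
from statistics import mean

def normalise(points, bounds):
    """Min-max normalize (speed, stability) pairs with the given bounds."""
    (s_low, s_high), (q_low, q_high) = bounds
    return [
        ((s - s_low) / (s_high - s_low), (q - q_low) / (q_high - q_low))
        for s, q in points
    ]

def mean_point(points):
    """Mean speed and mean stability of a set of evaluations."""
    return (mean(p[0] for p in points), mean(p[1] for p in points))

def surface_distance(evals_a, evals_b, bounds):
    """Euclidean distance between mean normalized performance on two surfaces."""
    a = mean_point(normalise(evals_a, bounds))
    b = mean_point(normalise(evals_b, bounds))
    return math.dist(a, b)

# Invented evaluations of one individual on two surfaces, with assumed bounds.
bounds = ((0.0, 20.0), (-0.3, 0.0))
d = surface_distance([(10.0, -0.15), (12.0, -0.15)], [(9.0, -0.12), (9.0, -0.12)], bounds)
```

Averaging this distance over all twelve individuals for every pair of surfaces would give one entry of the matrix per pair.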

From these results, we see that individuals display similar performance on new surfaces that are closer to the surface they were evolved on.

V. DISCUSSION

The results in Experiment 1 showed statistically significant effects on both control and morphology due to the impact of evolving on different surfaces. This directly supports our first hypothesis (H1) and suggests that there is a benefit of optimizing morphology for different surfaces. Moreover, the experiment shows that the evolutionary search was able to find suitable individuals adapted to the real-world characteristics of the surfaces, using only real-world evaluations.

The results of Experiment 2 showed that individuals display similar performance on new surfaces that are closer to the surface of their evolution. This supports the second hypothesis (H2) and suggests that the results apply to qualitatively similar surfaces. Overall, these two experiments suggest that an archive of different individuals, evolved for a range of different terrains in an offline fashion, could be effectively utilized on new terrains by selecting previously found individuals from similar surfaces in the archive. The archived individuals could then be further improved by local search or other techniques if needed.

We believe that the difference in maximum achieved speed between the two surfaces, as seen in Fig. 5, is related to the difference in leg length seen in Fig. 8c/d. The morphologies found for surface A have longer legs, and are thus able to achieve higher leg speeds for the same joint velocity. This also fits well with the qualitative estimate of walking difficulty, as softer surfaces typically require more power and torque to walk in. Between about 10 m/min and 15 m/min, however, individuals evolved on surface B outperform individuals evolved for surface A. This is most likely because the search process inherently spends more effort on improving individuals at the ends of the Pareto front, aided by the crowding distance metric in the NSGA-II algorithm. Small performance increases in this area for surface B will result in the individuals being kept. For surface A, this effort might instead be spent on trying to find faster individuals, which is inherently more difficult, and requires considerable effort.
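The remark about the ends of the Pareto front can be made concrete with the crowding distance from NSGA-II [40]: in each objective, the boundary solutions receive an infinite distance, so they are always preferred among solutions of equal rank. The following is a minimal sketch of that standard computation, not the code used in the experiments; the function name and the example values are illustrative.

```python
import math


def crowding_distance(front):
    """Crowding distance for each point of one non-dominated front.

    front is a list of objective tuples; the result has one value per point,
    in the same order as the input.
    """
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for m in range(len(front[0])):
        # Sort point indices by the m-th objective.
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        # Boundary points are always kept.
        dist[order[0]] = dist[order[-1]] = math.inf
        if hi == lo:
            continue
        # Interior points: normalized gap between their two neighbours.
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


if __name__ == "__main__":
    print(crowding_distance([(0, 4), (1, 3), (2, 1), (4, 0)]))
```

For the four-point front in the example, the two end points get an infinite distance and the interior points get 1.25 and 1.5, so the extremes of the front survive selection ahead of the middle.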

Fig. 7 shows the differences between the individuals from the Pareto fronts on the two different surfaces. There is a very large variance in most of the parameters, but this is not surprising since the groups we are comparing contain the whole range of individuals from the Pareto front. One would expect less variation if comparing individuals evolved for only a single objective. Still, we can observe some clear differences between the sets of individuals evolved for different surfaces. One might think that the parts of the search space that are not populated by any individuals would point to unusable values that should be removed so as to make the search space smaller and thus the search more effective. However, these areas might be necessary for other environments or tasks than what we have used here.

The fact that individuals from four out of five of the evolutionary runs on each surface are part of the global Pareto fronts shows that we did not have substantial changes to the robot or surfaces during our experiments. Gradual damage to the robot or changes to the surface during operation might result in individuals from early evaluations getting better performance. However, the fact that both the first and last evolutionary run on each surface resulted in globally Pareto-optimal individuals suggests this is not an issue in our experiments.
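The check described above amounts to merging the final populations of all runs on a surface, extracting the non-dominated set, and seeing which runs contribute to it. A minimal sketch follows, assuming all objectives are maximized; the data layout, the function names, and the example values are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def global_front(runs):
    """Merge all runs and keep the non-dominated (run_id, point) pairs.

    runs maps a run identifier to a list of objective tuples.
    """
    merged = [(rid, p) for rid, points in runs.items() for p in points]
    return [(rid, p) for rid, p in merged
            if not any(dominates(q, p) for _, q in merged)]


def contributing_runs(runs):
    """Sorted identifiers of the runs with at least one individual on the global front."""
    return sorted({rid for rid, _ in global_front(runs)})


if __name__ == "__main__":
    # Made-up objective values for three runs, for illustration only.
    example = {1: [(5, 1), (3, 3)], 2: [(4, 2), (1, 1)], 3: [(2, 2)]}
    print(contributing_runs(example))
```

If early runs were systematically over-represented on the merged front, that would hint at gradual degradation of the robot or the surface over the course of the experiments.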

One of the most substantial differences in the leg trajectories can be seen where the leg lifts from the ground, at the back of the step (to the right in Fig. 9). The leg trajectory takes a sharper turn with a more direct path on the soft surface (surface B), while the movement is more rounded on the hard surface (surface A). The legs of the robot sink into the soft surfaces, which means that there is a larger difference between commanded and actual movement close to the ground than for the harder surfaces. This might explain the sharper transition of the trajectory for the soft surface (surface B), as the softness of the terrain provides inherent damping of the movement, and the trajectory can then be sharper than on harder surfaces without this damping.

Performing evolutionary optimization on real-world robots involves a trade-off between search effectiveness on one side and platform performance degradation on the other. While the number of evaluations in our runs was relatively small compared to typical evolutionary experiments using, e.g., simulated robots, the results showed reasonable convergence and performance to support our hypotheses. However, the approach could benefit from hybridization with other approaches, e.g., using physics simulation as a starting point [42], [36] or surrogate models [43], [44] to improve the data-efficiency in the search process.

We have designed a platform with self-modifying morphology, enabling practical real-world evolution of morphology and control, and thus exploiting environmental features hard to model in simulations. We showed this for internal conditions, namely actuator power, in [15], and external conditions, like surface roughness, in this paper. While the designed platform optimizes its morphology, it comes at the cost of additional hardware required for the mechanism, adding significantly to the total mass of the system. Future robots featuring new materials and mechanisms may allow morphological reconfiguration with less overhead, which might make it even more applicable to real-world problems.

There are many avenues for future work arising from this paper. When we compare how the individuals perform on different surfaces, we mainly look at the difference in performance. Analyzing the behavior instead would allow a more thorough investigation into the underlying effects, but would require more effort both during the experiments and after. In the same vein, looking at the actual movement of the legs instead of the target position of the actuators could yield new insight into the effects different terrains have on the robot. Bringing simulation or surrogate models into the mix could allow longer evolutionary runs with better convergence at fewer physical evaluations in the real world. There is also a range of other optimization techniques that could be used instead of or in combination with evolutionary search.

VI. CONCLUSION

In this paper, we showed that evolutionary optimization on a real-world legged robot adapts both morphology and control to different external environments, suggesting that such capabilities could be a key feature of future adaptive robots.

First, we investigated what effects different ground surfaces have on evolved individuals. We observed that there are statistically significant differences in both the control and morphology of the evolved individuals, showing that the search can adapt the robot to different physical environments. We also investigated generalization by testing individuals on previously unseen surfaces, where we observed lower performance differences on surfaces qualitatively similar to the ones they were initially evolved on.

Our results suggest that real-world evolutionary optimization is a suitable technique for adapting both the body and control of a physical legged robot to new physical environments. We also demonstrated how evolved individuals transfer better to qualitatively similar surfaces. This means that for many applications, one might not need to perform continuous adaptation to new environments during operation, but could instead leverage an archive of different morphology-controller pairs generated in safer, but qualitatively similar, indoor terrains.

Our self-modifying quadruped platform allowed us to perform a large number of real-world evaluations and thus discover solutions exploiting the physical characteristics of different surfaces. We think this extension into the real-world domain is a promising approach for the simultaneous optimization of morphology and control for legged robots. We aim to bring our work out of the lab and into realistic terrains outside, and hope our work inspires others to also take on the exciting challenges faced when doing evolutionary robotics in the real world.

REFERENCES

[1] J. Hwangbo, J. Lee, A. Dosovitskiy, D. Bellicoso, V. Tsounis, V. Koltun, and M. Hutter, “Learning agile and dynamic motor skills for legged robots,” Science Robotics, vol. 4, no. 26, 2019. [Online]. Available: https://robotics.sciencemag.org/content/4/26/eaau5872

[2] N. Kohl and P. Stone, “Policy gradient reinforcement learning for fast quadrupedal locomotion,” in IEEE International Conference on Robotics and Automation (ICRA 2004), vol. 3. IEEE, 2004, pp. 2619–2624.

[3] H. Heijnen, D. Howard, and N. Kottege, “A testbed that evolves hexapod controllers in hardware,” in 2017 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017, pp. 1065–1071.

[4] S. Kriegman, S. Walker, D. Shah, M. Levin, R. Kramer-Bottiglio, and J. Bongard, “Automated shapeshifting for function recovery in damaged robots,” Robotics: Science and Systems, 2019.

[5] G. Picardi, H. Hauser, C. Laschi, and M. Calisti, “Morphologically induced stability on an underwater legged robot with a deformable body,” The International Journal of Robotics Research, 2019.

[6] D. Howard, A. E. Eiben, D. F. Kennedy, J.-B. Mouret, P. Valencia, and D. Winkler, “Evolving embodied intelligence from materials to machines,” Nature Machine Intelligence, vol. 1, no. 1, p. 12, 2019.

[7] G. W. Greenwood and A. M. Tyrrell, Introduction to Evolvable Hardware: A Practical Guide for Designing Self-Adaptive Systems (IEEE Press Series on Computational Intelligence). Wiley-IEEE Press, 2006.

[8] T. F. Nygaard, C. P. Martin, J. Torresen, and K. Glette, “Exploring mechanically self-reconfiguring robots for autonomous design,” 2018 ICRA Workshop on Autonomous Robot Design, 2018. [Online]. Available: http://arxiv.org/abs/1805.02965

[9] T. F. Nygaard, E. Samuelsen, and K. Glette, “Overcoming initial convergence in multi-objective evolution of robot control and morphology using a two-phase approach,” in Applications of Evolutionary Computation, G. Squillero and K. Sim, Eds. Springer International Publishing, 2017, pp. 825–836.

[10] J. E. Auerbach, A. Concordel, P. M. Kornatowski, and D. Floreano, “Inquiry-based learning with RoboGen: An open-source software and hardware platform for robotics and artificial intelligence,” IEEE Transactions on Learning Technologies, vol. 12, no. 3, pp. 356–369, 2019. [Online]. Available: http://infoscience.epfl.ch/record/255227

[11] K. Miras and A. Eiben, “Effects of environmental conditions on evolved robot morphologies and behavior,” in Proceedings of the Genetic and Evolutionary Computation Conference, 2019, pp. 125–132.

[12] G. Hornby, M. Fujita, S. Takamura, T. Yamamoto, and O. Hanagata, “Autonomous evolution of gaits with the Sony quadruped robot,” in Genetic and Evolutionary Computation Conference, vol. 2, 1999, pp. 1297–1304.

[13] V. Vujovic, A. Rosendo, L. Brodbeck, and F. Iida, “Evolutionary developmental robotics: Improving morphology and control of physical robots,” Artificial Life, 2017.

[14] M. Jelisavcic, M. de Carlo, E. Hupkes, P. Eustratiadis, J. Orlowski, E. Haasdijk, J. E. Auerbach, and A. E. Eiben, “Real-world evolution of robot morphologies: A proof of concept,” Artificial Life, vol. 23, no. 2, pp. 206–235, 2017.

[15] T. F. Nygaard, C. P. Martin, E. Samuelsen, J. Torresen, and K. Glette, “Real-world evolution adapts robot morphology and control to hardware limitations,” in Proceedings of the Genetic and Evolutionary Computation Conference. ACM, 2018.

[16] J. E. Auerbach and J. C. Bongard, “Environmental influence on the evolution of morphological complexity in machines,” PLoS computational biology, vol. 10, no. 1, 2014.

[17] A. E. Eiben, “Grand challenges for evolutionary robotics,” Frontiers in Robotics and AI, vol. 1, p. 4, 2014.

[18] J. D. Weingarten, G. A. Lopes, M. Buehler, R. E. Groff, and D. E. Koditschek, “Automated gait adaptation for legged robots,” in Robotics and Automation, 2004. Proceedings. ICRA’04. 2004 IEEE International Conference on, vol. 3. IEEE, 2004, pp. 2153–2158.

[19] T. F. Nygaard, J. Torresen, and K. Glette, “Multi-objective evolution of fast and stable gaits on a physical quadruped robotic platform,” in 2016 IEEE Symposium Series on Computational Intelligence (SSCI), Dec 2016.

[20] T. Homberger, M. Bjelonic, N. Kottege, and P. V. K. Borges, “Terrain-dependent motion adaptation for hexapod robots,” in To appear in proceedings of the International Symposium on Experimental Robotics (ISER), 2016.

[21] N. Kottege, C. Parkinson, P. Moghadam, A. Elfes, and S. P. Singh, “Energetics-informed hexapod gait transitions across terrains,” in IEEE International Conference on Robotics and Automation (ICRA), 2015.

[22] B. Jin, C. Chen, and W. Li, “Power consumption optimization for a hexapod walking robot,” Journal of Intelligent & Robotic Systems, vol. 71, no. 2, pp. 195–209, 2013.

[23] C. Dario Bellicoso, C. Gehring, J. Hwangbo, P. Fankhauser, and M. Hutter, “Perception-less terrain adaptation through whole body control and hierarchical optimization,” in 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), Nov 2016, pp. 558–564.

[24] J. Degrave, M. Burm, P.-J. Kindermans, J. Dambre et al., “Transfer learning of gaits on a quadrupedal robot,” Adaptive Behavior, pp. 4486–4491, 2015.

[25] S. Doncieux, N. Bredeche, J.-B. Mouret, and A. E. G. Eiben, “Evolutionary robotics: What, why, and where to,” Frontiers in Robotics and AI, vol. 2, p. 4, 2015.

[26] T. F. Nygaard, J. Nordmoen, K. O. Ellefsen, C. P. Martin, J. Tørresen, and K. Glette, “Experiences from real-world evolution with DyRET: Dynamic robot for embodied testing,” in Symposium of the Norwegian AI Society. Springer, 2019, pp. 58–68.

[27] J.-B. Mouret and K. Chatzilygeroudis, “20 years of reality gap: a few thoughts about simulators in evolutionary robotics,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion. ACM, 2017, pp. 1121–1124.

[28] N. Jakobi, P. Husbands, and I. Harvey, “Noise and the reality gap: The use of simulation in evolutionary robotics,” in Advances in artificial life. Springer, 1995, pp. 704–720.

[29] S. Nolfi, D. Floreano, O. Miglino, and F. Mondada, “How to evolve autonomous robots: Different approaches in evolutionary robotics,” Artificial life IV: Proceedings of the 4th International Workshop on Artificial Life, pp. 190–197, 1994, R. A. Brooks and P. Maes (eds.).

[30] S. Koos, J.-B. Mouret, and S. Doncieux, “The transferability approach: Crossing the reality gap in evolutionary robotics,” IEEE Transactions on Evolutionary Computation, vol. 17, no. 1, pp. 122–145, 2013.

[31] J. Nordmoen, T. F. Nygaard, K. O. Ellefsen, and K. Glette, “Evolved embodied phase coordination enables robust quadruped robot locomotion,” in Proceedings of the Genetic and Evolutionary Computation Conference, ser. GECCO ’19. Association for Computing Machinery, 2019, p. 133–141. [Online]. Available: https://doi.org/10.1145/3321707.3321762

[32] S. Chernova and M. Veloso, “An evolutionary approach to gait learning for four-legged robots,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2004), vol. 3. IEEE, 2004, pp. 2562–2567.

[33] J. Yosinski, J. Clune, D. Hidalgo, S. Nguyen, J. Zagal, and H. Lipson, “Evolving robot gaits in hardware: the HyperNEAT generative encoding vs. parameter optimization,” in Proceedings of the 20th European Conference on Artificial Life, 2011, pp. 890–897.

[34] S. Kriegman, A. M. Nasab, D. Shah, H. Steele, G. Branin, M. Levin, J. Bongard, and R. Kramer-Bottiglio, “Scalable sim-to-real transfer of soft robot designs,” arXiv:1911.10290 [cs.RO], 2019.

[35] K. Rosser, J. Kok, J. Chahl, and J. Bongard, “Sim2real gap is non-monotonic with robot complexity for morphology-in-the-loop flapping wing design,” arXiv:1910.13790 [cs.RO], 2019.

[36] J. Collins, W. Geles, D. Howard, and F. Maire, “Towards the targeted environment-specific evolution of robot components,” in Proceedings of the Genetic and Evolutionary Computation Conference. ACM, 2018, pp. 61–68.

[37] T. F. Nygaard, C. P. Martin, J. Torresen, and K. Glette, “Self-Modifying Morphology Experiments with DyRET: Dynamic Robot for Embodied Testing,” in 2019 IEEE International Conference on Robotics and Automation (ICRA), May 2019.


[38] J. Bongard, “Morphological change in machines accelerates the evolution of robust behavior,” Proceedings of the National Academy of Sciences, vol. 108, no. 4, pp. 1234–1239, 2011. [Online]. Available: https://www.pnas.org/content/108/4/1234

[39] T. F. Nygaard, C. P. Martin, J. Torresen, and K. Glette, “Evolving Robots on Easy Mode: Towards a Variable Complexity Controller for Quadrupeds,” in Applications of Evolutionary Computation, P. Kaufmann and P. A. Castillo, Eds. Springer International Publishing, 2019, pp. 616–632.

[40] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, 2002.

[41] A. Howard and H. Seraji, “Vision-based terrain characterization and traversability assessment,” Journal of Robotic Systems, vol. 18, no. 10, pp. 577–587, 2001.

[42] J. C. Zagal and J. Ruiz-Del-Solar, “Combining simulation and reality in evolutionary robotics,” J. Intell. Robotics Syst., vol. 50, no. 1, pp. 19–39, Sep. 2007.

[43] A. Gaier, A. Asteroth, and J.-B. Mouret, “Data-efficient exploration, optimization, and modeling of diverse designs through surrogate-assisted illumination,” in Proceedings of the Genetic and Evolutionary Computation Conference, ser. GECCO ’17. New York, NY, USA: Association for Computing Machinery, 2017, p. 99–106. [Online]. Available: https://doi.org/10.1145/3071178.3071282

[44] K. Chatzilygeroudis, R. Rama, R. Kaushik, D. Goepp, V. Vassiliades, and J. Mouret, “Black-box data-efficient policy search for robotics,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep. 2017, pp. 51–58.
