
12 INTERNATIONAL JOURNAL OF COMPUTATIONAL COGNITION (HTTP://WWW.YANGSKY.COM/YANGIJCC.HTM), VOL. 3, NO. 1, MARCH 2005

Genetic Algorithms Adaptive Control for an Underactuated System

Faical Mnif and Jawhar Ghommem

Abstract: In this paper we develop an intelligent controller on the basis of Genetic Algorithms for the stabilization of a class of underactuated mechanical systems. First, we develop a static state feedback controller with an inner noncollocated partial feedback linearization loop to stabilize the nominal system. Second, we develop a Genetic Algorithm controller to maintain system stabilization and to adapt the controller variables to changes in the system parameters. Simulation results obtained on a 2-DOF whirling pendulum are given to illustrate our approach. Copyright © 2003-2005 Yang's Scientific Research Institute, LLC. All rights reserved.

Index Terms: Underactuated systems, whirling pendulum, genetic algorithms, adaptive.

I. INTRODUCTION

A mechanical system may be underactuated in several ways. The most obvious is by intentional design, as in the brachiation robot of Fukuda [1], the passive walker of McGeer [2], the Acrobot [3] or the Pendubot [4]. Underactuated systems also arise in mobile robot systems. A third way that underactuated systems arise is through the mathematical model used for control design, for example when joint flexibility is added to the model. It is also interesting to note that certain control problems for fully actuated redundant robots are similar to those for underactuated robots [5]. The class of underactuated mechanical systems is thus rich in both applications and problems.

In the last decade, control researchers have given considerable attention to many examples of control problems associated with underactuated mechanical systems. The key feature of many of these problems is the nonlinear coupling between the directly actuated degrees of freedom and the unactuated degrees of freedom. Spong [4] showed that partial feedback linearization may be applied to the control of underactuated mechanical systems.

Control of underactuated systems is rather challenging even in the absence of parameter uncertainties; nevertheless, it is important to formulate and address robust stabilization/tracking problems for classes of underactuated systems. Using conventional techniques, designing robust controllers for underactuated systems can be quite challenging. By making use of heuristics, intelligent control techniques can potentially greatly simplify the synthesis of a controller for such plants. This paper examines the design of an adaptive Genetic Algorithm (GA) together with a noncollocated partial feedback linearization technique to stabilize all degrees of freedom, including those that are not directly actuated, through this nonlinear coupling. The GA, introduced by Holland [10], is an optimization routine based on principles of genetics and evolution; it performs a directed random search of a population of controllers to determine which one is the best to implement. We use this feature to optimize and adapt controller parameters on-line for a system whose parameters can change or are not perfectly known. The GA has found numerous applications in off-line optimization in both signal processing and control problems. In this paper, we present an on-line algorithm that uses the GA for adaptive control.

Manuscript received August 10, 2003; revised August 29, 2003.

Faical Mnif (a) and Jawhar Ghommem (b). (a) Sultan Qaboos University, Department of Electrical and Computer Engineering, P.O. Box 33, Muscat 123, Oman (on leave from the Institut National des Sciences Appliquées et de Technologie, INSAT, Tunisia). (b) Institut National des Sciences Appliquées et de Technologie, INSAT, Tunisia. Email: [email protected] (F. Mnif), [email protected] (F. Mnif), Jo [email protected] (J. Ghommem)

Publisher Item Identifier S 1542-5908(05)10102-X/$20.00. Copyright © 2003-2005 Yang's Scientific Research Institute, LLC. All rights reserved. The online version posted on September 1, 2003 at http://www.YangSky.com/ijcc31.htm

The system considered for our study is the whirling pendulum. The whirling pendulum is a system consisting of an inverted pendulum connected to an arm rotating within a horizontal plane. It is a spatial robot with two degrees of freedom and a single control input (i.e. it is underactuated). Because it is underactuated and has complicated nonlinear dynamics, the whirling pendulum is a good testbed for the development of nonconventional advanced control techniques and for comparative analysis between control methods.

This paper is organized as follows. First we describe the physical dynamic equations of the whirling pendulum. Then we present the partial noncollocated feedback linearization used on the system. The next two sections describe the GA optimization method and give simulation results.

II. THE WHIRLING PENDULUM

The whirling pendulum is shown in Fig. 1. It is a spatial pendulum whose suspension point is attached to another mass M by means of a vertical shaft, as shown. The plane of the pendulum is orthogonal to the radial arm of length R. The shaft is subject to the torque u. We ignore frictional effects here.

l: pendulum length
I: bob inertia about its center of gravity
m: pendulum bob mass
M: whirling mass
g: gravitational acceleration
R: radius of arm
$\tau$: shaft torque



Fig. 1. The whirling pendulum.

$\theta$: angle of the pendulum from the upward vertical
$\phi$: angle of mass M from a fixed vertical plane.

Erect an x-y-z coordinate system, with the z axis vertical along the shaft and the x-y plane in the plane of the horizontal rod. Denote the angle of the horizontal rod with respect to the positive x-axis by $\phi$. Refer to Fig. 2.

Fig. 2. Looking down on the whirling pendulum.

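The coordinates of the mass M are $x = R\cos\phi$, $y = R\sin\phi$ and $z = 0$, and so the velocity is

$$\dot{x} = -R\dot{\phi}\sin\phi, \quad \dot{y} = R\dot{\phi}\cos\phi, \quad \dot{z} = 0.$$

The kinetic energy of the mass M is therefore

$$K_M = \frac{M}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) = \frac{1}{2} M R^2 \dot{\phi}^2.$$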

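The coordinates of the pendulum bob with mass m are

$$x = R\cos\phi - l\sin\theta\sin\phi,$$
$$y = R\sin\phi + l\sin\theta\cos\phi,$$
$$z = l\cos\theta.$$

The velocity of the bob is the vector with components

$$\dot{x} = -R\dot{\phi}\sin\phi - l\dot{\phi}\sin\theta\cos\phi - l\dot{\theta}\cos\theta\sin\phi$$
$$\dot{y} = R\dot{\phi}\cos\phi - l\dot{\phi}\sin\theta\sin\phi + l\dot{\theta}\cos\theta\cos\phi$$
$$\dot{z} = -l\dot{\theta}\sin\theta$$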

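The kinetic energy of the bob, including the rotational term due to its inertia I, is thus given by

$$K_m = \frac{1}{2} I \dot{\theta}^2 + \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) = \frac{1}{2} I \dot{\theta}^2 + \frac{m}{2}\left[R^2\dot{\phi}^2 + l^2\dot{\phi}^2\sin^2\theta + l^2\dot{\theta}^2 + 2Rl\,\dot{\phi}\dot{\theta}\cos\theta\right]. \tag{1}$$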
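The step from the velocity components to the bracket in (1) can be made explicit. The following is a short worked sketch; the shorthands $a$ and $b$ are introduced here for convenience and are not part of the paper's notation.

```latex
\begin{aligned}
a &= R\dot\phi + l\dot\theta\cos\theta, \qquad b = l\dot\phi\sin\theta,\\
\dot x &= -a\sin\phi - b\cos\phi, \qquad \dot y = a\cos\phi - b\sin\phi,\\
\dot x^2 + \dot y^2 &= a^2 + b^2
  = R^2\dot\phi^2 + 2Rl\,\dot\phi\dot\theta\cos\theta + l^2\dot\theta^2\cos^2\theta + l^2\dot\phi^2\sin^2\theta,\\
\dot x^2 + \dot y^2 + \dot z^2 &= R^2\dot\phi^2 + l^2\dot\phi^2\sin^2\theta + l^2\dot\theta^2 + 2Rl\,\dot\phi\dot\theta\cos\theta .
\end{aligned}
```

The cross terms in $\sin\phi\cos\phi$ cancel, so the kinetic energy does not depend on the arm angle $\phi$ itself, only on its rate.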

The potential energy of the system is

$$V = mgl\cos\theta \tag{2}$$

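Euler-Lagrange dynamic equations

The equations of motion can be obtained using the Euler-Lagrange formulation as

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = \tau \tag{3}$$

where $L = K - V$ and $K = K_m + K_M$. The differential equations describing the system are then

$$\tau_1 = \left[MR^2 + m(R^2 + l^2\sin^2\theta)\right]\ddot{\phi} + mRl\cos\theta\,\ddot{\theta} + ml^2\sin(2\theta)\,\dot{\phi}\dot{\theta} - mlR\sin\theta\,\dot{\theta}^2 \tag{4}$$

$$0 = mlR\cos\theta\,\ddot{\phi} + (I + ml^2)\ddot{\theta} - ml^2\sin\theta\cos\theta\,\dot{\phi}^2 - mgl\sin\theta \tag{5}$$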
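As an illustration of how (4) and (5) can be used numerically, the sketch below solves the two equations for the accelerations of the arm and the pendulum at a given state. The function name, the parameter dictionary and the value of g are assumptions made for this example; the mass and length values are the ones quoted in the paper's simulation section.

```python
import math

# Masses and lengths as quoted in the simulation section; g = 9.81 is an assumed value.
PARAMS = dict(M=0.5, m=0.2, l=0.5, R=0.5, I=0.0, g=9.81)

def accelerations(phi, theta, dphi, dtheta, tau, p=PARAMS):
    """Solve equations (4)-(5) for (ddphi, ddtheta) given the state and shaft torque."""
    M, m, l, R, I, g = (p[k] for k in ("M", "m", "l", "R", "I", "g"))
    s, c = math.sin(theta), math.cos(theta)
    m11 = M * R**2 + m * (R**2 + l**2 * s**2)
    m12 = m * l * R * c
    m22 = I + m * l**2
    # Move the velocity-dependent and gravity terms to the right-hand side.
    rhs1 = tau - m * l**2 * math.sin(2 * theta) * dphi * dtheta + m * l * R * s * dtheta**2
    rhs2 = m * l**2 * s * c * dphi**2 + m * g * l * s
    det = m11 * m22 - m12**2  # determinant of the 2x2 inertia matrix
    ddphi = (m22 * rhs1 - m12 * rhs2) / det
    ddtheta = (m11 * rhs2 - m12 * rhs1) / det
    return ddphi, ddtheta
```

Note that phi does not appear on the right-hand side: the dynamics depend on the arm angle only through its rate, which is why the arm position is not fixed by the equilibrium condition.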

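In compact form, the system can be written as

$$M(q)\ddot{q} + C(q, \dot{q})\dot{q} + G(q) = u \tag{6}$$

where

$$q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} \phi \\ \theta \end{bmatrix}$$

and

$$M(q) = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} = \begin{bmatrix} MR^2 + m(R^2 + l^2\sin^2\theta) & mlR\cos\theta \\ mlR\cos\theta & I + ml^2 \end{bmatrix}$$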

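$$C(q, \dot{q}) = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \begin{bmatrix} ml^2\sin\theta\cos\theta\,\dot{\theta} & ml^2\sin\theta\cos\theta\,\dot{\phi} - mlR\sin\theta\,\dot{\theta} \\ -ml^2\sin\theta\cos\theta\,\dot{\phi} & 0 \end{bmatrix}$$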

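$$G(q) = \begin{bmatrix} g_1 \\ g_2 \end{bmatrix} = \begin{bmatrix} 0 \\ -mgl\sin\theta \end{bmatrix} \quad \text{and} \quad u = \begin{bmatrix} \tau_1 \\ 0 \end{bmatrix}.$$

We also write

$$C(q, \dot{q})\dot{q} = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}.$$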

Property 1

The generalized mass matrix $M(q)$ is symmetric and positive definite.

Property 2

It is also easy to verify that $\dot{M}(q) - 2C(q, \dot{q})$ is a skew-symmetric matrix.
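Both properties can be checked numerically at sample states. The sketch below is illustrative only: it assumes a Coriolis matrix built from the Christoffel symbols of $M(q)$, which is one valid choice of $C$ that reproduces the velocity terms of (4)-(5); the function names and default parameter values are chosen for this example.

```python
import math

def inertia_and_coriolis(theta, dphi, dtheta, M=0.5, m=0.2, l=0.5, R=0.5, I=0.0):
    """Return M(q), its time derivative, and a Christoffel-based C(q, dq) as 2x2 lists."""
    s, c = math.sin(theta), math.cos(theta)
    Mq = [[M*R**2 + m*(R**2 + l**2*s**2), m*l*R*c],
          [m*l*R*c, I + m*l**2]]
    Mdot = [[2*m*l**2*s*c*dtheta, -m*l*R*s*dtheta],
            [-m*l*R*s*dtheta, 0.0]]
    # Assumed form of C (Christoffel symbols of the first kind).
    C = [[m*l**2*s*c*dtheta, m*l**2*s*c*dphi - m*l*R*s*dtheta],
         [-m*l**2*s*c*dphi, 0.0]]
    return Mq, Mdot, C

def check_properties(theta, dphi, dtheta, tol=1e-12):
    """Return (symmetric, positive_definite, skew) flags for Properties 1 and 2."""
    Mq, Mdot, C = inertia_and_coriolis(theta, dphi, dtheta)
    symmetric = Mq[0][1] == Mq[1][0]
    # Sylvester's criterion for a 2x2 matrix: leading minors must be positive.
    pos_def = Mq[0][0] > 0 and Mq[0][0]*Mq[1][1] - Mq[0][1]**2 > 0
    S = [[Mdot[i][j] - 2*C[i][j] for j in range(2)] for i in range(2)]
    skew = all(abs(S[i][j] + S[j][i]) <= tol for i in range(2) for j in range(2))
    return symmetric, pos_def, skew
```

A few random states are not a proof, but a failed check would point to a sign or factor error in the matrices.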

III. PARTIAL NONCOLLOCATED FEEDBACK LINEARIZATION CONTROL FOR THE WHIRLING PENDULUM

A. Controller Design

An interesting property that holds for the entire class of underactuated mechanical systems is the so-called noncollocated partial feedback linearization property [4], which is a consequence of the positive definiteness of the inertia matrix. Noncollocated linearization refers to linearizing the passive degrees of freedom, here $\theta$. The choice of noncollocated feedback linearization control is natural here, since $\theta$, which corresponds to the unactuated degree of freedom, is taken as the output of our system.

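Recall that the dynamical equations for the whirling pendulum are given by two coupled second-order differential equations

$$m_{11}\ddot{q}_1 + m_{12}\ddot{q}_2 + h_1 + g_1 = \tau \tag{7}$$

$$m_{12}\ddot{q}_1 + m_{22}\ddot{q}_2 + h_2 + g_2 = 0 \tag{8}$$

where $\tau = \tau_1$ is the shaft torque. By solving for $\ddot{q}_2$ in equation (8) and substituting into (7), we may rewrite equation (7) as

$$\bar{m}_{11}\ddot{q}_1 + \bar{h}_1 + \bar{g}_1 = \tau \tag{9}$$

where the terms $\bar{m}_{11}$, $\bar{h}_1$ and $\bar{g}_1$ are defined as

$$\bar{m}_{11} = m_{11} - m_{12}m_{22}^{-1}m_{12}, \quad \bar{h}_1 = h_1 - m_{12}m_{22}^{-1}h_2, \quad \bar{g}_1 = g_1 - m_{12}m_{22}^{-1}g_2.$$

The nonlinear feedback

$$\tau = \bar{m}_{11}v_1 + \bar{h}_1 + \bar{g}_1 \tag{10}$$

defines a feedback linearizing controller for equation (9) with the new input $v_1$.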

By including the feedback linearizing control, the system dynamics may be rewritten as

$$\ddot{q}_1 = v_1 \tag{11}$$

$$m_{22}\ddot{q}_2 + h_2 + g_2 = -m_{12}v_1 \tag{12}$$

Equation (11) is linear from the input $v_1$ to the output $q_1$. Equation (12) represents the internal dynamics of the system.

Definition 1: The system (11)-(12), or equivalently the system (7)-(8), is said to be Strongly Inertially Coupled iff $\operatorname{rank}(m_{12}) = 1$ for all $q$.

This definition is essentially a controllability condition and ensures that the acceleration $v_1$ in (12) may be used as a control input to control the response of $q_2$, according to the method of integrator backstepping [8]. Note that Strong Inertial Coupling requires that the number of active degrees of freedom be at least as great as the number of passive degrees of freedom. Here we have $m_{12} = mlR\cos\theta = 0$ for $\theta = \pi/2 + k\pi$, so the condition holds only away from these configurations. Under this assumption, we may compute the inverse of $m_{12}$.

Define $v_1$ in (12) according to

$$v_1 = -m_{12}^{-1}\left(m_{22}v_2 + h_2 + g_2\right) \tag{13}$$

where $v_2$ is an additional control input to be determined. With this choice for the control input $v_1$ the system becomes

$$m_{12}\ddot{q}_1 + h_2 + g_2 = -m_{22}v_2 \tag{14}$$

$$\ddot{q}_2 = v_2 \tag{15}$$

Thus, we see that the passive degree of freedom $q_2$ ($\theta$) has been linearized and decoupled from the rest of the system, and that equation (14), describing the motion of the active joint of the whirling pendulum, now represents the internal dynamics of the system relative to the output $q_2 = \theta$. The actual control input $\tau$ is given by combining (10) and (13), and after some algebra, as

$$\tau = \tilde{m}_{12}v_2 + \tilde{h}_1 + \tilde{g}_1 \tag{16}$$

where

$$\tilde{m}_{12} = m_{12} - m_{11}m_{12}^{-1}m_{22}, \quad \tilde{h}_1 = h_1 - m_{11}m_{12}^{-1}h_2, \quad \tilde{g}_1 = g_1 - m_{11}m_{12}^{-1}g_2.$$

Using the noncollocated partial feedback linearization developed above, a PD-like controller may be defined for $q_2$ so as to achieve a desired trajectory $q_2^d$:

$$v_2 = \ddot{q}_2^d + k_2(\dot{q}_2^d - \dot{q}_2) + k_1(q_2^d - q_2) \tag{17}$$

or equivalently

$$v_2 = \ddot{\theta}^d + k_2(\dot{\theta}^d - \dot{\theta}) + k_1(\theta^d - \theta) \tag{18}$$
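A quick numerical check of the design is to apply the torque (16), driven by the outer loop (17), to the model (7)-(8) and confirm that the pendulum acceleration equals the commanded $v_2$. The sketch below does this for a constant set-point; the helper names, the value of g and the sample state are assumptions made for the example.

```python
import math

P = dict(M=0.5, m=0.2, l=0.5, R=0.5, I=0.0, g=9.81)  # g = 9.81 is an assumed value

def model_terms(theta, dphi, dtheta, p=P):
    """Entries of M(q), the velocity terms h1, h2 and the gravity terms g1, g2."""
    s, c = math.sin(theta), math.cos(theta)
    m11 = p["M"]*p["R"]**2 + p["m"]*(p["R"]**2 + p["l"]**2*s**2)
    m12 = p["m"]*p["l"]*p["R"]*c
    m22 = p["I"] + p["m"]*p["l"]**2
    h1 = p["m"]*p["l"]**2*math.sin(2*theta)*dphi*dtheta - p["m"]*p["l"]*p["R"]*s*dtheta**2
    h2 = -p["m"]*p["l"]**2*s*c*dphi**2
    g1, g2 = 0.0, -p["m"]*p["g"]*p["l"]*s
    return m11, m12, m22, h1, h2, g1, g2

def control_torque(theta, dphi, dtheta, k1, k2, theta_d=0.0):
    """Torque from (16) with the PD-like outer loop (17), for a constant set-point."""
    m11, m12, m22, h1, h2, g1, g2 = model_terms(theta, dphi, dtheta)
    v2 = k2*(0.0 - dtheta) + k1*(theta_d - theta)
    tau = (m12 - m11*m22/m12)*v2 + (h1 - m11*h2/m12) + (g1 - m11*g2/m12)
    return tau, v2

def closed_loop_ddtheta(theta, dphi, dtheta, tau):
    """Pendulum acceleration obtained by solving (7)-(8) for the applied torque."""
    m11, m12, m22, h1, h2, g1, g2 = model_terms(theta, dphi, dtheta)
    det = m11*m22 - m12**2
    return (m11*(-h2 - g2) - m12*(tau - h1 - g1)) / det
```

The check only holds when the controller uses the same parameters as the plant, which is exactly the assumption that breaks down under parametric uncertainty.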

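Define now

$$\eta_1 = q_2 - q_2^d, \quad \eta_2 = \dot{q}_2 - \dot{q}_2^d, \quad z_1 = q_1, \quad z_2 = \dot{q}_1$$

and the output error $y_2 = q_2 - q_2^d$. The closed-loop system may then be written as

$$\dot{\eta}_1 = \eta_2$$
$$\dot{\eta}_2 = -k_1\eta_1 - k_2\eta_2$$
$$\dot{z}_1 = z_2$$
$$\dot{z}_2 = -m_{12}^{-1}(h_2 + g_2) - m_{12}^{-1}m_{22}\left(\ddot{q}_2^d - k_1\eta_1 - k_2\eta_2\right)$$
$$y_2 = \eta_1$$

In matrix form we can write this as

$$\dot{\eta} = A\eta, \quad \dot{z} = s(\eta, z, t), \quad y_2 = C\eta \tag{19}$$

where

$$A = \begin{bmatrix} 0 & 1 \\ -k_1 & -k_2 \end{bmatrix}; \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}$$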


and the function

$$s(\eta, z, t) = \begin{bmatrix} z_2 \\ -m_{12}^{-1}(h_2 + g_2) - m_{12}^{-1}m_{22}\left(\ddot{q}_2^d - k_1\eta_1 - k_2\eta_2\right) \end{bmatrix} \tag{20}$$

We see that the surface $\eta = 0$ in the state space defines a globally attractive integral manifold for the system and that the expression

$$\dot{z} = s(0, z, t) \tag{21}$$

defines the zero dynamics relative to the output $y_2 = \eta_1$.

Theorem 1: Consider the system (19) and suppose $s(0, z_0, t) = 0$ for $t \ge 0$, i.e. $(0, z_0)$ is an equilibrium of the full system (19) and $z_0$ is an equilibrium of the zero dynamics (21). If $A$ is Hurwitz and if the equilibrium $z_0$ of the zero dynamics (21) is locally (locally asymptotically) stable, then the full system (19) is locally (locally asymptotically) stable.
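The Hurwitz condition on $A$ in Theorem 1 is easy to test for a given pair of gains, because the characteristic polynomial of $A$ is $\lambda^2 + k_2\lambda + k_1$. The following sketch computes its roots directly; the function names are chosen for this example.

```python
import cmath

def eigenvalues_A(k1, k2):
    """Roots of lambda^2 + k2*lambda + k1 = 0, the characteristic polynomial of A."""
    disc = cmath.sqrt(k2**2 - 4*k1)
    return (-k2 + disc) / 2, (-k2 - disc) / 2

def is_hurwitz(k1, k2):
    """A is Hurwitz when both eigenvalues have strictly negative real part."""
    return all(lam.real < 0 for lam in eigenvalues_A(k1, k2))
```

For a second-order polynomial this is equivalent to requiring $k_1 > 0$ and $k_2 > 0$, so any gain interval searched by an optimizer should stay in the positive quadrant.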

B. Simulation Results

Simulations were performed on a prototype with the following parameters: M = 0.5 kg, m = 0.2 kg, l = R = 0.5 m, and I = 0.

The results for the heuristic nonlinear feedback controller are shown in Fig. 3. The final values of the gains, obtained via heuristic tuning, were k1 = 10 and k2 = 15. The convergence of the position of the arm cannot be determined a priori, since the equilibrium manifold of the system is defined by $(\phi, 0, 0, 0)$.

Figure 4 shows the link displacements for the two-link simulation under control law (16) when a parametric uncertainty is considered in the dynamics of the system. A 20% mass uncertainty was considered for the two links. One can notice that the control (16) fails to maintain the stabilization of the uncertain system.

IV. GENETIC ALGORITHM BASED CONTROLLERS

As might be expected, the partial feedback linearization control discussed in the last section failed to stabilize the system when a small change in system parameters was considered. Intuitively, an adaptive control technique that tunes the controller parameters may help to improve the balancing response and eliminate the instability observed for the perturbed system.

The GA is an optimization routine based on the principles of Darwinian theory and natural genetics. It has been primarily used as an off-line technique to perform a directed search for the optimal solution to a problem. One of the sought-after goals in control engineering is to make systems adaptive to changes in the environment. Adaptive controllers have been successful and are available on the market. The role of the genetic algorithm could then be to provide a further tool and complement the existing arsenal of algorithms. Also, the structure of the GA could probably be used as a background to real-time control of hybrid systems; that is, one single best adaptive controller does its job on a particular plant while a whole population of adaptive algorithms evolves in the background. Here we show that the GA can be used on-line in a controller design to adaptively search through a population of controllers and determine the member most fit to be implemented over a given model.

Fig. 3. Stabilization of $\phi$ and $\theta$ with control law (16): (a) phi, (b) theta; position (rad) versus time (sec).

This section starts with a brief introduction to the GA; next we discuss how to implement the GA-based control algorithm, and the results obtained.

A. The Genetic Algorithm (GA)

The GA performs a parallel, directed, random search for the fittest element of a population within a search space. The population simply consists of strings of numbers, called chromosomes, that hold possible solutions of a problem. The members of a population are manipulated cyclically through three primary genetic operators (selection, mutation and replacement) to produce a new generation (population) that tends to have a higher overall fitness evaluation. By creating successive generations which continue to evolve, the GA will tend to search for a global optimal solution of an objective function f. Figure 5 shows a general scheme of a Genetic Algorithm.

Fig. 4. Responses of $\theta$ (theta) and $\phi$ (phi) with control for the uncertain system: (a) phi, (b) theta; position (rad) versus time (sec).

The key to the search is the fitness evaluation. The fitness of each of the members of the population is calculated using a fitness function that characterizes how well each particular member solves the given problem. Parents for the next generation are selected based on the fitness value of strings. That is, strings that possess a higher fitness value are more likely to be selected as parents, and thus are more likely to survive to the next generation. The first stage is the selection stage. Although there are many techniques for the selection of parents, a commonly used method is Roulette Wheel selection [11], a proportionate selection scheme which bases the number of offspring on the average fitness of the population. Strings with greater than average fitness are allocated more than one offspring.
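Roulette Wheel selection can be sketched in a few lines: each individual occupies a slice of the wheel proportional to its fitness, and a uniform random draw picks the slice. The function below is a minimal illustration that assumes non-negative fitness values; its name and signature are chosen for this example.

```python
import random

def roulette_select(fitness, rng=random):
    """Return the index of one individual, chosen with probability f_i / sum(f)."""
    total = sum(fitness)
    r = rng.random() * total  # position of the pointer on the wheel
    cumulative = 0.0
    for i, f in enumerate(fitness):
        cumulative += f
        if r < cumulative:
            return i
    return len(fitness) - 1  # guard against floating-point round-off
```

Calling the function repeatedly with the same fitness list allows reselection, so a fit string can be chosen as a parent more than once.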

For the evolutionary process, the selected members of a population are randomly paired together to mate and share genetic information. The GA accomplishes this sharing by swapping portions of strings between parents. The simplest model is the single-point crossover, where the selection of a random position between 1 and $l - 1$, where $l$ is the length of the chromosome, indicates which portions are interchanged between parents. Multiple crossover points can also be selected, where strings swap several portions of their strings. The resulting interchange produces two new strings. The actual interchange is often based on a crossover probability $p_c$. The sharing of genetic material occurs only if a randomly generated normalized number is less than $p_c$; otherwise, the strings are not affected. Once this crossover stage is complete, the new strings are subject to mutation based on the mutation probability $p_m$. This genetic operation randomly selects a string and a position (between 1 and $l - 1$) and changes the value at that position to a random number. Since the variations due to the mutation operation occur randomly, this tends to keep the GA from focusing on a local minimum.

Fig. 5. General scheme of a Genetic Algorithm.
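The two operators described above can be sketched as follows for chromosomes stored as Python lists. This is an illustration under stated assumptions: it uses the common convention that an operator fires when a uniform random draw falls below its probability, it mutates at most one position per string, and the function names and the default binary alphabet are chosen for the example.

```python
import random

def single_point_crossover(a, b, pc, rng=random):
    """Swap the tails of two equal-length parents with probability pc."""
    if len(a) > 1 and rng.random() < pc:
        cut = rng.randint(1, len(a) - 1)  # cut position between 1 and l-1
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]  # parents pass through unchanged

def mutate(chrom, pm, rng=random, alphabet=(0, 1)):
    """With probability pm, overwrite one randomly chosen position with a random symbol."""
    child = chrom[:]
    if rng.random() < pm:
        child[rng.randrange(len(child))] = rng.choice(alphabet)
    return child
```

Both functions return new lists, so the parent population can be kept intact until the replacement step decides what survives.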

The final step in the GA is the replacement operation, which determines how the new subpopulation produced by the genetic operations will be introduced as a new generation. Two methods exist: complete generation replacement, where each population produces an equal number of strings, which then completely replace the parent population, and partial replacement, where only a small portion of new strings is developed and introduced into the population. The former procedure may have a tendency to throw away a best-fit solution, since the entire generation is replaced; for this reason, the complete generation replacement method is often combined with an elitist strategy, where one or more of the most fit members of a previous population are passed, untouched, to the next generation. This tends to ensure that there is always a string within the population that is a good solution for the problem. The string-swapping operation described earlier is called crossover (Fig. 6).

Fig. 6. Crossover operator.

The genetic algorithm procedure is illustrated in Fig. 7. We start with a randomly generated (or specifically chosen) initial population of members and perform the three operations above based on the user-specified probabilities of occurrence. At the conclusion of this we have automatically produced the next generation. We then repeat the operators on each consecutive generation until an acceptable result occurs.

A key part of the success of the GA is the encoding of the strings. Obviously we need to choose parameters that carry reasonable information about the problem. The actual number and type of variables that will be encoded as strings is application specific. A common encoding of multiple real-valued continuous variables consists of placing them in integer form and concatenating them, keeping track of decimal places and variable separation points for decoding. Thus the two real numbers 12.345 and 9.8765 could be represented by the 10-digit string 1234598765. Decoding can occur by first splitting the string at the appropriate point (in this case, between the fifth and sixth digits) and keeping track of the appropriate decimal places. In this case, the actual values are recovered by multiplying the integer given by the first portion of the string by $10^{-3}$ and the integer in the second portion of the string by $10^{-4}$. This is representative of the base-10 encoding of strings; a base-2 encoding is also popular, where strings are represented in binary form as a series of ones and zeros. Regardless of the base of the encoding, the procedure is the same. The strings are decoded for fitness evaluation, but remain in encoded form for genetic manipulation, producing successive generations until a user-specified termination criterion is reached.

In summary, a GA procedure consists of the following steps.

1) Randomly create an initial population of individuals (chromosomes).

2) Iteratively perform the sub-steps below (one generation) on the population until the termination criterion is satisfied.

   a) Assign a fitness value $f_i$ to each individual of the population using the fitness measure.

   b) Select one or two individuals from the population probabilistically, based on fitness, to participate in the genetic operations described next. Reselection is allowed. The selection probability is

      $$P_s = \frac{f_i}{\sum_{j=1}^{N} f_j} \tag{22}$$

   c) Create new individuals from the old using one of the genetic operations, each of which is applied with a specific probability. The operations are:

      - Reproduction: Copy the selected individual into the population.
      - Crossover: Create new offspring by combining randomly chosen parts from the two selected parents.
      - Mutation: Create new offspring by randomly mutating a randomly chosen part of one selected chromosome.

3) After the termination criterion is met, the run is stopped. This may occur when the maximum number of generations is reached, or when a solution is found. Usually the best individual (best-so-far) is returned as the result of the run. This may or may not be a satisfactory (or an approximate) solution to the problem.
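The steps above can be assembled into a compact loop. The sketch below is a toy illustration, not the authors' implementation: it assumes real-coded chromosomes (lists of floats) rather than digit strings, a non-negative cost mapped to fitness as 1/(1 + cost), fitness-proportionate selection via `random.Random.choices`, and a single elite member copied into each new generation. All names and default values are chosen for the example.

```python
import random

def run_ga(cost, n_genes, bounds, pop_size=20, generations=30, pc=0.8, pm=0.1, seed=0):
    """Minimise a non-negative cost; returns (best individual, best cost per generation)."""
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        costs = [cost(ind) for ind in pop]
        fitness = [1.0 / (1.0 + c) for c in costs]      # higher fitness is better
        best = pop[costs.index(min(costs))]
        history.append(min(costs))
        new_pop = [best[:]]                             # elitism: keep the best untouched
        while len(new_pop) < pop_size:
            a, b = rng.choices(pop, weights=fitness, k=2)  # selection, reselection allowed
            child = a[:]                                # reproduction by default
            if n_genes > 1 and rng.random() < pc:       # single-point crossover
                cut = rng.randint(1, n_genes - 1)
                child = a[:cut] + b[cut:]
            if rng.random() < pm:                       # mutation of one random gene
                child[rng.randrange(n_genes)] = rng.uniform(low, high)
            new_pop.append(child)
        pop = new_pop                                   # complete generation replacement
    costs = [cost(ind) for ind in pop]
    history.append(min(costs))
    return pop[costs.index(min(costs))], history
```

Because the elite member is always carried over, the best cost recorded in `history` can never increase from one generation to the next, which is the practical benefit of the elitist strategy.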

If we encode the controller gains $(k_1, k_2)$ in (17) as the strings of the population and base the fitness value of each member on the performance, on a model of the system, of the particular controller that the string represents, then we can use the GA to find an optimum controller. Furthermore, if we can implement the GA on-line, and use it to find the best controller at every sampling period or every few sampling periods, then we can use the GA to produce an adaptive control routine that will try to find the optimum controller to use under different conditions.

B. On-line Control Using Genetic Algorithms

Development of a control algorithm based on the GA involves searching for an optimal controller during each (or every several) sampling periods. The fitness evaluation in this case consists of characterizing the expected performance of each controller in the population based on error analysis. This type of evaluation requires both a model of the plant for prediction purposes and the development of a reference model for comparison. The nonlinear model developed in Sec. II is used to predict how well the controller will perform several time steps into the future. The reference model characterizes the desired response of the system, which is defined here by

$$q_2^d = 0 \tag{23}$$

Consider the error as the difference between the actual measurement of the angle $q_2$ and the desired angle $q_2^d$

$$e = q_2 - q_2^d \tag{24}$$


and the performance index to be minimized:

$$\int_{t_0}^{t_f} e^2\,dt \tag{25}$$

under the constraint:

$$|e| \le e_{\min} = 10^{-3} \tag{26}$$

One may arrange (25) and (26) into a single function by defining the whole criterion function to minimize as:

$$\min J = \int_{t_0}^{t_f} e^2\,dt + \max(e_i)\Big|_{t_0}^{t_1} \tag{27}$$

Our aim is to design a genetic algorithm procedure which, given the objective function (27) to be minimized, returns at each sample the optimal gains $k_1$, $k_2$ which, introduced into the control law (16), lead to swift exponential stability of the pendulum angle $q_2$.
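One way to evaluate a candidate pair of gains is to simulate the model forward over a short horizon and accumulate the criterion (27). The sketch below is a rough illustration under several assumptions: explicit Euler integration, a rectangle-rule approximation of the integral, the largest absolute error as the second term of (27), the nominal parameter values with g = 9.81, and function names, horizon and step size chosen for the example.

```python
import math

P = dict(M=0.5, m=0.2, l=0.5, R=0.5, I=0.0, g=9.81)  # g = 9.81 is an assumed value

def step(state, k1, k2, dt, p=P):
    """One explicit-Euler step of the model under control law (16)-(17) with q2d = 0."""
    phi, theta, dphi, dtheta = state
    s, c = math.sin(theta), math.cos(theta)
    m11 = p["M"]*p["R"]**2 + p["m"]*(p["R"]**2 + p["l"]**2*s**2)
    m12 = p["m"]*p["l"]*p["R"]*c
    m22 = p["I"] + p["m"]*p["l"]**2
    h1 = p["m"]*p["l"]**2*math.sin(2*theta)*dphi*dtheta - p["m"]*p["l"]*p["R"]*s*dtheta**2
    h2 = -p["m"]*p["l"]**2*s*c*dphi**2
    g2 = -p["m"]*p["g"]*p["l"]*s                      # g1 is zero for this system
    v2 = -k1*theta - k2*dtheta                        # outer loop (17) with q2d = 0
    tau = (m12 - m11*m22/m12)*v2 + (h1 - m11*h2/m12) - m11*g2/m12   # control law (16)
    det = m11*m22 - m12**2
    r1, r2 = tau - h1, -h2 - g2
    ddphi = (m22*r1 - m12*r2) / det
    ddtheta = (m11*r2 - m12*r1) / det
    return (phi + dt*dphi, theta + dt*dtheta, dphi + dt*ddphi, dtheta + dt*ddtheta)

def predicted_cost(state, k1, k2, horizon=1.0, dt=0.001):
    """Approximate J in (27): integral of e^2 plus the largest |e| over the horizon."""
    integral, worst = 0.0, 0.0
    for _ in range(int(round(horizon / dt))):
        e = state[1]                                  # e = q2 - q2d with q2d = 0
        integral += e * e * dt
        worst = max(worst, abs(e))
        state = step(state, k1, k2, dt)
    return integral + worst
```

In an on-line setting the prediction model would use the current parameter estimates rather than fixed nominal values, and the returned cost would be converted to a fitness (for example 1/(1 + J)) before selection.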

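The two strings representing the gains $k_1$ and $k_2$ are of length $2m$ and are decoded as:

$$k_i = k_{i(\mathrm{low})} + \frac{\sum_{j=0}^{2m-1} \mathrm{bit}_j\, 2^j}{2^n - 1}\left(k_{i(\mathrm{high})} - k_{i(\mathrm{low})}\right), \quad i = 1, 2. \tag{28}$$

where $k_{i(\mathrm{low})}$ and $k_{i(\mathrm{high})}$ denote, respectively, the lower and upper bounds of the interval in which the parameter $k_i$, $i = 1, 2$, varies; they remain to be specified.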
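The binary decoding in (28) maps a bit string linearly onto the gain interval. The sketch below illustrates it under one assumption that the text does not state explicitly: the normalising exponent $n$ is taken to be the length of the bit string, so that the all-ones string maps exactly to the upper bound. The function name is chosen for the example.

```python
def decode_gain(bits, low, high):
    """Map a list of bits (least significant first) to a gain in [low, high] as in (28)."""
    n = len(bits)  # assumption: n in (28) equals the string length
    value = sum(bit << j for j, bit in enumerate(bits))  # sum of bit_j * 2^j
    return low + value / (2**n - 1) * (high - low)
```

With an 8-bit string and the interval 8 to 13 quoted for $k_1$, the resolution of the search is (13 - 8)/255, roughly 0.02 per step.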

C. Simulation Results

Results of the simulation of an adaptive GA on the whirling pendulum are shown in Fig. 8. The upper and lower bounds for the feedback gains are defined as: 8 < k1 < 13 and 12
