quasioptimal control search algorithms using state-space partitioning



U.S.S.R. Comput. Maths. Math. Phys., Vol. 30, No. 5, pp. 1-8, 1990. 0041-5553/90 $10.00+0.00. Printed in Great Britain. ©1991 Pergamon Press plc

QUASIOPTIMAL CONTROL SEARCH ALGORITHMS USING STATE-SPACE PARTITIONING*

E.N. OREL

A depth-first algorithm (DEPTH) and a width-first algorithm (WIDTH) for solving optimal control problems are considered.

In [1] we considered the construction of quasioptimal control by a depth-first method with backtracking and state-space partitioning into a finite number of classes [2]. The method was applied in [1] to the problem of driving an automobile to a parking place [3] when there are obstacles around. It was suggested in [1] that this problem should be used as a test case for optimal control algorithms, and the following question was posed: are there other algorithms capable of solving this problem with sufficient efficiency?

This paper provides a positive answer to this question. We perform a theoretical analysis of two algorithms, DEPTH and WIDTH, and compare the solutions of the test problem produced by these algorithms. The algorithm DEPTH is studied in greater detail than in [1, 4]. The algorithm WIDTH is published here for the first time. It is a generalization of Dijkstra's well-known (equal-value) algorithm [5, 6] for solving the shortest path problem on a finite graph.

Depth-first algorithms [5-7] move to a maximum distance along a selected trajectory and, if unsuccessful, backtrack a certain number of steps. Width-first algorithms, on the other hand, generate the tree of paths from a given starting point. The algorithms of these two classes obviously have different computer time and memory requirements. Although the algorithms DEPTH and WIDTH considered in this paper are fundamentally different, they share a common idea: both partition the state space into a finite number of sets.

1. Statement of the problem

Consider the graph $(X, Q)$ in which the vertex set $X$ is partitioned into classes $Y_0, \ldots, Y_K$ ($K$ is a natural number) and each arc $(x, y) \in Q$, $x, y \in X$, is of length $l(x, y) > 0$. For a path

$$\pi = (x_0, \ldots, x_n)$$

through the graph, the length is defined in the usual way as

$$L(\pi) = \sum_{i=0}^{n-1} l(x_i, x_{i+1}). \qquad (1.1)$$

We introduce the following notation: $x^* = k$ for $x \in Y_k$, and $x \sim y$ for $x^* = y^*$. The set $Y_0$ is called terminal.

Let us define a many-move two-person game [2]. The players make moves one after the other. The states of the game before the "white" player makes a move are the class indices $k$, $1 \le k \le K$, and the states of the game before the "black" player makes a move are the elements $x \in X$.

The "white" player making a move in state $k$ selects any $x \in Y_k$, which becomes the state before the "black" player's countermove.

The "black" player making a move in state $x \in X \setminus Y_0$ selects an arc $(x, y) \in Q$ and pays a penalty $l(x, y)$, after which the game goes to the state $k = y^*$. The game ends normally if it reaches the terminal class after a "black" move ($y^* = 0$).

Abnormal termination occurs when the "white" player reaches a deadlock $x$ (the set $\{y \in X : (x, y) \in Q\}$ is empty). In this case, the black player pays an infinitely high penalty.

We assume that the black player attempts to minimize the total penalty while the white player attempts to maximize it. This is a positional game with complete information and it has a saddle point in the class of pure strategies [8, 13]. We denote by $\varphi(k)$ the value of the game, i.e. the total penalty of the black player under optimal strategies of the players when the white player makes the first move in "state" $k$ ($\varphi(0) = 0$).

It is easy to show [2] that $\varphi$ satisfies the equation

$$\varphi(k) = \max\bigl[\,\min[\,l(x, y) + \varphi(y^*) \mid (x, y) \in Q\,] \mid x \in Y_k\,\bigr]. \qquad (1.2)$$
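Equation (1.2) can be checked numerically on a small finite example. The following Python sketch is illustrative only and is not taken from the paper: the function name `game_value`, the dictionary encoding of the graph, and the toy data are all assumptions. It repeatedly applies the right-hand side of (1.2), starting from infinite values with the terminal class fixed at 0, and treats a deadlock as an infinite penalty. On a finite graph with positive arc lengths this iteration is expected to settle after at most K + 1 sweeps.

```python
import math

def game_value(classes, arcs):
    """Iterate equation (1.2) on a finite graph (illustrative sketch).

    classes: dict class index -> non-empty list of vertices (class 0 is terminal)
    arcs:    dict vertex -> list of (successor, length) pairs
    """
    cls = {x: k for k, members in classes.items() for x in members}
    phi = {k: math.inf for k in classes}
    phi[0] = 0.0
    for _ in range(len(classes)):          # at most K + 1 sweeps
        new = dict(phi)
        for k, members in classes.items():
            if k == 0:
                continue
            # white maximizes over x in Y_k, black minimizes over arcs from x;
            # a vertex with no outgoing arcs (deadlock) yields infinity
            new[k] = max(
                min((l + phi[cls[y]] for y, l in arcs.get(x, [])),
                    default=math.inf)
                for x in members
            )
        if new == phi:
            break
        phi = new
    return phi

# Hypothetical toy data: Y_0 = {t}, Y_1 = {a, b}, Y_2 = {c}
classes = {0: ["t"], 1: ["a", "b"], 2: ["c"]}
arcs = {"a": [("t", 3.0), ("c", 1.0)], "b": [("c", 4.0)], "c": [("t", 1.0)]}
phi = game_value(classes, arcs)
```

In this toy example white's best choice in class 1 is the vertex `b`, from which black has to pay 4 + 1, so the value of class 1 is 5 even though the vertex `a` can reach the terminal class at cost 2.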

Definition 1. The trajectory (1.1) terminating in $Y_0$ is called (weak-sense) quasioptimal if

$$L(\pi) \le \varphi(s^*), \qquad (1.3)$$

where $s$ is the initial state of the trajectory ($s = x_0$).

*Zh. vychisl. Mat. mat. Fiz., 30, 9, 1283-1293, 1990


Statement of the problem. Find an algorithm that constructs a quasioptimal trajectory for any given initial state $s \in X \setminus Y_0$ using a memory space of the order of $O(K)$.

This problem is a restatement of optimal control problems [1, 2, 9], where $X$ is the state space and $Y_0$ is the terminal set. The arcs of the graph are interpreted as elementary trajectories forming the set of paths. The notation $(x, y)$ for the arcs does not restrict the generality of analysis and merely simplifies the mathematics.

Let $\omega$ be the Bellman function (the value of $\omega(x)$ for $x \in X$ is the length of the shortest path from $x$ to $Y_0$). Since the set $X$ is of arbitrary cardinality, the construction of the Bellman function or the optimal trajectory is a computationally unsolvable problem in general. We will focus on the construction of a quasioptimal trajectory. It is easy to show that the Bellman function and the value of the game satisfy the relationship

$$0 \le \omega(x) \le \varphi(x^*) \quad \forall x \in X.$$

As the partition of the set $X \setminus Y_0$ into classes is successively refined ($K \to \infty$), the value of the game $\varphi(s^*)$ tends to the Bellman function $\omega(s)$ under relatively unrestrictive topological assumptions (see [2]). This justifies the definition of quasioptimal trajectory and the problem formulation introduced above.

2. Restriction of the search domain

In general, it is impossible to examine all the trajectories, and the search domain therefore should be restricted. There are two obvious ways to restrict the search domain, both of which will be implemented below.

First, in the game considered above, which limits the length of the path required, it is not to black's advantage to visit the same class $Y_k$ twice. Hence we conclude that the solution should be sought among trajectories that never visit any class $Y_k$ more than once.

Second, if the target class has been identified, then the shortest arc should be chosen among all the arcs that lead to this class.

These conclusions are formalized as follows. For arbitrary $x \in X \setminus Y_0$ and $k$, $1 \le k \le K$, consider the set $D(x, k) = \{y \in Y_k : (x, y) \in Q\}$. To the couple $(x, k)$ such that $D(x, k) \ne \emptyset$ associate some state $\Gamma(x, k) \in D(x, k)$ that minimizes the length of the elementary trajectory:

$$l(x, \Gamma(x, k)) = \min\{\,l(x, y) : y \in D(x, k)\,\}$$

(we assume that the minimum is reached; otherwise an arbitrary $\varepsilon > 0$ is added to the right-hand side of formula (1.3) and the algorithms are somewhat modified). Put

$$\Gamma(x) = \{\,y \in X : y = \Gamma(x, k),\ 0 \le k \le K\,\}.$$
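The construction of the successor set just defined (one shortest arc per target class) can be sketched in a few lines of Python. This is an illustration under assumptions, not the author's code: the name `gamma`, the dictionary encoding of arcs and classes, and the toy data are hypothetical.

```python
def gamma(x, arcs, cls):
    """Return {k: (y, length)}: for every class k reachable from x by one arc,
    the endpoint y of a shortest such arc (ties go to the first arc seen).

    arcs: dict vertex -> list of (successor, length) pairs
    cls:  dict vertex -> class index
    """
    best = {}
    for y, length in arcs.get(x, []):
        k = cls[y]
        if k not in best or length < best[k][1]:
            best[k] = (y, length)
    return best

# Hypothetical toy data: two arcs lead into class 2, one into class 3
arcs = {"x": [("a", 3.0), ("b", 1.0), ("c", 2.0)]}
cls = {"x": 1, "a": 2, "b": 2, "c": 3}
successors = gamma("x", arcs, cls)
```

Here the longer arc into class 2 (to `a`) is discarded, so the vertex `x` keeps at most one daughter vertex per class, which is what bounds the branching of the search by the number of classes.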

Definition 2. The trajectory (1.1) is called a quasicircuit if $x_0 \sim x_n$.

For $s \in X \setminus Y_0$ denote by $\Pi^s$ the set of trajectories (1.1) that start at $s$, do not contain quasicircuits, and satisfy the condition

$$x_{i+1} = \Gamma(x_i, x_{i+1}^*), \quad i = 0, 1, \ldots, n-1.$$

It is easy to show that the set $\Pi^s$ is finite and has at most $K \cdot K!$ elements. Approximately the same bound applies to the set of arcs comprising $\Pi^s$ and to the set $X^s$ of all intermediate vertices in the paths from $\Pi^s$. The set $X^s$ is defined as follows: $x \in X^s$ if and only if there exists a path $\pi \in \Pi^s$ (see (1.1)) such that $x_i = x$ for some $i$, $0 \le i \le n$. Finally, put $Y_k^s = X^s \cap Y_k$ and $\Gamma^s(x) = \Gamma(x) \cap X^s$, $x \in X^s$. The couple $(X^s, \Gamma^s)$ is a finite graph.

Anticipating a later result, we note that the algorithms WIDTH and DEPTH generate only trajectories from the class $\Pi^s$. Therefore $\Gamma(x)$ for $x \in X^s$ is interpreted as the set of daughter vertices of $x$.

Consider a modification of the game of Section 1, in which the only difference is that the white player in state $k$, $1 \le k \le K$, selects points from the set $Y_k^s$. Since the freedom of the white player is now substantially restricted, the odds of the black player substantially improve. This is expressed by the formula

$$\varphi^s(k) \le \varphi(k), \quad 1 \le k \le K,$$

where $\varphi^s$ is the value of the modified game. For the function $\varphi^s$ formula (1.2) takes the form

$$\varphi^s(k) = \max\bigl[\,\min[\,l(x, y) + \varphi^s(y^*) \mid y \in \Gamma(x)\,] \mid x \in Y_k^s\,\bigr]. \qquad (2.1)$$

Note that the points $y \in \Gamma(x)$ selected by the black player do not necessarily belong to $X^s$.

Definition 3. The trajectory (1.1) that starts at $s \in X \setminus Y_0$ and ends in $Y_0$ is called (strong-sense) quasioptimal if

$$L(\pi) \le \varphi^s(s^*).$$

Restriction of the search domain not only enables us to inspect in acceptable time all the trajectories that are candidates for quasioptimality but also ensures (which is equally important) a better approximation to the optimal trajectory than formula (1.3). Indeed, with a coarse partition of the set $X \setminus Y_0$ into classes, the function $\varphi$ may be substantially greater than Bellman's function $\omega$ because the white player has an excessive freedom of choice. In this case, the (weak-sense) quasioptimal trajectory may turn out to be much longer than the optimal trajectory.

At the same time, the sets $Y_k^s$ are much smaller than $Y_k$ (in particular, for $k = s^*$, $Y_k^s$ consists of precisely one element $s$). This suggests that a strong-sense quasioptimal trajectory may be sufficiently close (by functional value) to the optimum. The test example considered in Section 5 indicates that this suggestion is not entirely unfounded.

Before presenting a description of the algorithms, let us note some of their main features. Both algorithms construct trajectories that are strong-sense quasioptimal, but they require different data structures. The procedure DEPTH saves only one trajectory, which may be extended or reduced at the right end. Such a trajectory is conveniently stored in a stack. The procedure WIDTH generates the tree of paths originating from $s$. The tree nodes are the elements of the set $X^s$, and the tree contains at most one representative from each class $Y_k^s$. All tree nodes are assigned to two lists. The tree leaves are stored in the list OPEN and the remaining nodes are stored in the list CLOSED.

The algorithms are described in a free version of an incompletely formalized program design language PDL (see [7]). Comments are enclosed in braces, as in Pascal.

3. Depth-first algorithm

The algorithm generates the trajectory

$$\pi = (s, x_1, \ldots, x_n), \quad n \ge 0, \quad \pi \in \Pi^s, \qquad (3.1)$$

and the nonnegative function $f(k)$, such that $f \le \varphi^s$. Initially, $f$ may be defined arbitrarily, as long as it satisfies the condition

$$0 \le f(k) \le \varphi^s(k), \quad 0 \le k \le K \qquad (3.2)$$

(for instance, we may take $f \equiv 0$). During the execution of the algorithm, the values of the function $f$ may only increase. The algorithm generates a path which is stored in a stack; the number of stack elements may not only increase but also decrease. Removal of an element from the stack automatically reduces $n$ by one, while the addition of an element to the stack increases $n$ by one.

The last ($n$-th) element in the stack is the watchman and it is denoted by $x_n$. Since $\varphi^s(s^*)$ is unknown, we specify a sufficiently large number $M$. If the length of the trajectory exceeds $M$, then the search is terminated and the problem is declared to be unsolvable.

PROCEDURE DEPTH
    {Push s in stack:}
    n := 0; x_0 := s;
    WHILE x_n ∉ Y_0 DO
        Select a vertex y ∈ Γ(x_n) that minimizes p = l(x_n, y) + f(y*);
        IF p ≤ f(x_n*) THEN {push y in stack:}
            n := n + 1; x_n := y
        ELSE {update f(x_n*):}
            f(x_n*) := p;
            IF n = 0 THEN {no backtracking:}
                IF f(s*) > M THEN {trajectory longer than M:}
                    Print message "Problem unsolvable";
                    STOP
                ELSE {make the first move:}
                    n := 1; x_n := y
                ENDIF {f(s*) > M}
            ELSE {delete last element from stack:}
                n := n - 1
            ENDIF {n = 0}
        ENDIF {p ≤ f(x_n*)}
    ENDWHILE;
    Print stack with the sought trajectory
END PROCEDURE {DEPTH}.
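The procedure DEPTH can be transcribed almost line by line into Python. The sketch below is a hedged illustration rather than the author's Pascal program: the function name `depth`, the dictionary-based graph, and the toy data are assumptions. It minimizes over all outgoing arcs instead of a precomputed successor set, which gives the same minimum of p because the shortest arc into each class always wins, and it returns `None` where the procedure prints "Problem unsolvable".

```python
import math

def depth(s, arcs, cls, M, f=None):
    """Depth-first search with backtracking and class estimates f (sketch).

    arcs: dict vertex -> list of (successor, length) pairs
    cls:  dict vertex -> class index (class 0 is terminal)
    M:    bound on the trajectory length
    f:    optional initial lower estimates per class (missing entries count as 0)
    """
    f = dict(f or {})
    stack = [s]
    while cls[stack[-1]] != 0:
        x = stack[-1]
        # select y minimizing p = l(x, y) + f(y*)
        p, y = math.inf, None
        for cand, length in arcs.get(x, []):
            q = length + f.get(cls[cand], 0.0)
            if q < p:
                p, y = q, cand
        if y is not None and p <= f.get(cls[x], 0.0):
            stack.append(y)                 # forward move
        else:
            f[cls[x]] = p                   # raise the estimate of the class of x
            if len(stack) == 1:             # no backtracking possible at s
                if p > M:
                    return None             # problem declared unsolvable
                stack.append(y)             # make the first move
            else:
                stack.pop()                 # backtrack one step
    return stack

# Hypothetical toy graph: b is a dead end, the only way out is a -> c -> t
arcs = {"a": [("b", 1.0), ("c", 2.0)], "c": [("t", 1.0)]}
cls = {"a": 1, "b": 2, "c": 3, "t": 0}
path = depth("a", arcs, cls, M=100.0)
```

On this toy graph the search first tries the cheaper arc to `b`, backtracks after raising the estimate of class 2 to infinity, and then settles on `a`, `c`, `t`; with a bound M below 3 the same call gives up instead.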


Theorem 1. If $\varphi^s(s^*) < M$ and inequality (3.2) is initially satisfied, then the procedure DEPTH will construct in a finite number of steps a (strong-sense) quasioptimal trajectory that starts at $s$.

Proof. The procedure makes forward moves only when $f(x_n^*) \ge p = l(x_n, y) + f(y^*)$. Therefore, along the entire trajectory (3.1) stored in the stack we have

$$f(x_i^*) \ge l(x_i, x_{i+1}) + f(x_{i+1}^*), \quad i = 0, 1, \ldots, n-1.$$

Adding these inequalities term by term from $i = m$ to $i = m + j - 1$, $j > 0$, we obtain

$$f(x_m^*) \ge \sum_{i=m}^{m+j-1} l(x_i, x_{i+1}) + f(x_{m+j}^*), \qquad (3.3)$$

whence, since all arc lengths are positive,

$$f(x_m^*) > f(x_{m+j}^*), \quad \text{so that} \quad x_m^* \ne x_{m+j}^*. \qquad (3.4)$$

Inequality (3.4) shows that $\pi \in \Pi^s$. Moreover, the function $f$ is strictly decreasing along the trajectory $\pi$. Setting $m = 0$ and $j = n$ in (3.3), we obtain

$$f(s^*) \ge L(\pi) + f(x_n^*). \qquad (3.5)$$

We will show that the inequality (3.2), which holds initially, remains true throughout the execution of the procedure. Indeed, assume that (3.2) holds prior to some iteration of the WHILE loop. In this iteration, $f(x_n^*)$ increases only if

$$f(x_n^*) < p = l(x_n, y) + f(y^*) = \min[\,l(x_n, z) + f(z^*) \mid z \in \Gamma(x_n)\,].$$

By the induction hypothesis, $f(z^*) \le \varphi^s(z^*)$. From (2.1) it follows that $p \le \varphi^s(x_n^*)$. The inequality $f(x_n^*) \le \varphi^s(x_n^*)$ is preserved after the assignment $f(x_n^*) := p$ in the procedure, which it was required to prove.

Suppose the procedure DEPTH terminates normally at a certain instant, i.e., for $x_n \in Y_0$. From (3.5) we obtain

$$L(\pi) \le f(s^*) \le \varphi^s(s^*),$$

i.e., the trajectory $\pi$ is quasioptimal. It remains to show that the procedure eventually stops. Assume that this is not so.

Since $n \le K + 1$, the procedure backtracks infinitely often. Each such backward move increases the value of the function $f$. Let $Z = \{y_1, \ldots, y_m\}$ be the set of states from which the backward moves are made an infinite number of times. Let $N_0$ be the index of the WHILE loop iteration starting with which backward moves are made only from these points. While increasing, $f(k)$ never exceeds $f(s^*)$ and thus does not exceed $\min(M, \varphi^s(s^*))$. Therefore, the function values $f(y_i^*)$ have finite limits $a_1, \ldots, a_m$. Let $y_j$ be the vertex of the set $Z$ on which the limit $a_j$ is minimal. Let $N_1$ ($> N_0$) be the iteration index starting with which

$$f(y_i^*) > a_i - \varepsilon/2, \quad i = 1, 2, \ldots, m,$$

where $\varepsilon$ is the minimal arc length in the graph $(X^s, \Gamma^s)$. Let $N_2$ and $N_3$ be two successive iterations ($N_3 > N_2 > N_1 > N_0$) on which $f(y_j^*)$ increased. For the function to increase on the move $N_3$, it is necessary that for some $N_4$ ($N_3 > N_4 > N_2$) the value $f(k)$ increased for some $k \ne y_j^*$. In this case, on iteration $N_3$ we have

$$p = l(y_j, \Gamma(y_j, k)) + f(k) \ge \varepsilon + a_j - \varepsilon/2 = a_j + \varepsilon/2,$$

whence $f(y_j^*) > a_j$. The contradiction proves the theorem.

Note that the algorithm will run somewhat faster if $y$ is sought in the set F(m), and not in $\Gamma^s(x_n)$. To this end, it suffices to check whether the stack contains a state equivalent to $y$. This check, however, increases the memory requirements (although very slightly).

4. Width-first algorithm

Depth-first search generates a function $f(k)$ which is interpreted as the distance of the class $Y_k$ from the terminal set $Y_0$. The procedure WIDTH described below uses a different function, $h(k)$, which should be interpreted as the distance from the source $s$ to the set $Y_k$. The algorithms also differ by the fact that the function $f$ increases over time while the function $h$ decreases. Initially we set $h(k) := \infty$, $k \ne s^*$, i.e., all distances are assumed to be infinite. This initial assignment is typical of width-first algorithms: it implies that the classes $Y_k$ are as if unreachable from the vertex $s$.

As soon as the existence of a path $\pi$ from $s$ to $Y_k$ has been established, the value of $h(k)$ is set equal to $L(\pi)$ and the final vertex $y \in Y_k$ of the path $\pi$ is entered in the list OPEN.

At the time when the path $\pi$ from $s$ to $y \in Y_k$ is discovered, the OPEN list may already contain some vertex $z \sim y$. If $L(\pi)$ is less than the path length from $s$ to $z$ (which obviously equals $h(k)$), this means that the vertex $y$ is more promising than $z$. In this case, $y$ is entered in the OPEN list overwriting $z$ and $h(k)$ is set equal to $L(\pi)$.

The tree is expanded by opening its leaves. The watchman vertex $x$ with the minimum $h$ is selected from the OPEN list. Then for each vertex $y \in \Gamma(x)$ we investigate whether or not it should be included in the OPEN list by the technique described above. Then $x$ is moved to the CLOSED list. The problem is solved as soon as we discover that the watchman vertex belongs to the class $Y_0$.

PROCEDURE WIDTH
    h := ∞;
    Put s in OPEN list;
    h(s*) := 0; x := s;
    WHILE x ∉ Y_0 DO {open the vertex x:}
        FOR each vertex y ∈ Γ(x) without equivalent states in the CLOSED list DO
            p := h(x*) + l(x, y);
            IF p < h(y*) THEN {reduce h(y*) and update the OPEN list:}
                IF h(y*) < ∞
                THEN {the OPEN list contains a vertex z ~ y}
                    Remove the vertex z ~ y from the OPEN list
                ENDIF {h(y*) < ∞};
                h(y*) := p; Put y in the OPEN list
            ENDIF {p < h(y*)}
        ENDDO {end of enumeration of the vertices y ∈ Γ(x)};
        Move x from the OPEN list to the CLOSED list;
        IF OPEN list is empty THEN {the tree is dead:}
            Print message "Problem unsolvable";
            STOP
        ELSE {select another watchman vertex:}
            x := argmin[h(y*) | y belongs to the OPEN list]
        ENDIF {the OPEN list is empty}
    ENDWHILE;
    Backtrack to restore the traced path from s to x ∈ Y_0 and print it
END PROCEDURE {WIDTH}.
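A compact Python rendering of the procedure WIDTH is sketched below. It is illustrative only: the function name `width`, the dictionaries used for the OPEN and CLOSED lists, and the toy data are assumptions, not the author's implementation. The OPEN list is kept as a map from class index to its current representative, so that overwriting an equivalent vertex is a single assignment, and parents are stored so that the path can be restored by backtracking.

```python
import math

def width(s, arcs, cls):
    """Width-first (Dijkstra-like) search over classes (sketch).

    arcs: dict vertex -> list of (successor, length) pairs
    cls:  dict vertex -> class index (class 0 is terminal)
    Returns the path from s to the terminal class, or None if the tree dies.
    """
    h = {cls[s]: 0.0}            # missing classes have h = infinity
    open_ = {cls[s]: s}          # OPEN: class -> representative vertex
    closed = set()               # classes that already have a CLOSED vertex
    parent = {s: None}
    x = s
    while cls[x] != 0:
        for y, length in arcs.get(x, []):
            k = cls[y]
            if k in closed:
                continue
            p = h[cls[x]] + length
            if p < h.get(k, math.inf):
                h[k] = p
                open_[k] = y     # overwrites any equivalent vertex z ~ y
                parent[y] = x
        del open_[cls[x]]        # move x from OPEN to CLOSED
        closed.add(cls[x])
        if not open_:
            return None          # "Problem unsolvable"
        x = min(open_.values(), key=lambda v: h[cls[v]])
    path = []
    while x is not None:         # backtrack through stored parents
        path.append(x)
        x = parent[x]
    return path[::-1]

# Hypothetical toy graph: b is a dead end, the only way out is a -> c -> t
arcs = {"a": [("b", 1.0), ("c", 2.0)], "c": [("t", 1.0)]}
cls = {"a": 1, "b": 2, "c": 3, "t": 0}
path = width("a", arcs, cls)
```

Scanning all outgoing arcs with the test p < h has the same effect as scanning only the shortest arc per class, because a longer arc into the same class can never lower h. A priority queue could replace the linear `min` scan; the dictionary version is kept here because it mirrors the pseudocode most directly.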

Theorem 2. If $\varphi^s(s^*) < \infty$, then the procedure WIDTH will construct in a finite number of steps a strong-sense quasioptimal trajectory.

Proof. Assume that the many-move game (Sections 1 and 2) starts in a class $s^*$ and the white strategy $k \mapsto y(k) \in Y_k^s$ is known. Then the task of the black player is to find the shortest path from $s^*$ to 0 in a graph with the vertices $0, 1, \ldots, K$ and arcs $(k, m)$ of length $l(y(k), \Gamma(y(k), m))$. This path can be constructed by Dijkstra's algorithm [5, 6]. If the strategy $y(k)$ is optimal, then the length of this path equals $\varphi^s(s^*)$, and in general the path length does not exceed $\varphi^s(s^*)$.

It is easy to see that each class is represented in the CLOSED list by at most one vertex $y(k)$. Define the strategy $y(k)$ on classes $k$ not represented in the final CLOSED list in an arbitrary manner ($y(k) \in Y_k^s$). If we now apply Dijkstra's algorithm, then it is almost exactly identical with the algorithm WIDTH. Thus, the algorithm WIDTH constructs a path on the graph with the vertices $0, 1, \ldots, K$ that solves the black player's problem, and the length of this path does not exceed $\varphi^s(s^*)$.

Note that the white player follows a passive strategy in this case: initially the white player selects $s$, and subsequently he selects the final states $y \in X^s$ of the elementary trajectories $(x, y)$ selected by the black player in the previous move. Therefore the path through the graph with the vertices $0, 1, \ldots, K$ corresponds to the path in the original state space $X^s$. The theorem is proved.

The computer programs corresponding to this procedure may use different implementations of the lists. In some cases, it is preferable to maintain a single list. In order to reconstruct the path, the parent of each tree vertex must be saved.


FIG. 1 FIG. 2

If all the arcs in the graph are of the same length, the procedure is substantially simplified. First, there is no need to use the function $h$. Second, the lists CLOSED and OPEN should be combined into a single list, which is processed as a queue: new vertices are added at the end of the list and the vertices are promoted to watchman in the order of the general queue. If a second path is discovered leading to some class, then it is a priori no shorter than the first path. Vertices are therefore not deleted from the list. We thus obtain a modification of the standard width-first algorithm [5-7] which constructs a path consisting of a minimum number of arcs.
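The equal-length special case described above reduces to an ordinary queue-based search over classes. A minimal Python sketch follows, under the same illustrative assumptions as before (hypothetical function name `width_unit`, dictionary-based graph, toy data); it is not the author's Pascal program.

```python
from collections import deque

def width_unit(s, arcs, cls):
    """Width-first search when all arcs have the same length (sketch).

    arcs: dict vertex -> list of successors
    cls:  dict vertex -> class index (class 0 is terminal)
    Returns a path with the minimum number of arcs, or None.
    """
    parent = {s: None}
    seen = {cls[s]}              # classes that already have a representative
    queue = deque([s])           # single list processed as a queue
    while queue:
        x = queue.popleft()      # next watchman, in order of arrival
        if cls[x] == 0:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        for y in arcs.get(x, []):
            if cls[y] not in seen:   # a later path to the class is never shorter
                seen.add(cls[y])
                parent[y] = x
                queue.append(y)
    return None

# Hypothetical toy graph with unit arc lengths
arcs = {"a": ["b", "c"], "c": ["t"]}
cls = {"a": 1, "b": 2, "c": 3, "t": 0}
path = width_unit("a", arcs, cls)
```

Because vertices are never removed once queued, the only bookkeeping beyond the queue itself is one flag per class and one parent pointer per stored vertex.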

In general, the procedure is a generalization of Dijkstra's algorithm [5, 6]. The generalized algorithm does not search for the precise vertex $y$ in the lists OPEN and CLOSED, but only for an equivalent vertex $z \sim y$.

5. Solution of the test problem

Consider the system of equations

$$\dot{\xi} = \sin\theta, \quad \dot{\eta} = \cos\theta, \quad \dot{\theta} = u, \quad |u| \le 1, \qquad (5.1)$$

that describes the plane motion of a point mass $(\xi, \eta)$ with constant linear velocity and curvature $u(t)$ not exceeding 1 (the curvature is the control parameter). Following [3], we interpret the evolution of the controlled system as the motion of an automobile (Figs. 1 and 2). In addition to equations (5.1), phase constraints are defined in the $\xi 0 \eta$ plane (in general, $(\xi, \eta) \in G \subset \mathbb{R}^2$), as well as the goal set $D \subset G$ (the parking place). The set $G$ is a labyrinth, and its complement $\mathbb{R}^2 \setminus G$ is the collection of obstacles. The optimality criterion is the time taken to reach the parking space or, equivalently, the path length.

We will use the standard time and control discretization technique, which is typical of the methods of solving optimal control problems. We assume that the control is piecewise-constant, and control switching occurs at discrete instants of time. The angle $\theta$ measured counterclockwise from the $0\eta$ axis is the direction of the velocity vector. With constant control, we obtain a segment (when $u = 0$) or a circular arc of radius $1/|u|$, which is oriented clockwise or counterclockwise depending on the sign of $u$.

For the numerical solution of the test problem we assumed that control switching occurs at even instants of time, and that u takes the value i/4, where i is an arbitrary integer from -4 to +4. The sign of the curvature indicates the relative orientation of the velocity vector and the normal acceleration. Thus, before control switching, the automobile covers a distance of length 2 along a straight segment or along one of the circles of radius 4/lil.

The corresponding vectogram [3] for $\theta = 0$ in the neighbourhood of the point (6, 9) is shown in Fig. 1. For an arbitrary oriented point $(\xi, \eta, \theta)$ a collection of nine elementary trajectories is formed from this vectogram by appropriate translation and rotation.

The state space of the original problem $X$ is the direct product of $G$ and the unit circle $S^1$ defined by the angle $\theta \bmod 2\pi$. The terminal set $Y_0$ is $D \times S^1$.
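The nine elementary trajectories leaving an oriented point can be generated by integrating system (5.1) exactly over one switching interval. The Python sketch below is an illustration under assumptions: the function names `step` and `successors` are hypothetical, the equations are read as the derivative of xi being sin(theta) and of eta being cos(theta), the switching interval is 2, and the control takes the values i/4 for i from -4 to 4, as described in the text. Obstacle and boundary checks are deliberately left out.

```python
import math

def step(xi, eta, theta, u, dt=2.0):
    """Exact solution of system (5.1) over time dt with constant curvature u."""
    if u == 0:
        # straight segment of length dt in the direction theta
        return (xi + dt * math.sin(theta),
                eta + dt * math.cos(theta),
                theta % (2 * math.pi))
    th = theta + u * dt
    # circular arc of radius 1/|u|
    return (xi - (math.cos(th) - math.cos(theta)) / u,
            eta + (math.sin(th) - math.sin(theta)) / u,
            th % (2 * math.pi))

def successors(state):
    """End points of the nine elementary trajectories, u = i/4, i = -4..4."""
    return [step(*state, i / 4) for i in range(-4, 5)]

# Initial state used in the paper's experiments: xi = 4, eta = 15, theta = pi
start = (4.0, 15.0, math.pi)
ends = successors(start)
```

With theta equal to pi the straight move (u = 0) lowers eta from 15 to 13 and leaves xi unchanged, which matches the remark that the velocity vector points downward in the initial state.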

In numerical experiments, $D$ was defined by the inequality $\eta < 0$. The region $G$ was defined by the necessary system of inequalities $0 < \xi < 8$, $\eta < 16$ and a supplementary condition: the point $(\xi, \eta)$ does not lie on any of the obstacles. The obstacles were selected individually for each experiment in the form of squares in the coordinate grid $\xi = 8i/33$, $\eta = 8j/33$, where $i$ and $j$ are integers.

The set $X \setminus Y_0$ was partitioned into classes $Y_1, \ldots, Y_K$ independently of the choice of the obstacles by dividing each of the three state space coordinates: $\xi = 4i/5$, $\eta = 4j/5$, $\theta = 2\pi k/15$ ($i$, $j$, and $k$ are integers). The number of classes was thus $10 \times 20 \times 15 = 3000$. As the initial vertex $s$ we usually took the state $\xi = 4$, $\eta = 15$, $\theta = \pi$. Note that the velocity vector points downward in the initial state.

For the obstacle configurations shown in Figs. 1 and 2, both algorithms produced surprisingly similar results. This suggests the conjecture that the paths generated by the algorithms are not merely quasioptimal but actually optimal on the set of trajectories of the class $\Pi^s$ terminating in $Y_0$. The trajectory in Fig. 2 was generated by both algorithms; for Fig. 1 the two algorithms generated different trajectories, which however had the same length and coincided on the first 14 arcs. The generated vertices were inspected starting with zero curvature and ending with maximum absolute curvature, i.e., 1.
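The partition into 3000 classes described above amounts to a simple cell-index computation. The Python sketch below shows one possible numbering; the function name `class_index`, the particular ordering of the cells, and the reading of the terminal set as the half-plane where eta is negative are assumptions made for illustration. The point is assumed to lie in the region with xi between 0 and 8 and eta below 16.

```python
import math

def class_index(xi, eta, theta):
    """Map an oriented point to a class number in 0..3000 (illustrative).

    Cell sizes: 4/5 in xi (10 cells), 4/5 in eta (20 cells),
    2*pi/15 in theta (15 cells). Class 0 is the terminal set (eta < 0).
    """
    if eta < 0:
        return 0
    i = int(math.floor(xi * 5 / 4))                       # 0..9
    j = int(math.floor(eta * 5 / 4))                      # 0..19
    k = int(math.floor((theta % (2 * math.pi)) * 15 / (2 * math.pi)))
    k = min(k, 14)                                        # guard against rounding
    return 1 + i + 10 * (j + 20 * k)

# Class of the initial state used in the experiments
start_class = class_index(4.0, 15.0, math.pi)
```

Any bijective numbering of the cells would serve equally well; what matters for the algorithms is only that equivalent states receive the same index.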

The numerical experiments show that the order of inspection of the generated vertices may substantially affect the length of the trajectory. Thus, for Fig. 2, inspection in order of increasing curvature from $-1$ to $+1$ produced a curve of length 32 by the algorithm DEPTH (see Fig. 2 in [1]) and 36 by the algorithm WIDTH. The reason for this difference is that the degree of promise of a specific class $Y_k$ depends on what particular point of the class is reached first. This does not contradict Theorems 1 and 2 because $\varphi(s^*)$ for both figures is obviously infinite ($\varphi^s(s^*)$ is more difficult to estimate), i.e., the guaranteed result is strongly upward biased in this case. Thus, if we were to design the quasioptimal control by the method of [2], we would not obtain a satisfactory curve. The point is that the sets $Y_k$ are too large compared to the details of the obstacles, which must be taken into account when carrying out the turning maneuver. Here we see the qualitative advantages of seeking a single trajectory compared to constructing a guaranteed strategy. If the classes $Y_k$ are chosen to be sufficiently small, the computer power will be insufficient to solve the problem.

For the obstacle configuration in Fig. 1, we originally thought that the program would generate a trajectory similar to that in Fig. 2. However, after making two loops, the "automobile" somehow turned and approached the obstacle at an angle which enabled it to get out of the dead end and reduce the number of arcs. When an additional obstacle was introduced (Fig. 2), this maneuver became futile.

Let us consider the complexity of the two algorithms for the test problem.

The trajectory tree constructed by the algorithm WIDTH may not fit in RAM, especially if a fine partition is used. Therefore, the Pascal program creates a single list of tree elements in queue format. Since all the arcs are of equal length, the function $h$ is not used; the vertices are placed at the end of the list and closed in the order of the general queue. Standard Pascal features, such as records, pointers, and dynamic data structures, make it unnecessary to reserve memory in advance, and the memory space is expanded during the search process. When RAM is filled, the information about new vertices is automatically stored in external devices, which the user may detect only as a certain slowdown of the computation. For each vertex, the program WIDTH stores four variables: three coordinates of the oriented point and the list address of the parent vertex. The total number of vertices for the configurations shown in Figs. 1 and 2 was 1802 and 1735, respectively. The program WIDTH thus used a total of approximately 7000 memory locations.

In the program DEPTH the memory is reserved in advance for the values of the function f. The memory required to store one trajectory is not large. The memory requirements in this case were slightly over 3000 memory locations.
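As a rough check of these memory figures (assuming one memory location per stored variable, which the text does not state explicitly), the counts for WIDTH are

$$4 \times 1802 = 7208, \qquad 4 \times 1735 = 6940,$$

that is, about 7000 locations for each configuration. For DEPTH the table of values of $f$ needs one location for each of the $10 \times 20 \times 15 = 3000$ classes, and the single stored trajectory adds only a comparatively small amount on top of that, which is consistent with "slightly over 3000".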

The improved memory efficiency of the program DEPTH was offset by a somewhat inferior running time. The execution time of the two programs is easily compared. If we ignore the additional time for manipulating pointer variables, the execution time is proportional to the number of vertices scanned (counting with their multiplicities). The number of vertices for the program WIDTH is given above, and the number of vertices for the program DEPTH was 4319 and 5334, respectively.

Repeating the invitation to test various optimal control methods on this problem of an automobile maneuvering to its parking place when there are obstacles around [1], the author will be glad to supply all the necessary information about the parameters of the problem and auxiliary subroutines.

REFERENCES

1. OREL E.N., A method of solving optimal control problems, Dokl. Akad. Nauk SSSR, 306, 6, 1301-1304, 1989.

2. OREL E.N., Approximation of Bellman's function by piecewise-constant functions, Zh. Vychisl. Mat. i Mat. Fiz., 18, 4, 916-927, 1978.

3. ISAACS R., Differential Games, Mir, Moscow, 1967.

4. OREL E.N., A method of constructing the shortest path on a graph from the Bellman function minorant, Avtomat. Telemekh., 2, 88-91, 1977.

5. REINGOLD E.M., NIEVERGELT J. and DEO N., Combinatorial Algorithms: Theory and Practice, Prentice Hall, Englewood Cliffs, N.J., 1977.

6. NILSSON N.J., Artificial Intelligence. Solution Seeking Methods, Mir, Moscow, 1973.

7. ZELKOWITZ M., SHAW A. and GANNON J., Software Development Principles, Mir, Moscow, 1982.

8. OWEN G., Game Theory, Mir, Moscow, 1971.

9. OREL E.N., The approximation of Bellman's function, Zh. Vychisl. Mat. i Mat. Fiz., 13, 5, 1161-1174, 1973.

Translated by Z.L.

U.S.S.R. Comput. Maths. Math. Phys., Vol. 30, No. 5, pp. 8-17, 1990. 0041-5553/90 $10.00+0.00. Printed in Great Britain. ©1991 Pergamon Press plc

APPROXIMATING THE MEASUREMENT OPTIMIZATION PROBLEMS FOR A PARABOLIC SYSTEM*

E.K. KOSTOUSOVA

We consider the optimization of the measurement process for solving the heat conduction equation when the equation, the initial condition, and the measurement equations contain errors and the only a priori information about the errors is the region of their possible values. The problem is approximated by the method of lines and by finite differences. These approximations lead to control problems for finite-dimensional systems.

The presence of errors must be taken into account when estimating the state of physical systems from measurements. A statistical description of the unknown errors is often unavailable, and it is therefore natural to assume that only the region of possible error values is known [1, 2]. With this approach, the state of the system cannot be determined exactly and instead the so-called information region is constructed [2, 3], i.e., the set of system states that are consistent with measurement data. The information region constructed in this way depends on the error realizations and on the measurement technique. We therefore have the problem of selecting a measurement method which ensures the best estimate, in some sense, of the information region of the system given worst-case error realizations.

The measurement optimization problem in this minimax setting was studied for finite-dimensional systems in [4]. In the present paper, we consider the optimization of measurements for the heat conduction equation when its solution is estimated from measurements of solution values averaged over the spatial coordinates. The paper is related to [3, 5], where the case of point observations was considered.

Optimization of measurements for both finite-dimensional and distributed systems with random disturbances has been studied by various authors (see, e.g., [6-8]).

1. Statement of the problem

Assume that the state of the system is described by an initial-boundary-value problem for the heat conduction equation

$$u_t = \sum_{i=1}^{r} a_i u_{x_i x_i} + c f(x, t), \quad x \in D \subset \mathbb{R}^r, \quad t \in T = (0, \theta), \qquad (1.1a)$$

$$u|_{\partial D} = 0, \quad u|_{t=0} = u_0(x) \in L_2(D), \qquad (1.1b)$$

where $r = 1$ or 2, $D = (0, 1)$ for $r = 1$ and $D = (0, 1) \times (0, 1)$ for $r = 2$, the constant $c = 0$ or 1, $a_i > 0$; the initial state $u_0$ and the disturbance $f$ (for $c \ne 0$) are not known exactly. We assume that the solution of system (1.1) is observed (possibly with an error), so that at

*Zh. vychisl. Mat. mat. Fiz., 30, 9, 1294-1306, 1990