A Multifaceted Heuristic for the Orienteering Problem

B. L. Golden, Qiwen Wang, and Li Liu

College of Business and Management, University of Maryland, College Park, Maryland 20742

The orienteering problem involves the selection of a path between an origin and a destination which maximizes total score subject to a time restriction. In previous work we presented an effective heuristic for this NP-hard problem that outperformed other heuristics from the literature. In this article we describe and test a significantly improved procedure. The new procedure is based on four concepts: center of gravity, randomness, subgravity, and learning. These concepts combine to yield a procedure which is much faster and which results in more nearly optimal solutions than previous procedures.

INTRODUCTION

The orienteering problem involves a number of “control points,” each with an associated score. Competitors are required to visit a subset of these control points after leaving the start point (node 1) in order to maximize their total score and return to the end point (node n) within a prescribed amount of time. Competitors who arrive late at node n are disqualified.

The distance or travel time t(i,j) between any pair of control points is assumed to be a known quantity. Thus, the orienteering problem may be formulated as follows. Given n nodes in the Euclidean plane, each with a score s(i) ≥ 0 [note that s(1) = s(n) = 0], find a path or route of maximum score through these nodes, beginning at node 1 and ending at node n, of length (or duration) no greater than TMAX.
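The formulation can be made concrete with a small sketch that scores a route and checks it against TMAX. The helper names and the four-node toy instance are illustrative assumptions, not taken from the article; travel time is assumed equal to Euclidean distance.

```python
import math

def route_length(route, coords):
    """Total Euclidean length of the path that visits `route` in order."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))

def route_score(route, scores):
    """Sum of the scores s(i) of the nodes on the route."""
    return sum(scores[i] for i in route)

def is_feasible(route, coords, tmax, start, end):
    """A route starts at `start`, ends at `end`, repeats no node, and fits in TMAX."""
    return (route[0] == start and route[-1] == end
            and len(set(route)) == len(route)
            and route_length(route, coords) <= tmax)

# Hypothetical toy instance: node 1 is the start, node 4 is the end.
coords = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (3.0, -4.0), 4: (6.0, 0.0)}
scores = {1: 0, 2: 10, 3: 5, 4: 0}

short_route = [1, 2, 4]      # length 5 + 5 = 10
long_route = [1, 2, 3, 4]    # length 5 + 8 + 5 = 18
print(route_score(short_route, scores), is_feasible(short_route, coords, 12, 1, 4))
print(route_score(long_route, scores), is_feasible(long_route, coords, 12, 1, 4))
```

With TMAX = 12 only the shorter route is admissible, even though the longer one collects a higher score.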

The problem has been shown to be NP-hard by Golden, Levy, and Vohra [1]. With this in mind, one is led to consider heuristics. In the article by Golden, Levy, and Vohra the authors presented an effective (center-of-gravity) heuristic that outperformed a (randomization) heuristic due to Tsiligirides [3]. In this article we describe and test a significantly improved procedure which builds upon the strengths of the two above-mentioned procedures. The new procedure is based on four key ideas: center of gravity, randomness, subgravity, and learning. These concepts combine to yield a procedure which is much faster and which yields more nearly optimal solutions than previous procedures.

HEURISTICS FROM THE LITERATURE

Two effective heuristic approaches for solving the orienteering problem have been proposed in earlier works. In this section, we briefly describe these methods.

Naval Research Logistics, Vol. 35, pp. 359-366 (1988). Copyright © 1988 by John Wiley & Sons, Inc. CCC 0028-1441/88/030359-008$04.00

The stochastic algorithm due to Tsiligirides [3] relies on Monte Carlo techniques to develop a large number of possible routes, and it selects the best from these. The driving force behind this approach is a measure A(j) of “desirability” for all nodes j not currently on the route. In particular, Tsiligirides uses $A(j) = (s(j)/t(\mathrm{last}, j))^{4.0}$, where s(j) is the score associated with node j and t(last,j) is the travel time from the last node chosen to node j. After determining the four largest values for A(j) (fewer if four nodes are not eligible), the values are normalized so that they sum to one. A random number between 0 and 1 is then generated in order to select one of these four nodes for inclusion on the route. This procedure is repeated until no additional nodes can be included on the route. Since the procedure uses random numbers, Tsiligirides’ method generates many routes for each value of TMAX, and it then chooses the one with the largest total score.
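One selection step of this Monte Carlo rule might look like the following sketch. It assumes Euclidean travel times, that every candidate has a positive score and lies at a nonzero distance from the last node, and that the caller has already filtered candidates for time feasibility; the function name `pick_next_node` is hypothetical.

```python
import math
import random

def pick_next_node(last, candidates, scores, coords, rng, power=4.0, top_k=4):
    """Choose the next node with probability proportional to its desirability,
    restricted to the `top_k` most desirable candidates."""
    desirability = {
        j: (scores[j] / math.dist(coords[last], coords[j])) ** power
        for j in candidates
    }
    best = sorted(desirability, key=desirability.get, reverse=True)[:top_k]
    total = sum(desirability[j] for j in best)
    # Normalize so the retained values sum to one, then walk the
    # cumulative distribution with a single uniform draw in [0, 1).
    u = rng.random()
    cumulative = 0.0
    for j in best:
        cumulative += desirability[j] / total
        if u < cumulative:
            return j
    return best[-1]  # guards against floating-point round-off

# Hypothetical example: five equally scored candidates, one of them far away.
coords = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (3, 0), 4: (4, 0), 5: (100, 0)}
scores = {1: 10, 2: 10, 3: 10, 4: 10, 5: 10}
rng = random.Random(0)
picks = {pick_next_node(0, [1, 2, 3, 4, 5], scores, coords, rng) for _ in range(200)}
print(sorted(picks))
```

Because of the fourth power, nearby nodes dominate the draw, and the fifth-ranked node (node 5 here) is never eligible.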

At the heart of the center-of-gravity heuristic due to Golden, Levy, and Vohra [1] is a systematic progression from one center of gravity to another. Each center of gravity has an associated node set and a resulting route. The progression continues until a stopping rule is satisfied. At that point, the best route found is recorded as the solution.

The heuristic is made up of the following three steps:

(1) Route-construction step; (2) Route-improvement step; (3) Center-of-gravity step.

In the route-construction step, a “bang-for-buck” insertion heuristic is applied in order to find a route, beginning at 1 and ending at n, which has a relatively high score and requires less than TMAX units of time.

In the route-improvement step, an interchange procedure, such as 2-OPT (see Lin [2]), is applied to the route just generated in order to find a shorter route on the same set of nodes. This is followed by a cheapest insertion step in which as many nodes as possible are inserted into the route obtained thus far without violating TMAX. We call the route that results L.
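A 2-OPT pass on a path with fixed endpoints can be sketched as below. This is a plain segment-reversal variant written for illustration, assuming Euclidean travel times; it is not claimed to match the implementation used in the experiments.

```python
import math

def path_length(route, coords):
    """Total Euclidean length of the path that visits `route` in order."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))

def two_opt(route, coords):
    """Reverse interior segments as long as a reversal shortens the path.
    The first and last nodes (start and end) are never moved."""
    best = list(route)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for k in range(i + 1, len(best) - 1):
                candidate = best[:i] + best[i:k + 1][::-1] + best[k + 1:]
                if path_length(candidate, coords) < path_length(best, coords) - 1e-12:
                    best = candidate
                    improved = True
    return best

# Hypothetical example: four collinear nodes visited out of order.
coords = {1: (0, 0), 2: (1, 0), 3: (2, 0), 4: (3, 0)}
print(two_opt([1, 3, 2, 4], coords))
```

The node set is unchanged by the pass; only the visiting order is, which is what frees up time for the cheapest insertion step that follows.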

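Suppose now that node i has coordinates (x(i), y(i)). In the third step, we calculate the center of gravity of L as $g = (\bar{x}, \bar{y})$, where

$$\bar{x} = \frac{\sum_{i \in L} s(i)\,x(i)}{\sum_{i \in L} s(i)}, \qquad \bar{y} = \frac{\sum_{i \in L} s(i)\,y(i)}{\sum_{i \in L} s(i)}.$$

Let a(i) = t(i,g) for i = 1, 2, ..., n. Next, a new route including nodes 1 and n is formed as follows: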

(a) Calculate the ratio s(i)/a(i) for all i.

(b) Add nodes to the route in decreasing order of this ratio, using cheapest insertion, until no additional nodes can be added without exceeding TMAX.

(c) Use the route-improvement step to make adjustments to the resulting route.
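The ranking used in steps (a) and (b) can be sketched as follows. The score-weighted form of the center of gravity is an assumption of this sketch, as are the function names and the toy data; the insertion and improvement parts of steps (b) and (c) are left out.

```python
import math

def center_of_gravity(route, coords, scores):
    """Score-weighted centroid of the nodes on `route` (assumed form).
    Assumes at least one node on the route has a positive score."""
    total = sum(scores[i] for i in route)
    x = sum(scores[i] * coords[i][0] for i in route) / total
    y = sum(scores[i] * coords[i][1] for i in route) / total
    return (x, y)

def rank_by_ratio(nodes, g, coords, scores):
    """Order nodes by decreasing s(i)/a(i), where a(i) is the distance to g."""
    def ratio(i):
        a = math.dist(coords[i], g)
        return scores[i] / a if a > 0 else math.inf
    return sorted(nodes, key=ratio, reverse=True)

# Hypothetical example: nodes 1 and 4 are the start and end (score 0).
coords = {1: (0.0, 0.0), 2: (2.0, 0.0), 3: (4.0, 0.0), 4: (10.0, 0.0)}
scores = {1: 0, 2: 1, 3: 3, 4: 0}
g = center_of_gravity([1, 2, 3, 4], coords, scores)
print(g, rank_by_ratio([2, 3], g, coords, scores))
```

Node 3 is ranked first because it is both higher scoring and closer to the center of gravity, so it would be offered to cheapest insertion before node 2.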

Figure 1. Subgravity illustrated.

We now have a route L1. This route’s center of gravity gives rise to a repetition of (a)-(c). The resulting route is denoted by L2. This process is repeated until a cycle develops (that is, routes Lp and Lq are identical for some q > p) or another stopping rule is triggered. Finally, we select the route with the highest score from the routes {L, L1, L2, ..., Lq}.

INSIGHTS AND OBSERVATIONS

The two heuristics described in the previous section consistently perform reasonably well. For example, they totally dominate a deterministic algorithm which is presented by Tsiligirides. Randomization seems to be the driving force in Tsiligirides’ procedure, and the center-of-gravity concept seems to be the key idea in the Golden, Levy, and Vohra (GLV) procedure. In designing a new and improved heuristic for the orienteering problem, we seek to incorporate these two proven features along with two others.

A third key feature that we find desirable in a heuristic is an ability to “see the forest from the trees.” In other words, a smart heuristic should take into account the opportunity posed by a cluster of nodes. Individual node scores may not be high, but if the entire cluster can be visited quickly then the total payoff (score) may be great. This concept is illustrated in Figure 1.

Suppose nodes 1 and 13 are the start and end nodes, respectively, and TMAX = 13. Each triple shown in Figure 1 represents (x(i), y(i), s(i)) for node i. In their previous work, Golden, Levy, and Vohra observed that node score, distance to the center of gravity, and the sum of distances to the start and end nodes were three good indicators of the attractiveness of inserting a particular node onto the route. Figure 1 builds upon this observation.

Initially, the center of gravity of all thirteen nodes is (12/27, 0). It is easy to see that nodes 2 and 6 (due to symmetry) are closest to the center of gravity; they also have the smallest sum of distances and the largest individual scores. Any insertion procedure based on these three measures is likely, therefore, to build route 1-2-3-6-13 with a total score of 9 and a total duration of 12.87. However, a different procedure which looks for attractive node clusters might select route 1-12-11-10-9-8-7-13 instead. This route has a total score of 12 and a total duration of only 11.063. A procedure that inserts nodes based upon “neighborhood” score rather than individual score employs a feature which we will refer to as subgravity.

A final feature that we would like to see in a heuristic for the orienteering problem involves learning. An “intelligent” procedure should be able to learn in the following sense. As different node combinations are sampled (for example, in moving from one center of gravity to another), some are found to work better than others. The heuristic should take advantage of this information and, over time, learn which combinations are most effective. An analogy to the selection of an eight-man boat in rowing can be made. Suppose there are many rowers in competition and two boats are raced each day. The members of the winning boat receive points while the losers are penalized. Eventually, the coach selects those rowers with the highest point totals. These are not necessarily the eight best individual scullers in the same way that the best orienteering route does not necessarily consist of the nodes with the highest scores. It is hoped that the eight men are the rowers that best fit together.

A NEW HEURISTIC WITH LEARNING CAPABILITIES

The procedure used to generate a single route is based on the notion of a weighted average. That is, insertions are made according to a convex combination of three individual measures. For each node i, the three measures are (1) a score measure SM(i), (2) a distance to center-of-gravity measure CM(i), and (3) a sum of distances to the two foci of an ellipse measure EM(i). (The foci are the start and end nodes.)

For any node i not yet on the route, its score, center-of-gravity distance, and ellipse distance are three important factors in determining the appropriateness of inserting i next. The weighted average is given by

$$W(i) = \alpha \cdot SM(i) + \beta \cdot CM(i) + \gamma \cdot EM(i),$$

where $\alpha + \beta + \gamma = 1$. From previous work with the GLV procedure, we felt that the score measure is the most important factor, the ellipse measure is least important, and the center-of-gravity measure is of intermediate significance. We therefore compared the following four parameter sets $(\alpha, \beta, \gamma)$ in our testing effort: (0.7, 0.2, 0.1), (0.8, 0.1, 0.1), (0.6, 0.2, 0.2), and (0.5, 0.3, 0.2). The first parameter set performed best, the second set performed second best, and so on. The results, however, were not dramatically different from one set to another. In short, we recommend $\alpha = 0.7$, $\beta = 0.2$, and $\gamma = 0.1$.
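As a small worked example of the weighted average, the sketch below combines three already-scaled measures using the recommended weights 0.7, 0.2, and 0.1. The function name and the measure values for nodes 2 and 3 are made up for illustration.

```python
def weighted_average(sm, cm, em, alpha=0.7, beta=0.2, gamma=0.1):
    """W(i) as a convex combination of the score, center-of-gravity,
    and ellipse measures. The weights must sum to one."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return {i: alpha * sm[i] + beta * cm[i] + gamma * em[i] for i in sm}

# Hypothetical measures, each already scaled to lie between 0 and n = 10.
sm = {2: 10.0, 3: 5.0}
cm = {2: 5.0, 3: 10.0}
em = {2: 10.0, 3: 10.0}
w = weighted_average(sm, cm, em)
print(w)
```

Node 2 ends up ahead of node 3 (roughly 9.0 against 6.5): its advantage in the score measure outweighs node 3's better center-of-gravity measure, which reflects the ordering of importance described above.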

Each of the three measures lies between 0 and n (the number of nodes), with n being the best possible value for a measure. The score measure is computed in the following way. To begin, the subgravity feature is captured by the expression

Essentially, this expression takes into account the scores of all nodes j close to node i and discounts these scores as a function of internode distance. We have used p = 10 in this article but p can be adjusted to reflect the scale of internode distances. [We tried p = 1, 5, 10, and 100 in our experiments and the first three of these values all worked well.]

Next, the learning feature is exhibited via the expression

$$SM(i) \leftarrow SM(i) \cdot LM(i),$$

where LM(i) is the learning measure. We wait until later to define LM(i) formally; however, we can define it intuitively at this time. If, in previous iterations, node i has been associated with routes having above-average scores, then LM(i) > 1 and node i is rewarded. If node i has been associated with “below-average routes,” then LM(i) < 1 and node i is penalized. Finally, the scoring measure is properly scaled between 0 and n by the expression

$$SM(i) \leftarrow \frac{SM(i) \cdot n}{\max_j \{SM(j)\}}.$$

If the distance to the center of gravity is denoted by CM(i) and the sum of distances to the two foci of an ellipse is denoted by EM(i), then these measures can be similarly scaled by

$$CM(i) \leftarrow \frac{\min_j \{CM(j)\} \cdot n}{CM(i)}$$

and

$$EM(i) \leftarrow \frac{\min_j \{EM(j)\} \cdot n}{EM(i)}.$$
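The two kinds of scaling can be sketched as a pair of helpers: one for a measure where larger raw values are better (the score measure) and one for measures where smaller raw values are better (the two distance measures). The helper names are hypothetical, and all raw values are assumed to be strictly positive.

```python
def scale_larger_is_better(values, n):
    """Map raw values so that the largest one becomes n (used for SM)."""
    top = max(values.values())
    return {i: v * n / top for i, v in values.items()}

def scale_smaller_is_better(values, n):
    """Map raw distances so that the smallest one becomes n (used for CM, EM)."""
    low = min(values.values())
    return {i: low * n / v for i, v in values.items()}

# Hypothetical raw values for two nodes in a problem with n = 4 nodes.
raw = {2: 5.0, 3: 10.0}
print(scale_larger_is_better(raw, 4))   # node 3 gets the best value, 4.0
print(scale_smaller_is_better(raw, 4))  # node 2 gets the best value, 4.0
```

After scaling, all three measures share the range (0, n] with n as the best value, so the weights in the weighted average act on comparable quantities.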

At each step, we examine the five nodes with the largest weighted averages and choose one randomly as the node to be inserted onto the route (each node is chosen with probability 0.20). We insert this node as cheaply as possible and then check for time feasibility. If the resulting route is feasible, we try to insert another node. If the resulting route is not feasible, we seek to drop one node such that feasibility is regained. This is always possible since, in the worst case, the last node added can now be dropped. In particular, we drop the node with the lowest score-to-savings ratio currently on the route such that feasibility is attained. We again try to insert another node. The process continues until no further nodes can successfully be added to the route.
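The selection and cheapest-insertion parts of this step can be sketched as below. The drop rule based on the score-to-savings ratio is omitted, travel times are assumed Euclidean, and the function names and data are illustrative.

```python
import math
import random

def choose_candidate(weights, on_route, rng, k=5):
    """Pick uniformly at random among the k off-route nodes with largest W(i)."""
    off_route = [i for i in weights if i not in on_route]
    top = sorted(off_route, key=weights.get, reverse=True)[:k]
    return rng.choice(top)

def cheapest_insertion(route, node, coords):
    """Insert `node` between the pair of consecutive route nodes where the
    added length is smallest. Returns (new_route, added_length)."""
    best_pos, best_delta = None, math.inf
    for pos in range(1, len(route)):
        a, b = route[pos - 1], route[pos]
        delta = (math.dist(coords[a], coords[node])
                 + math.dist(coords[node], coords[b])
                 - math.dist(coords[a], coords[b]))
        if delta < best_delta:
            best_pos, best_delta = pos, delta
    return route[:best_pos] + [node] + route[best_pos:], best_delta

# Hypothetical example: node 2 lies on the segment from node 1 to node 3.
coords = {1: (0, 0), 2: (2, 0), 3: (4, 0), 4: (4, 4)}
new_route, added = cheapest_insertion([1, 3, 4], 2, coords)
print(new_route, added)

weights = {2: 9, 3: 8, 5: 7, 6: 6, 7: 5, 8: 4, 9: 1}
rng = random.Random(0)
picks = {choose_candidate(weights, {3}, rng) for _ in range(100)}
print(sorted(picks))
```

A caller would compare the new route length with TMAX after each insertion and invoke the drop rule whenever the limit is exceeded.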

The route generated by this insertion procedure is passed to a 2-OPT improvement routine. The improvement routine attempts to shorten the duration of the route by resequencing nodes. If the total duration can be reduced, insertions are attempted using the procedure already outlined. The outcome is a candidate route.

Table 1. Problem 1 results.

              Scores
TMAX  Tsiligirides  GLV  New   New vs. Tsiligirides  New vs. GLV  New length
   5            10   10   10   ...                   ...                4.14
  10            15   15   15   ...                   ...                6.87
  15            45   45   45   ...                   ...               14.26
  20            65   65   65   ...                   ...               19.85
  25            90   90   90   ...                   ...               24.88
  30           110  110  110   ...                   ...               29.67
  35           135  125  135   ...                   up 10             34.08
  40           150  140  155   up 5                  up 15             38.97
  46           175  165  175   ...                   up 10             44.91
  50           190  180  190   ...                   up 10             49.53
  55           205  200  205   ...                   up 5              54.80
  60           220  205  225   up 5                  up 20             59.89
  65           240  220  240   ...                   up 20             63.82
  70           255  240  260   up 5                  up 20             69.54
  73           260  255  265   up 5                  up 10             70.73
  75           270  260  270   ...                   up 10             73.80
  80           275  275  280   up 5                  up 5              78.18
  85           280  285  285   up 5                  ...               81.78

The candidate route has its own center of gravity. In addition, nodes are rewarded or penalized based upon their most recent performances. With these points in mind, CM(i), LM(i), and SM(i) need to be updated. At that time, another candidate route is generated. From a starting center of gravity, we decided to repeat the procedure a total of 20 times, moving from one center of gravity to the next along the way.

For each problem, five starting center-of-gravity locations were used. First, we drew a rectangle in the plane enclosing all of the control points. We then divided this rectangle into four equal-sized quadrants. The center of the rectangle and the quadrant centers became the starting center-of-gravity locations. In all, 100 routes were generated for each problem and the route with the highest total score became the solution recommended by the heuristic.
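The five starting locations can be computed as in the sketch below. It assumes the enclosing rectangle is the tightest axis-aligned bounding box of the control points, which the text does not specify; the function name is hypothetical.

```python
def starting_centers(coords):
    """Center of the bounding rectangle followed by the four quadrant centers."""
    xs = [x for x, _ in coords.values()]
    ys = [y for _, y in coords.values()]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrant_xs = [(xmin + cx) / 2, (cx + xmax) / 2]
    quadrant_ys = [(ymin + cy) / 2, (cy + ymax) / 2]
    return [(cx, cy)] + [(x, y) for x in quadrant_xs for y in quadrant_ys]

# Hypothetical control points spanning the rectangle [0, 4] x [0, 8].
coords = {1: (0.0, 0.0), 2: (4.0, 8.0), 3: (2.0, 1.0)}
centers = starting_centers(coords)
print(centers)
print(len(centers) * 20)  # five starts times 20 repetitions = 100 routes
```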

The procedure has now been fully described except for a definition of the learning measure, LM(i). From each starting center of gravity, we set LM(i) = 1 for all i. After each of the first 19 routes generated, however, we update this measure according to the equation

$$LM(i) = \frac{1}{N(i)} \sum_{l \in R(i)} \{\text{route } l \text{ score} / \text{average route score}\},$$

where R(i) is the set of routes that include i, of those examined, N(i) is the cardinality of R(i), and the average route score takes all routes into account, not just those in R(i).
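The update can be sketched as follows. Keeping LM(i) = 1 for a node that has not yet appeared on any examined route is an assumption of this sketch (the formula is undefined when N(i) = 0); the function name and data are illustrative.

```python
def learning_measure(routes, route_scores, nodes):
    """LM(i): mean over routes containing i of (route score / average score).
    The average is taken over all examined routes."""
    average = sum(route_scores) / len(route_scores)
    lm = {}
    for i in nodes:
        ratios = [s / average for r, s in zip(routes, route_scores) if i in r]
        # Assumption: a node not seen on any route keeps its initial value of 1.
        lm[i] = sum(ratios) / len(ratios) if ratios else 1.0
    return lm

# Hypothetical history: two routes examined so far, with scores 30 and 10.
routes = [[1, 2, 4], [1, 3, 4]]
route_scores = [30, 10]
lm = learning_measure(routes, route_scores, nodes=[1, 2, 3, 4, 5])
print(lm)
```

Node 2 appeared only on the above-average route and is rewarded (LM = 1.5), node 3 appeared only on the below-average route and is penalized (LM = 0.5), and nodes on both routes stay at 1.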

Table 2. Problem 2 results.

              Scores
TMAX  Tsiligirides  GLV  New   New vs. Tsiligirides  New vs. GLV  New length
  15           120  120  120   ...                   ...               14.90
  20           190  200  200   up 10                 ...               19.88
  23           205  210  205   ...                   down 5            22.51
  25           230  230  230   ...                   ...               24.13
  27           230  230  230   ...                   ...               24.13
  30           250  260  265   up 15                 up 5              29.85
  32           275  260  300   up 25                 up 40             31.63
  35           315  300  320   up 5                  up 20             34.51
  38           355  355  360   up 5                  up 5              37.84
  40           395  380  395   ...                   up 15             39.78
  45           430  450  450   up 20                 ...               44.44

COMPUTATIONAL RESULTS

In evaluating the performance of the proposed heuristic, results are compared with the stochastic algorithm of Tsiligirides and the center-of-gravity heuristic of Golden, Levy, and Vohra for the three test problems provided by Tsiligirides [3].

In Tables 1, 2, and 3, we present the results for Problems 1 (32 nodes), 2 (21 nodes), and 3 (33 nodes). The new heuristic clearly dominates its competition.

Table 3. Problem 3 results.

              Scores
TMAX  Tsiligirides  GLV  New   New vs. Tsiligirides  New vs. GLV  New length
  15           100  170  170   up 70                 ...               14.47
  20           140  200  200   up 60                 ...               19.79
  25           190  250  260   up 70                 up 10             24.46
  30           240  320  320   up 80                 ...               28.77
  35           290  380  390   up 100                up 10             34.79
  40           330  420  430   up 100                up 10             38.88
  45           370  450  470   up 100                up 20             44.26
  50           410  500  520   up 110                up 20             48.94
  55           450  520  550   up 100                up 30             53.55
  60           500  580  580   up 80                 ...               59.44
  65           530  600  610   up 80                 up 10             63.24
  70           560  640  640   up 80                 ...               69.14
  75           590  650  670   up 80                 up 20             74.23
  80           640  690  710   up 70                 up 20             79.71
  85           670  720  740   up 70                 up 20             84.86
  90           690  770  770   up 80                 ...               89.82
  95           720  790  790   up 70                 ...               92.58
 100           760  800  800   up 40                 ...               96.59
 105           770  800  800   up 30                 ...               97.08
 110           790  800  800   up 10                 ...               96.59

Of the 49 problem instances examined in total, the new heuristic betters the Tsiligirides procedure in 32 cases, often dramatically. It ties in the remaining cases. The new heuristic betters the GLV procedure in 26 cases, loses in only one case, and ties in the remaining cases.
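These tallies can be checked directly against the score columns of Tables 1, 2, and 3. The short script below transcribes those columns (Problems 1, 2, and 3 in order) and counts wins, ties, and losses; the variable and function names are illustrative.

```python
# Score columns transcribed from Tables 1, 2, and 3, in that order.
TSIL = ([10, 15, 45, 65, 90, 110, 135, 150, 175, 190, 205, 220, 240, 255, 260, 270, 275, 280]
        + [120, 190, 205, 230, 230, 250, 275, 315, 355, 395, 430]
        + [100, 140, 190, 240, 290, 330, 370, 410, 450, 500,
           530, 560, 590, 640, 670, 690, 720, 760, 770, 790])
GLV = ([10, 15, 45, 65, 90, 110, 125, 140, 165, 180, 200, 205, 220, 240, 255, 260, 275, 285]
       + [120, 200, 210, 230, 230, 260, 260, 300, 355, 380, 450]
       + [170, 200, 250, 320, 380, 420, 450, 500, 520, 580,
          600, 640, 650, 690, 720, 770, 790, 800, 800, 800])
NEW = ([10, 15, 45, 65, 90, 110, 135, 155, 175, 190, 205, 225, 240, 260, 265, 270, 280, 285]
       + [120, 200, 205, 230, 230, 265, 300, 320, 360, 395, 450]
       + [170, 200, 260, 320, 390, 430, 470, 520, 550, 580,
          610, 640, 670, 710, 740, 770, 790, 800, 800, 800])

def tally(new, other):
    """Return (wins, ties, losses) of `new` against `other`, instance by instance."""
    wins = sum(n > o for n, o in zip(new, other))
    ties = sum(n == o for n, o in zip(new, other))
    losses = sum(n < o for n, o in zip(new, other))
    return wins, ties, losses

print(len(NEW), tally(NEW, TSIL), tally(NEW, GLV))
```

The counts agree with the text: 32 wins and 17 ties against Tsiligirides, and 26 wins, 22 ties, and one loss (Problem 2, TMAX = 23) against GLV.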

The superiority of the new heuristic is even more striking than it might seem at first glance. For the first two problems, small TMAX values are extremely limiting. There are only a few candidate routes (since only a small number of nodes can fit onto a route) and these are easy to obtain. Both the GLV procedure and the new heuristic perform nearly optimally. For the third problem, large TMAX values are such that the entire set of nodes can be visited on a route. Here, the GLV procedure and the new heuristic again perform as well as possible. Over the remaining problem instances, the new heuristic consistently outperforms the GLV procedure.

The new heuristic was written in FORTRAN and run on a UNIVAC 1100/92. The running times are quite impressive. The total CPU time for all 18 Problem 1 instances was 17.95 seconds. For the 11 Problem 2 instances, total time was 4.98 seconds, and for the 20 Problem 3 instances, total time was 25.98 seconds. The running times are significantly faster than the times required by the GLV heuristic (the ratio of running times is approximately 1 : 4), and the quality of solutions is also clearly superior.

On average, out of the 100 routes generated for each problem, there are 36 distinct routes and 31 of the 100 routes have the best attained score. If we were to cut back from 20 to 10 repetitions from each starting center of gravity in order to halve running time, the algorithm’s effectiveness would deteriorate only slightly (perhaps in one or two of the 49 problem instances).

CONCLUSIONS

The proposed orienteering heuristic has been found to perform remarkably well, at least on the set of problems tested. This appears to be due to the combination of four features: center of gravity, randomness, subgravity, and learning. In computational experiments, we found that the removal of any one of these four features leads to results that are inferior to the ones presented here. Finally, we point out that our subgravity and learning concepts might profitably be applied to other combinatorial optimization problems.

REFERENCES

[1] Golden, B., Levy, L., and Vohra, R., “The Orienteering Problem,” Naval Research Logistics, 34(3), 307-318 (1987).

[2] Lin, S., “Computer Solutions of the Traveling Salesman Problem,” Bell System Technical Journal, 44, 2245-2269 (1965).

[3] Tsiligirides, T., “Heuristic Methods Applied to Orienteering,” Journal of the Operational Research Society, 35(9), 797-809 (1984).

Manuscript received March 3, 1987
Revised manuscript received July 22, 1987
Accepted August 28, 1987