
An efficient algorithm for the single machine total tardiness problem

BARBAROS Ç. TANSEL, BAHAR Y. KARA and IHSAN SABUNCUOGLU

Department of Industrial Engineering, Bilkent University, Bilkent 06533, Ankara, Turkey
E-mail: [email protected] or [email protected] or [email protected]

Received November 1999 and accepted November 2000

IIE Transactions (2001) 33, 661–674. ISSN 0740-817X © 2001 "IIE"

This paper presents an exact algorithm for the single machine total tardiness problem (1 || ∑ T_i). We present a new synthesis of various results from the literature which leads to a compact and concise representation of job precedences, a simple optimality check, new decomposition theory, a new lower bound, and a check for presolved subproblems. These are integrated through the use of an equivalence concept that permits a continuous reformation of the data and allows early detection of optimality at the nodes of an enumeration tree. The overall effect is a significant reduction in the size of the search tree, the CPU times, and the storage requirements. The algorithm is capable of handling much larger problems (e.g., 500 jobs) than its predecessors in the literature (about 150). In addition, a simple modification of the algorithm gives a new heuristic which significantly outperforms the best known heuristics in the literature.

1. Introduction

In this paper, we present an exact algorithm that significantly advances the computational state-of-the-art on the single machine total tardiness scheduling problem, 1 || ∑ T_i. In this problem, n jobs with known processing times and due dates must be sequenced on a single continuously available machine. If a job is completed after its due date, it is considered tardy and is charged a penalty equal to its completion time minus its due date. The objective is to find a sequence that minimizes the total tardiness. The weighted version of the problem is strongly NP-hard (Lenstra et al., 1977). The version with unit weights, which we study, is NP-hard in the ordinary sense (Du and Leung, 1990). A pseudo-polynomial time dynamic programming algorithm is available (Lawler, 1977), whose time bound is O(n^4 ∑_j p_j) or O(n^5 max_j p_j), where p_j is the processing time of job j and n is the number of jobs. The method relies on a decomposition theorem proved by Lawler (1977), and refined and strengthened by Potts and van Wassenhove (1982, 1987).

1 || ∑ T_i was initially introduced by McNaughton (1959), and the first thorough study was done by Emmons (1969), who proved dominance theorems that must be satisfied by at least one optimal schedule. These theorems were later generalized to non-decreasing cost functions (Rinnooy Kan et al., 1975). Exact methods rely either on branch and bound (Schwimer, 1972; Rinnooy Kan et al., 1975; Fisher, 1976; Picard and Queyranne, 1978; Sen et al., 1983; Potts and van Wassenhove, 1985; Sen and Borah, 1991; Szwarc and Mukhopadhyay, 1996) or on dynamic programming (Srinivasan, 1971; Lawler, 1977; Schrage and Baker, 1978; Potts and van Wassenhove, 1982, 1987).

Even though this problem has been around for more than 30 years, relatively little progress seems to have been made in developing effective computational procedures for exact solutions. Early algorithms (e.g., Fisher, 1976) were able to solve problems with about 50 jobs. The dynamic programming-based algorithm of Potts and van Wassenhove (1982) is able to solve problems with 100 jobs and the branch and bound algorithm of Szwarc and Mukhopadhyay (1996) problems with 150 jobs. An important bottleneck in handling larger-sized problems is the core storage requirement (Potts and van Wassenhove, 1982). The dynamic programming algorithms developed for this problem require O(2^n) storage in the worst case, even though efficient bookkeeping leads to reduced storage in most implementations. Branch and bound algorithms that use job precedences (Emmons, 1969) at every subproblem require O(n^2) storage per subproblem. For a depth-first search that implements cycle prevention rules (needed for consistent application of Emmons' theorems), this amounts to a total core storage of O(n^3) if one keeps an O(n^2) precedence record at every level of the search tree. In contrast, the algorithm that we propose requires O(n) storage for job precedences, which results in a total core storage requirement of O(n^2).


Hence, computational difficulties related to limited storage begin to arise at much larger n in our branch and bound algorithm than in the existing algorithms. In addition, an efficient and simplified way of handling the data allows us to automatically enforce cycle prevention, which is a computationally expensive but unavoidable task for the existing methods.

With these features, the proposed algorithm is capable of handling much larger problems (e.g., 500 jobs) than its predecessors in the literature. The CPU times based on 2800 test problems indicate that problems with 500 jobs can be solved to optimality with an 87.5% success rate, while those with 400 and 300 jobs can be solved optimally with 95 and 100% success rates, respectively. The test sets are generated using the method of Fisher (1976), which has been used in the literature to evaluate the previous state-of-the-art algorithms (e.g., Potts and van Wassenhove, 1982; Szwarc and Mukhopadhyay, 1996). The algorithm and the test sets are available at the web site http://www.bilkent.edu.tr/~bkara.

The proposed algorithm is a depth-first branch and bound algorithm with three major phases: b-sequence construction and b-test, decompositions, and fathoming tests. The b-sequence constructed in the first phase of the algorithm is a critically important sequence which often solves the problem optimally. The b-test is a simple check for optimality of the b-sequence. If the test is passed, the optimal sequence is at hand. Otherwise, the algorithm continues with the decomposition phase, which utilizes three different decomposition rules: exact decomposition, key position decomposition, and branch decomposition. Fathoming tests are done by means of a lower bound or by checking whether the current job population matches one of the earlier job populations that have already been solved optimally (found-solved). This basic structure is repeated at each node of the search tree.

The proposed algorithm is justified on the basis of four theorems. Theorem 1 (b-theorem) gives sufficient conditions for optimality. This theorem is proved on the basis of Emmons (1969) and Lawler (1977). The exact decomposition (Theorem 2) is obtained from Emmons' first and second theorems extended via Lawler (1977). The key position decomposition (Theorem 3) and the branch decomposition (Theorem 4) are direct extensions of the decomposition theorems of Chang et al. (1995) and also of those of Potts and van Wassenhove (1982), via Lawler (1977). The relationship of the proposed theory to the existing theory is summarized in Fig. 1.

The paper is organized as follows. In Section 2, we provide the basic results from the literature that we use in our algorithmic design. In Section 3, we prove the b-optimality theorem. In Section 4, we give an efficient method to construct the b-sequence. In Section 5, we give three decomposition theorems. In Section 6, we give the proposed algorithm and briefly discuss its time and storage requirements. In Section 7, we present and discuss the computational results. In Section 8, we present two new heuristics and discuss their computational performance. The paper ends with concluding remarks in Section 9.

2. Basic results

In this section, we present the existing theory that we use in our algorithmic design, namely the first and second theorems of Emmons (1969) and the theorem due to Lawler (1977).

Let J = {1, ..., n} and let p_j, d_j denote, respectively, the processing time and due date of job j. For any sequence S of job indices, let C_j(S) be the completion time of job j in sequence S. The total tardiness associated with S is T(S) = ∑_{j ∈ J} max(0, C_j(S) − d_j). The problem is to find a sequence S* such that T(S*) ≤ T(S) for all S. For any subset K of J, let p(K) = ∑_{j ∈ K} p_j.
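As a concrete reference point for this notation, the objective T(S) can be evaluated directly from its definition. The following Python sketch is our illustration (the function name and data layout are ours, not the paper's):

```python
def total_tardiness(seq, p, d):
    """Total tardiness T(S) = sum_j max(0, C_j(S) - d_j) of a job sequence.

    seq  -- list of job indices in processing order
    p, d -- mappings from job index to processing time / due date
    """
    t = 0          # running completion time on the single machine
    total = 0
    for j in seq:
        t += p[j]                      # C_j(S); the machine is continuously available
        total += max(0, t - d[j])      # tardiness of job j
    return total
```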

Theorem 1 of Emmons (1969). For a pair of jobs j and k with j < k, if a set B_k of jobs is known to precede job k in some optimal sequence and if d_j ≤ max{p(B_k) + p_k, d_k}, then there exists an optimal sequence in which all jobs in B_k ∪ {j} precede job k.

Fig. 1. The relationship of the proposed theory to the existing theory.


Theorem 2 of Emmons (1969). For a pair of jobs j and k with j < k, if sets B_k and A_k of jobs are known to precede and succeed job k, respectively, in some optimal sequence, d_j > max{p(B_k) + p_k, d_k}, and d_j + p_j ≥ p(J − A_k), then there exists an optimal sequence in which all jobs in A_k ∪ {j} succeed job k.

We refer to B_j and A_j as the predecessor and successor sets of job j, respectively, if the jobs in B_j are known to precede and the jobs in A_j are known to succeed job j in at least one optimal sequence. In addition, we let E_j = p(B_j) + p_j and L_j = p(J − A_j) and refer to these as the earliest and latest completion times of job j, respectively.

The following corollary to Theorem 2 (Emmons, 1969) gives sufficient conditions for optimality of the Earliest Due Date (EDD) sequence.

Corollary 2.2 of Emmons (1969). The EDD sequence is optimal if it results in waiting times W_j ≤ d_j (i.e., C_j(EDD) − p_j ≤ d_j) for all jobs j.

The next theorem states that the due dates can be modified within appropriate ranges without losing optimality.

Theorem 1 of Lawler (1977). Let S* be any sequence which is optimal with respect to the given due dates d_1, ..., d_n and let C_j(S*) be the completion time of job j for this sequence. Let d'_j be chosen such that min(d_j, C_j(S*)) ≤ d'_j ≤ max(d_j, C_j(S*)). Then any sequence S' which is optimal with respect to the due dates d'_1, ..., d'_n is also optimal with respect to the due dates d_1, ..., d_n (but not conversely).

3. b-Optimality

Assume the jobs in J are indexed in non-decreasing order of processing times, with ties broken by non-decreasing order of due dates. Given an instance of the problem with due dates d_j, suppose we have applied (without creating cycles) Theorems 1 and 2 of Emmons (1969) and obtained predecessor sets B_j. Let C_j = p(B_j) + p_j and b_j = max(d_j, p(B_j) + p_j) for all j. Clearly C_j is a lower bound on the completion time of job j in any optimal sequence that satisfies the precedence relations with respect to the sets B_j. Define the b-sequence to be the sequence in which the jobs are ordered in non-decreasing order of their b_j values, with tied jobs ordered in increasing order of job indices. Let J* = {j ∈ J : there exists i > j such that b_j > b_i}.

Theorem 1 (b-theorem). The b-sequence is optimal if b_j ≥ p(F_j) for all j ∈ J*, where F_j is the set of jobs that precede job j in the b-sequence.

Proof. We first prove that the b-sequence is optimal for the problem with due dates d'_j = b_j, then prove that it is also optimal for the original problem. Let j ∈ J. If j ∈ J*, then the assumption of the theorem gives b_j ≥ p(F_j) = W_j (where W_j is the waiting time of job j in the b-sequence). If j ∉ J*, then F_j ⊆ {i : p_i ≤ p_j and b_i ≤ b_j} (due to the indexing convention). It follows from Theorem 1 of Emmons (1969) that every job i in F_j is also in B_j. Hence, b_j ≥ C_j ≥ p(F_j) + p_j > p(F_j) = W_j. We have shown that b_j ≥ W_j for all j ∈ J. Corollary 2.2 (Emmons, 1969) then implies that the b-sequence is optimal for the problem with due dates b_j.

Observe that the interval [d_j, max(d_j, C_j)] is a subinterval of the interval [min(d_j, C_j(S*)), max(d_j, C_j(S*))], where S* is any optimal sequence that satisfies the precedence relations relative to the sets B_j. Since d'_j = b_j is in this interval, the b-sequence is optimal for the problem with due dates d_j as a result of Theorem 1 of Lawler (1977). ■

The proof of Theorem 1 provides an important idea which we term the idea of equivalence. By observing that the new due dates d'_j = b_j lie in the intervals [d_j, max(d_j, C_j)], this idea permits every theoretical statement that is valid for the original problem to be restated in terms of the data of the new problem, while ensuring that the conclusions obtained from such statements remain valid not only for the new problem but also for the original problem. The idea of equivalence is also used in the proofs of Theorems 2, 3, and 4 in the sequel, thereby greatly increasing the utility of the existing theory by appropriately modifying the due dates during the branch and bound algorithm. Even though the root of the equivalence idea lies in Theorem 1 of Lawler (1977), it should be noted that Lawler's theorem alone does not provide a way of creating the modified problem because it requires knowing an optimal sequence. On the other hand, replacing C_j(S*) with C_j circumvents this difficulty in a simple but strong way by eliminating the need to know an optimal sequence.

4. Sequence construction

Theorem 1 gives sufficient conditions for optimality of the b-sequence, which is nothing but the EDD sequence with respect to modified due dates obtained through the use of the dominance theorems of Emmons (1969). In this section, we present a method of constructing this sequence in an extremely efficient way. If the b-sequence satisfies the sufficient conditions of Theorem 1, an optimum is at hand. Otherwise, the problem is decomposed to obtain new subproblems, each of which goes through the optimality test again with respect to its own b-sequence.

Let a_j = max(p_j, d_j) for all j. Sequence the jobs in non-decreasing order of their a_j values, with tied jobs sequenced in increasing order of their indices. The tie breaker defines the resulting sequence uniquely. We call this sequence the a-sequence.


Note that the well-known heuristic rule MDD (Modified Due Date) (Baker and Bertrand, 1982), which replaces a job's due date by the current time t plus that job's processing time whenever d_j < t + p_j, gives the a_j values if t = 0. However, we use the a-sequence as a seed to begin the construction of the b-sequence, rather than as a heuristic rule. Observe that any optimal solution to the problem with due dates a_j is also an optimal solution to the problem with the original due dates d_j. This follows from the fact that p_j > d_j implies job j will be tardy regardless of where it is placed in the sequence. Hence, shifting its due date to p_j simply changes the tardiness of job j by a constant amount (see also Tansel and Sabuncuoglu (1997) for additional discussion).

For fixed j, partition J − {j} into the sets LD_j = {i ∈ J : i < j and a_i ≤ a_j}, LU_j = {i ∈ J : i < j and a_i > a_j}, RD_j = {i ∈ J : i > j and a_i < a_j}, and RU_j = {i ∈ J : i > j and a_i ≥ a_j}. We call LD_j (RD_j) the left-down (right-down) set and LU_j (RU_j) the left-up (right-up) set of job j. The terminology is motivated by the fact that if we plot the points (p_1, a_1), ..., (p_n, a_n) in the plane and pass a horizontal and a vertical line through the fixed point (p_j, a_j), the plane is divided into four quadrants at (p_j, a_j), each of which contains one of the above-mentioned sets. Note that jobs whose points fall on the horizontal or the vertical line through the point (p_j, a_j) belong either to the left-down set or the right-up set of job j, depending on their indices. Note also that if (p_i, a_i) = (p_j, a_j), then i is in LD_j if i < j and i is in RU_j if i > j. Let D_j = LD_j ∪ RD_j and U_j = LU_j ∪ RU_j. Call D_j the down-set and U_j the up-set of job j.

The final b-sequence is constructed from the a-sequence by means of successive transformations of due dates. Observe that if we take the initial b-sequence to be the a-sequence, then the set F_j in Theorem 1 is the same as D_j and J* = {j ∈ J : RD_j ≠ ∅}. We refer to the checking of the sufficient conditions b_j ≥ p(D_j) for all j ∈ J* as the b-test. We initially apply this test to the a-sequence. If the test fails, the data transformation is initiated and continued until either the b-test is passed or the transformation is no longer possible. This transformation of the problem into a new form which is more easily solvable than the initial form is one of the features that distinguishes our algorithm from the existing algorithms.
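The seeding and the b-test just described are easy to state in code. The sketch below is our illustration, not the authors' implementation; it assumes that p, d and b are dictionaries keyed by the job indices 1, ..., n, with jobs numbered in non-decreasing processing time (ties by due date) as in the text.

```python
def initial_b(p, d):
    """Seed values for the b-sequence: b_j = a_j = max(p_j, d_j)."""
    return {j: max(p[j], d[j]) for j in p}

def beta_sequence_order(b):
    """Job indices in non-decreasing order of b_j, ties broken by job index."""
    return sorted(b, key=lambda j: (b[j], j))

def beta_test(b, p):
    """b-test: for every job j in J* (some higher-indexed job has a strictly
    smaller b value), require b_j >= p(D_j), the total processing time of the
    jobs that precede j in the b-sequence."""
    prefix = 0                                   # running value of p(D_j)
    for j in beta_sequence_order(b):
        in_J_star = any(i > j and b[i] < b[j] for i in b)
        if in_J_star and b[j] < prefix:
            return False
        prefix += p[j]
    return True
```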

The following property is a direct consequence of Theorem 1 of Emmons (1969):

Property 1. There is an optimal sequence such that LD_j is a predecessor set of job j and RU_j is a successor set of job j.

Property 1 gives an easy way to build the predecessor sets of all jobs. The O(n^2) procedure below accomplishes this by successively expanding left-down sets. This procedure defines the b-sequence recursively. We begin the procedure with b_j = a_j for all j (or with the most recent b_j values obtained from the procedure RIGHT-EXPAND, which is explained later).

LEFT-EXPAND

Step 1. Select any job k which has not been selected yet (if none is left, stop).
Step 2. Compute p(LD_k).
Step 3. If p(LD_k) + p_k > b_k, then increase b_k to p(LD_k) + p_k, redefine LD_k with respect to the new point (p_k, b_k^new) and return to Step 2. Otherwise, add job k to the list of selected jobs and return to Step 1.
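A direct transcription of these steps into Python might look as follows. This is a sketch of our reading of LEFT-EXPAND (the function name and data layout are ours), with LD_k recomputed from the current b values after every expansion:

```python
def left_expand(b, p):
    """LEFT-EXPAND (illustrative sketch): for each job k, while p(LD_k) + p_k
    exceeds b_k, raise b_k to that value and redefine LD_k = {i < k : b_i <= b_k}
    with respect to the new point (p_k, b_k).  b is modified in place."""
    for k in sorted(p):                   # Step 1: jobs may be picked in any order
        while True:
            p_LD = sum(p[i] for i in p if i < k and b[i] <= b[k])   # Step 2
            if p_LD + p[k] > b[k]:
                b[k] = p_LD + p[k]        # Step 3: expand, then re-enter Step 2
            else:
                break                     # job k is done; pick the next job
    return b
```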

Example 1. The procedure is illustrated in Fig. 2 for k = 10. Initially, b_10 = a_10 = 186 and LD_10 = {1, 5, 7}. With p(LD_10) + p_10 = (10 + 57 + 66) + 82 = 215, the procedure updates b_10 to 215, and LD_10 now includes job 8 in addition to jobs 1, 5, 7. The vertical arrow that originates at job 10's point in the figure indicates this expansion. The procedure continues with the second, third, and fourth expansions as shown by the vertical arrows. The final b_10 is 544 and LD_10 is {1, 2, 3, 4, 5, 6, 7, 8, 9}.

Let b = (b_1, ..., b_n) be the vector of b_j values of the jobs at any time during the operation of algorithm LEFT-EXPAND. Redefine LD_j, LU_j, RU_j, RD_j, D_j, U_j, and J* with respect to the points (p_1, b_1), ..., (p_n, b_n).

The b-test can be conducted at any time during the algorithm LEFT-EXPAND. A positive answer is conclusive (an optimum is at hand), while a negative answer signals the need to further increase the b_j values (especially those that failed the test). In case of failure of the optimality test, we can use the information available in the right-down sets to further increase the b_j values, based on Theorem 2 of Emmons.

Fig. 2. Illustration of procedure LEFT-EXPAND for job 10.


For a fixed job j, if we choose a job k in the most recent right-down set of job j, all conditions of Emmons' second theorem except possibly the last one are automatically satisfied. That is, if we take B_k = LD_k and A_k = RU_k, and regard b_j and b_k as the new due dates, then the only remaining condition that needs to be checked is the inequality b_j + p_j ≥ p(J − RU_k) (the condition b_j > max(p(B_k) + p_k, d_k) in Theorem 2 of Emmons (1969) is automatically satisfied by all jobs k in RD_j, since b_j > b_k for such jobs by the definition of RD_j). Whenever this inequality is satisfied, job k qualifies as a predecessor of job j. Such jobs in the right-down set of a given job can be used to further increase the b_j value of that job. This is done by the following O(n^2) procedure, RIGHT-EXPAND.

RIGHT-EXPAND

Step 0. Let B_j = LD_j and A_j = RU_j for all j, where LD_j, RU_j are the left-down and right-up sets corresponding either to b_j = a_j for all j or to the most recent values of b_1, b_2, ..., b_n obtained from LEFT-EXPAND. Let E_j = p(B_j) + p_j and L_j = p(J − A_j) for all j. Set j = n + 1.
Step 1. Set j = j − 1. If j = 0, stop; else go to Step 2.
Step 2. Identify all jobs k ∈ RD_j (defined by the most recent b_j values) which are not in B_j and which satisfy the inequality b_j + p_j ≥ L_k. Let R be the set of jobs so identified. If R is empty, go to Step 1; else go to Step 3.
Step 3. Replace B_j by B_j ∪ R and increase E_j by p(R). For each k in R, replace A_k by A_k ∪ {j} and decrease L_k by p_j. If E_j^new > b_j, increase b_j to E_j^new and return to Step 2. Otherwise return to Step 1.
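The following sketch is our reading of Steps 0 through 3. As with the earlier sketches, the function name and data layout are ours and the code is illustrative rather than the authors' implementation:

```python
def right_expand(b, p):
    """RIGHT-EXPAND (illustrative sketch).  Starts from the current b values
    (b_j = a_j, or the output of LEFT-EXPAND) and raises b_j using qualifying
    jobs from RD_j; b is modified in place."""
    jobs = sorted(p)
    total = sum(p.values())
    B = {j: {i for i in jobs if i < j and b[i] <= b[j]} for j in jobs}   # LD_j
    A = {j: {i for i in jobs if i > j and b[i] >= b[j]} for j in jobs}   # RU_j
    E = {j: sum(p[i] for i in B[j]) + p[j] for j in jobs}
    L = {j: total - sum(p[i] for i in A[j]) for j in jobs}
    for j in reversed(jobs):                                  # Step 1
        while True:
            RD = [i for i in jobs if i > j and b[i] < b[j]]   # current RD_j
            R = [k for k in RD if k not in B[j] and b[j] + p[j] >= L[k]]  # Step 2
            if not R:
                break
            B[j].update(R)                                    # Step 3
            E[j] += sum(p[k] for k in R)
            for k in R:
                A[k].add(j)
                L[k] -= p[j]
            if E[j] > b[j]:
                b[j] = E[j]                                   # back to Step 2
            else:
                break                                         # back to Step 1
    return b
```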

Consider the following example:

Example 2. The E_j and L_j values in Table 1 are computed with respect to the LD_j and RD_j sets defined by the a_j values. For example, E_4 = p(LD_4) + p_4, where LD_4 = {2, 3}. For j = 8, 7, Step 2 outputs empty RD_j. Continuing with j = 6, Step 2 gives RD_6 = {7, 8} and R = {7}. Hence, Step 3 gives B_6 = LD_6 ∪ R = {1, 2, 3, 4, 5, 7}, increases E_6 to 644, updates A_7 to RU_7 ∪ {6} = {8, 6}, decreases L_7 to 540, and increases b_6 to 644. Returning to Step 2 with j = 6 again, the execution of Steps 2 and 3 yields no change for this job, so the algorithm returns to Step 1 and continues with j = 5. Processing job 5 through two cycles of Steps 2 and 3, we obtain B_5 = {1, 2, 3, 4, 7}, E_5 = 540, A_7 = {8, 6, 5}, L_7 = 437, and b_5 = 540. Continuing in like manner with empty RD_j sets for j = 4, 3, 2 and with two passes of Steps 2 and 3 for j = 1, we obtain the final b-sequence (2, 3, 4, 7, 1, 5, 6, 8) with corresponding b_j values of (80, 171, 265, 380, 437, 540, 644, 656).

We experimented with three strategies for constructing

the b-sequence, which we call LEFT, LEFT–RIGHT, and FULL. LEFT uses only LEFT-EXPAND, while LEFT–RIGHT first uses LEFT-EXPAND, then switches to RIGHT-EXPAND, then executes LEFT-EXPAND one more time and stops. FULL uses LEFT-EXPAND and RIGHT-EXPAND repeatedly, one after the other, until no further updating occurs in the b_j values. The number of nodes of the enumeration tree generated by LEFT exceeds in general the number generated by LEFT–RIGHT, which usually marginally exceeds the number generated by FULL. Based on our computational tests, we note, however, that in terms of average CPU times LEFT generally outperforms both LEFT–RIGHT and FULL, except for the most difficult classes of instances. A more detailed discussion of the efficiency issues is given in Section 6.
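In terms of the illustrative helpers sketched earlier (initial_b, left_expand, right_expand, which are ours and not the authors' code), the three strategies can be written as simple compositions; the block below assumes those sketches are available:

```python
def construct_b(p, d, strategy="LEFT"):
    """Build the b values with one of the three construction strategies."""
    b = initial_b(p, d)                 # start from the a-sequence data
    if strategy == "LEFT":
        left_expand(b, p)
    elif strategy == "LEFT-RIGHT":
        left_expand(b, p)
        right_expand(b, p)
        left_expand(b, p)
    else:                               # FULL: alternate until b stops changing
        while True:
            old = dict(b)
            left_expand(b, p)
            right_expand(b, p)
            if b == old:
                break
    return b, sorted(b, key=lambda j: (b[j], j))   # values and the b-sequence
```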

5. Decomposition

In this section we give three decomposition theorems: exact, key position, and branch decomposition. All of these decompositions try to identify a position in the sequence at which to split the problem into subproblems. However, there is a major difference between branch decomposition and the other two. Whereas branch decomposition can only identify a set of candidate positions for splitting the problem, the other two find a specific position for decomposition. Branch decomposition leads to different branches from a given node in a branch and bound procedure and to different states in a dynamic programming procedure. Either way, branch decomposition typically leads to an exponential number of subproblems. In contrast, exact and key position decompositions do not lead to any branching. Hence, the major strength of exact and key position decompositions is their ability to continue with a single branch (or a single state in dynamic programming), but their major weakness is that there is no guarantee that such a decomposition exists. Consequently, the latter two types of decompositions do not by themselves lead to a branch and bound or dynamic programming algorithm. However, if they are combined with branch decomposition, the outcome is a faster branch and bound or dynamic programming procedure.

Table 1. The data for Example 2

j    p_j   d_j   a_j   E_j   L_j
1    57    427   427   57    437
2    80    67    80    80    137
3    91    42    91    171   228
4    94    31    94    265   322
5    103   510   510   425   540
6    104   622   622   529   760
7    115   322   322   380   644
8    116   548   548   656   760


In the proposed algorithm, we use exact decomposition whenever the b-test fails. If no job qualifies for exact decomposition, we continue with key position decomposition, which is an enhanced version of Theorem 5 of Chang et al. (1995). If this decomposition also fails, we continue with branch decomposition, which is an enhanced version of the decomposition proposed by Potts and van Wassenhove (1987). The latter two decomposition theorems are enhanced in that the b_j values are used in these theorems instead of the initial due dates, thereby permitting repeated use of these theorems with continually modified data. Note that the original decomposition of Lawler (1977) is also in the category of branch decomposition. In fact, Potts and van Wassenhove's decomposition strengthens Lawler's by eliminating certain alternative decomposition positions.

The exact decomposition theorem looks for a job q that qualifies for decomposition and assigns it to a specific position in the sequence, which splits the problem into two subproblems corresponding to the down-set and up-set of job q. The theorem uses the information that is available at the end of the construction of the b-sequence. This information includes the final sets LD_j, LU_j, RD_j, RU_j, as well as the final values of the latest times L_1, ..., L_n (if LEFT is used, the L_j's are taken to be p(J) − p(RU_j), where the RU_j's are the final right-up sets at the end of LEFT-EXPAND). Note that the time complexity of searching for a job q that qualifies for exact decomposition is O(n^2).

Theorem 2 (Exact decomposition). Let q ∈ J. If

(i) either RD_q = ∅ or p_q + b_q ≥ L_i for all i ∈ RD_q, and
(ii) either LU_q = ∅ or p_j + b_j ≥ L_q for all j ∈ LU_q,

then there is an optimal sequence such that all jobs in D_q precede q and all jobs in U_q succeed q. This implies that the problem decomposes in the form (D_q, q, U_q), where the subproblems corresponding to the sets D_q and U_q are solved independently. The jobs in U_q have a ready time of p(D_q) + p_q.

Proof. Consider the problem with modified due dates b_1, ..., b_n. For this problem it can be shown, via Theorem 2 of Emmons (1969), that condition (i) implies that RD_q is a predecessor set of q, while condition (ii) implies that LU_q is a successor set of q. Property 1 implies that LD_q is also a predecessor set of q and RU_q is also a successor set of q. Hence, D_q is a predecessor set and U_q is a successor set of job q for the problem with due dates b_1, ..., b_n, i.e., there exists a sequence S = (S_1, q, S_2) which is optimal with respect to the due dates b_1, ..., b_n such that S_1 is a permutation of D_q and S_2 is a permutation of U_q. Consider now the original problem with due dates d_1, ..., d_n. If we assign C_j = E_j, where the E_j's are the earliest times obtained at the end of the construction of the b-sequence, then the due dates b_j ∈ [d_j, max(d_j, C_j)]. If S* is an optimal sequence that obeys the precedence relations obtained so far, then C_j ≤ C_j(S*) for all j, so that the b_j lie in the intervals [min(d_j, C_j(S*)), max(d_j, C_j(S*))]. Theorem 1 of Lawler (1977) then implies that (S_1, q, S_2) is an optimal sequence with respect to the due dates d_1, ..., d_n. ■

Given the earliest times E_1, ..., E_n and the latest times L_1, ..., L_n during or at the end of the construction of the b-sequence, if E_j ≥ L_j for some j then the problem clearly decomposes in the form (D_j, j, U_j), which is the kind of block decomposition given by Szwarc and Mukhopadhyay (1996). In this case, it is straightforward to show that conditions (i) and (ii) of Theorem 2 are satisfied. However, Theorem 2 may yield a decomposition even if E_j < L_j for all j. Example 3 illustrates this.

Example 3. The data for this example are given in Table 2. Taking B_j = LD_j, A_j = RU_j and b_j = max(E_j, d_j), we have the results listed in Table 3. Even though E_j < L_j for all j, there is a decomposition at job 7, since RD_7 = ∅ and b_1 + p_1 = 99 + 1 > L_7 = 59, b_2 + p_2 = 193 + 3 > L_7, b_4 + p_4 = 143 + 9 > L_7, and b_6 + p_6 = 132 + 14 > L_7. Thus, the jobs in D_7 = {3, 5} and in U_7 = {1, 2, 4, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16} can be sequenced independently.
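Conditions (i) and (ii) are easy to scan for once the b values and the latest completion times are at hand. The sketch below is our illustration (names and data layout ours); when no L values are passed in, it falls back to L_j = p(J) − p(RU_j), as prescribed for the case where only LEFT-EXPAND has been run. On the data of Example 3 (Tables 2 and 3), job 7 is among the jobs this check accepts, matching the decomposition reported above.

```python
def exact_decomposition_jobs(b, p, L=None):
    """Jobs q that satisfy conditions (i) and (ii) of Theorem 2."""
    jobs = sorted(p)
    total = sum(p.values())
    if L is None:                      # default: L_j = p(J) - p(RU_j)
        L = {j: total - sum(p[i] for i in jobs if i > j and b[i] >= b[j])
             for j in jobs}
    qualifying = []
    for q in jobs:
        RD = [i for i in jobs if i > q and b[i] < b[q]]
        LU = [i for i in jobs if i < q and b[i] > b[q]]
        if all(p[q] + b[q] >= L[i] for i in RD) and \
           all(p[j] + b[j] >= L[q] for j in LU):
            qualifying.append(q)       # the problem splits as (D_q, q, U_q)
    return qualifying
```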

Next we give the key position decomposition theorem.

Theorem 3 (Key position decomposition). Let S_b = ([1], ..., [n]) be the b-sequence and assume job n is in position k in S_b. Let S_b(i) be the sequence obtained from S_b by moving job n from position k to position i, and let T(S_b(i)) be the total tardiness of S_b(i) computed with respect to the due dates b_1, ..., b_n. Define the key position index h to be the smallest index such that

    T(S_b(h)) = min_{k ≤ i ≤ n} T(S_b(i)).

If h < n, then there is a sequence S = (S_1, S_2) which is optimal for the problem with due dates b_1, ..., b_n as well as for the original problem, such that S_1 is a permutation of the jobs that occupy the first h positions in S_b, while S_2 is a permutation of the jobs that occupy the last n − h positions in S_b.

Table 2. The data for Example 3

j    p_j   d_j
1    1     99
2    3     193
3    4     26
4    9     143
5    13    45
6    14    132
7    15    52
8    15    215
9    16    208
10   17    94
11   17    95
12   17    116
13   17    152
14   17    197
15   18    61
16   18    173



Proof. The theorem is an equivalent statement of Theorem 5 of Chang et al. (1995) in which the d_i's are replaced by the b_i's and the EDD sequence is replaced by the b-sequence. Hence, the optimality of S = (S_1, S_2) for the problem with due dates b_1, ..., b_n is immediate from the theorem of Chang et al. (1995). The optimality of S = (S_1, S_2) for the original problem follows from Lawler's theorem, using again the fact that b_j ∈ [d_j, max(d_j, C_j)] ⊆ [min(d_j, C_j(S*)), max(d_j, C_j(S*))], where S* is an optimal sequence that obeys the obtained precedence relations. ■
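Since Theorem 3 only requires evaluating the total tardiness of the n − k + 1 trial sequences S_b(i) under the due dates b_1, ..., b_n, the key position index h can be found by direct enumeration. The sketch below is our illustration (names and data layout ours), not the authors' code:

```python
def key_position(seq, b, p):
    """Key position index h of Theorem 3: move the longest job from its
    position k to every later position i, evaluate total tardiness with the
    b values playing the role of due dates, and take the smallest minimizer."""
    def tardiness(order):
        t = tot = 0
        for j in order:
            t += p[j]
            tot += max(0, t - b[j])
        return tot
    n = len(seq)
    jn = max(seq, key=lambda j: (p[j], j))       # the longest job
    k = seq.index(jn) + 1                        # its position (1-based)
    best_h, best_T = None, None
    for i in range(k, n + 1):                    # positions k, k+1, ..., n
        order = [j for j in seq if j != jn]
        order.insert(i - 1, jn)                  # job n moved to position i
        Ti = tardiness(order)
        if best_T is None or Ti < best_T:        # strict <: smallest minimizer
            best_h, best_T = i, Ti
    return best_h                                # split after position h if h < n
```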

With the above theorem, whenever the key position index h < n, we can split the current job population into two subproblems, one consisting of the jobs in positions 1, ..., h of the b-sequence, the other consisting of those in positions h + 1, ..., n of the b-sequence. The ready time of the latter set is redefined to be the sum of the processing times of the first set. Note that checking for key position decomposition takes O(n^2) time.

Both Theorems 2 and 3 are based on the b-sequence. The two theorems differ in two respects:

1. The key position decomposition can be carried out only relative to the longest job, whereas the exact decomposition can be carried out relative to any qualifying job q.
2. Key position decomposition decomposes the problem into two sets without specifying the position of any of the jobs (including the longest job), whereas the exact decomposition specifies the position of the qualifying job q.

Next, we present the branch decomposition. In what follows, we say job j passes the b-test with strict inequality if b_j > p(D_j) and with equality if b_j = p(D_j); if D_j = ∅, take p(D_j) = 0. Theorem 4 below is an equivalent statement of the Potts and van Wassenhove (1987) decomposition in which the d_j's are replaced by the b_j's. Lawler's theorem implies that the resulting decomposition for the problem with due dates b_1, ..., b_n is also valid for the original problem.

Theorem 4 (Branch decomposition). Assume job n is in position k in the b-sequence. The problem decomposes with job n in position l for some l satisfying one of the following conditions:

(i) l = k and the (k+1)st job in the b-sequence passes the b-test with strict inequality.
(ii) l ∈ {k+1, ..., n−1}, the lth job either fails the b-test or passes with equality, and the (l+1)st job passes the test with strict inequality.
(iii) l = n and the nth job either fails the test or passes with equality.

The following example illustrates the branch decomposition.

Example 4. The data for this example are given in Table 4. The job with the largest processing time is job 16, and it is in the first position of the b-sequence, so k = 1. The b-sequence, together with the b_j values, the down-set totals p(D_j) of the jobs, and the passes and failures of the test, is given in Table 5. A "p" stands for a pass with strict inequality, an "=" sign stands for a pass with equality, and an "X" stands for a failure of the test.

Table 3. The results for Example 3

j    E_j   L_j   b_j
1    1     85    99
2    4     163   193
3    4     8     26
4    14    128   143
5    17    30    45
6    32    128   132
7    32    59    52
8    74    211   215
9    75    211   208
10   49    125   94
11   66    142   95
12   84    159   116
13   124   176   152
14   144   211   197
15   50    193   61
16   160   211   173

Table 4. The data for Example 4

j    p_j   b_j
1    1     456
2    14    432
3    15    246
4    15    385
5    15    490
6    23    325
7    26    519
8    35    418
9    41    390
10   42    440
11   43    417
12   49    475
13   59    506
14   66    432
15   77    292
16   85    241


The test results imply directly that the problem decomposes with job 16 in positions 1, 12, and 16.

Szwarc and Mukhopadhyay (1996) give an additional rule that eliminates some of the positions permitted by the decomposition of Potts and van Wassenhove (1987). This rule is adapted to the b-sequence as follows; the algorithm BETA uses this additional rule in the branch decomposition.

Rule 5c (Szwarc and Mukhopadhyay (1996)). Let the b-sequence be ([1], ..., [n]) and assume job n is in position k in the b-sequence. Then job n is not in position s, s > k, if there exists a job i, i ∈ {[k+1], ..., [s−1]}, such that p_[1] + ... + p_[k−1] + p_[k+1] + ... + p_[s] + p_n ≤ p_i + d_i.

Rule 5c additionally eliminates position 12 in Example 4, since 372 + 85 < p_14 + d_14 = 498.
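The candidate positions of Theorem 4, and the subsequent filtering by Rule 5c, can be enumerated directly from the b-sequence. The sketch below is our illustration (names and layout ours); it reads the due dates appearing in the adapted Rule 5c as the current b values, which is our interpretation. On the data of Example 4 (Table 4) it returns the candidates 1, 12 and 16 and then removes position 12, in line with the discussion above.

```python
def beta_test_status(seq, b, p):
    """Per-position b-test result along the b-sequence."""
    status, before = [], 0
    for j in seq:
        status.append('strict' if b[j] > before else
                      'equal' if b[j] == before else 'fail')
        before += p[j]
    return status

def branch_positions(seq, b, p):
    """Positions allowed for the longest job by Theorem 4, then Rule 5c."""
    n = len(seq)
    jn = max(seq, key=lambda j: (p[j], j))        # the longest job ("job n")
    k = seq.index(jn) + 1                          # its position in the b-sequence
    s = beta_test_status(seq, b, p)                # s[l-1] is the result at position l
    cand = []
    if k < n and s[k] == 'strict':                             # condition (i)
        cand.append(k)
    for l in range(k + 1, n):                                  # condition (ii)
        if s[l - 1] in ('fail', 'equal') and s[l] == 'strict':
            cand.append(l)
    if s[n - 1] in ('fail', 'equal'):                          # condition (iii)
        cand.append(n)
    kept = []
    for s0 in cand:                                            # Rule 5c filter
        if s0 == k:
            kept.append(s0)
            continue
        load = sum(p[seq[t]] for t in range(k - 1)) \
             + sum(p[seq[t]] for t in range(k, s0)) + p[jn]
        if any(load <= p[seq[t]] + b[seq[t]] for t in range(k, s0 - 1)):
            continue                                           # eliminated by Rule 5c
        kept.append(s0)
    return cand, kept
```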

6. The algorithm

The proposed algorithm, which we call BETA, is a depth-first branch and bound algorithm. The algorithm first checks whether the problem on hand is in the list of already solved subproblems (found-solved). If not, the b-sequence is constructed and the b-test applied. If the test is passed, then the b-sequence is optimal. Otherwise, the algorithm looks for a job that qualifies for exact decomposition. If such a job is found, the problem is decomposed and each subproblem is solved via BETA. Otherwise, the algorithm looks for a position h that qualifies for key position decomposition. If such a position is found, the decomposition is applied and the subproblems are solved via BETA. Else, the algorithm continues with branch decomposition. During the processing of the branches, BETA is called separately for each branch and two fathoming criteria are applied, one based on found-solved, the other based on a lower bound. An example that illustrates the algorithm is given in the Appendix.

Of the two fathoming criteria, found-solved first checks at each branch whether the subproblem under consideration has already been solved during the processing of some other branch. If the answer is yes, that solution is simply taken; otherwise the algorithm continues in the usual way. Found-solved significantly reduces the number of branches in most instances. As a result, the total CPU time decreases even though additional time is spent per branch for checking the presolved problems. The found-solved list contains the most recent k subproblems that have been solved during the algorithm, where k = 1000 in our implementation.

The second fathoming criterion is based on a lower bound. For the lower bound computation, we use the b-sequence and define a relaxed problem with new due dates d_j^new = max{b_j, p(D_j)}. With these enlarged due dates, the b-test necessarily passes and the resulting b-sequence is optimal for the modified problem. Since d_j^new ≥ b_j for all j, the total tardiness of the b-sequence with respect to the new due dates gives a lower bound. We proceed with the b-test only if the lower bound fails to fathom that node.
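Because the b-sequence and the p(D_j) prefix sums are already available, this bound costs a single pass over the sequence. The sketch below is our illustration of the computation described in the text (names and data layout ours):

```python
def beta_lower_bound(seq, b, p):
    """Fathoming bound: enlarge each due date to d_new_j = max(b_j, p(D_j))
    and evaluate the b-sequence against these due dates; per the text, the
    b-test then passes, so the value is a lower bound for the node."""
    t = bound = 0
    for j in seq:
        d_new = max(b[j], t)        # p(D_j) is the work scheduled before j
        t += p[j]
        bound += max(0, t - d_new)
    return bound
```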

A few words on the simplified bookkeeping and the computational efficiency of the algorithm are in order. The bookkeeping is greatly simplified by keeping an n-vector which concisely represents all precedence relations. This n-vector contains the job indices in the same order as they appear in the b-sequence (together with the associated records b_j, E_j, L_j, p_j, d_j for all j). From this n-vector, one can easily obtain the sets B_j, A_j and the associated earliest and latest times E_j, L_j. This results in O(n) space for keeping track of precedence relations, which leads to an O(n^2) space requirement for a depth-first branch and bound tree, whereas the existing methods accomplish the same thing in O(n^3) space. As for time requirements, the procedures LEFT and LEFT–RIGHT obtain the b-sequence (and hence the associated precedence relations) in O(n^2) time, whereas the existing methods require O(n^4) time (O(n^2) checking is required each time a new relation emerges) for constructing the precedence diagram.

With these considerations, the time and space requirements of the existing methods can easily become prohibitive for n > 200, whereas our way of handling the data allows us to solve much larger problems. The time bounds of the different components of the algorithm BETA are given in Table 6. Computational tests indicate that, on the average, about 80% of the total CPU time per branch is spent on the b-sequence computation, about 15% on exact and key position decomposition, and about 5% on branch decomposition, the lower bound, and found-solved.

Table 5. The results for Example 4

Job (b-sequence order)   b_j   p(D_j)   Test result
16                       241   0        p
3                        246   85       p
15                       292   100      p
6                        325   177      p
4                        385   200      p
9                        390   215      p
11                       417   256      p
8                        418   299      p
2                        432   334      p
14                       432   348      p
10                       440   414      p
1                        456   456      =
12                       475   457      p
5                        490   506      X
13                       506   521      X
7                        519   580      X


7. Computational results

The performance of the proposed algorithm is measured on a Sparc Station Classic with a 60 MHz microSPARC8 CPU and 384 MB of main memory. The code is written in the C language. The data generation scheme initially proposed by Fisher (1976), which has traditionally been used in the literature since then, is used to test the algorithm on different types of instances. These instances of varying degrees of difficulty are generated by means of two factors: the tardiness factor, T, and the range of due dates, R. For each problem, first the processing times are generated from a uniform distribution with parameters (1, 100). Then the due dates are computed from a uniform distribution which depends on p* = p(J) and on R and T. The due date distribution is uniform over [p*(1 − T − R/2), p*(1 − T + R/2)]. The values of T and R are selected from the sets {0.2, 0.4, 0.6, 0.8} and {0.2, 0.4, 0.6, 0.8, 1}, respectively. This gives 20 combinations of the R, T factors for each problem size. We refer to each (R, T) combination as a problem type. The number of jobs n is chosen from {100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500}. We solve 10 different instances for each setting of n, R, T, giving a total of 200 instances for each choice of n. The total number of random instances solved is 2800. The time limit to abandon a solution is 80 hours.
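For reference, the generation scheme just described can be reproduced with a few lines of Python. This is our sketch of the scheme as stated in the text (the rounding of due dates to integers is our assumption, and the function name is ours):

```python
import random

def generate_instance(n, R, T, seed=None):
    """Test-instance generator following the Fisher (1976) scheme as described
    in the text: integer processing times uniform on [1, 100] and due dates
    uniform on [p*(1 - T - R/2), p*(1 - T + R/2)], p* being the total load."""
    rng = random.Random(seed)
    p = {j: rng.randint(1, 100) for j in range(1, n + 1)}
    p_star = sum(p.values())
    lo, hi = p_star * (1 - T - R / 2), p_star * (1 - T + R / 2)
    d = {j: round(rng.uniform(lo, hi)) for j in range(1, n + 1)}
    return p, d

# e.g., one instance of the hardest reported class:
# p, d = generate_instance(300, R=0.2, T=0.6, seed=0)
```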

A distinctive feature of the proposed algorithm is that it is capable of handling very large populations of jobs, whereas this seems to be an impossibility for the existing algorithms in the literature for n > 200. Even though the hardest instances could not be solved to optimality for n = 500, we still have an 87.5% success rate for this size (i.e., 87.5% of the generated instances with n = 500 are solved to optimality).

The detailed results are given in Table 7. Each cell in that table represents the average CPU seconds over 10 instances for each (R, T) pair and n. Of the three versions (LEFT, LEFT–RIGHT, FULL) of the algorithm BETA, the reported numbers except those in parentheses are based on LEFT. The parenthetical numbers are the average CPU seconds for LEFT–RIGHT and are supplied only for those cells where LEFT–RIGHT performed considerably faster than LEFT.

Table 7. Average CPU seconds for the LEFT version of BETA (numbers in parentheses are the CPU seconds for LEFT–RIGHT)

R     T     n = 100    n = 150    n = 200    n = 300     n = 400               n = 500
0.2   0.2   0.03       0.13       0.70       5.75        31.40                 88.50
0.2   0.4   2.17       6.94       125.17     2638.50     29107.4               –
0.2   0.6   4.61       26.04      446.68     33430.90    –                     –
0.2   0.8   1.38       15.77      337.99     2269.20     107433.3 (67726.9)    –
0.4   0.2   0.00       0.00       0.00       0.00        0.00                  0.00
0.4   0.4   0.35       1.11       14.27      134.60      711.90                2685.80
0.4   0.6   1.96       6.54       92.53      1429.10     6602.90               74286.20 (67317.9)
0.4   0.8   0.08       0.27       0.57       15.20       17.00                 42.30
0.6   0.2   0.00       0.00       0.00       0.00        0.00                  0.00
0.6   0.4   0.14       0.19       0.60       19.90       24.40                 56.40
0.6   0.6   1.43       4.56       34.88      386.30      1562.80               7328.90
0.6   0.8   0.04       0.09       0.26       1.20        1.80                  3.90
0.8   0.2   0.00       0.00       0.00       0.00        0.00                  0.00
0.8   0.4   0.02       0.04       0.08       0.06        0.20                  0.50
0.8   0.6   0.46       0.80       0.96       4.10        22.90                 91.90
0.8   0.8   0.03       0.13       0.35       1.03        2.70                  5.00
1.0   0.2   0.00       0.00       0.00       0.00        0.00                  0.00
1.0   0.4   0.00       0.00       0.00       0.00        0.00                  0.00
1.0   0.6   0.11       0.30       0.76       2.05        7.10                  10.50
1.0   0.8   0.04       0.11       0.32       1.17        2.70                  5.20

Avg. CPU    0.65 sec   3.16 sec   52.8 sec   33.62 min   –                     –
Max. CPU    11.80 sec  66.72 sec  21.7 min   19 (17.1) h –                     –

Table 6. Time bounds of the phases of BETA

Phase                           Time bound
b-sequence construction         O(n^2) for LEFT or LEFT–RIGHT; O(n^4) for FULL
Optimality check (b-test)       O(n)
Exact decomposition             O(n^2)
Branch decomposition            O(n)
Key position decomposition      O(n^2)
Lower bound computation         O(n)
Checking for found-solved       O(n)


As remarked earlier, even though FULL gives the best overall performance in terms of the number of branches, its performance in terms of average CPU seconds is inferior because FULL spends considerably more time per branch to compute the b-sequence than either LEFT–RIGHT or LEFT. The number of branches generated by LEFT–RIGHT is generally quite close to the number generated by FULL, but LEFT–RIGHT spends much less time per branch than FULL. Hence, we recommend the use of LEFT–RIGHT for n = 500 for the four most difficult classes of instances, corresponding to the (R, T) pairs (0.2, 0.6), (0.2, 0.8), (0.2, 0.4), (0.4, 0.6), where the trade-off between the number of branches and the CPU time per branch works in favor of the reduced number of branches. The same recommendation is valid for the categories (0.2, 0.6) and (0.2, 0.8) for n = 400. In all the remaining cases (i.e., the 16 remaining categories for n = 500, the 18 categories for n = 400, and all 20 categories for n ≤ 300), we recommend the use of LEFT, as it performs considerably better than LEFT–RIGHT in terms of the average CPU time. Note, however, that the maximum CPU time of LEFT–RIGHT is considerably better for large n (n ≥ 300) than that of LEFT, e.g., 17.1 hours versus 19 hours for n = 300. The general pattern is that it becomes more advantageous to switch over from LEFT to LEFT–RIGHT as n increases, where the switch-over point is earlier for more difficult classes of instances.

As can be seen from Table 7, the average solution time for n = 100 is less than 1 second. The average solution times for n = 150, 200, and 300 are 3.16 seconds, 52.8 seconds, and 33.6 minutes, respectively (the averages for n = 400 and 500 are not available because not all instances are solved for those sizes). The worst encountered solution times are 11.8 seconds, 1.11 minutes, 21.7 minutes, and 17.1 hours for n = 100, 150, 200, 300, respectively (more than 80 hours for sizes 400 and 500). Thus, our algorithm handles problems of size 150 very successfully, with an average solution time of about 3 seconds and a maximum solution time of about 1 minute. Problems with 200 jobs are also successfully handled, with an average CPU time of less than 1 minute and a maximum CPU time of about 22 minutes.

Table 7 shows that the computation times are highly dependent on the R, T factors. Of the three hard problem types (0.2, 0.4), (0.2, 0.6), and (0.2, 0.8), the most difficult one is the pair (0.2, 0.6). For the blank cells in Table 7, the algorithm obtained suboptimal solutions for n = 400 and 500 whose average deviation from optimality is approximately 5%. For all other (R, T) pairs and for all values of n, the algorithm obtained the exact solution for all generated instances.

Observe that five out of the 20 (R, T) pairs have a solution time of zero for all n values. Hence, the optimum is immediately found in 25% of all instances. These are the problem types corresponding to (R, T) pairs with T = 0.2 and R = 0.4, 0.6, 0.8, 1, and T = 0.4 with R = 1. The optimum is immediately found for these problems because either the b-sequence for the initial job population happens to be the optimal one or a few branches lead to the optimum immediately.

Figure 3 shows the average solution times in CPU seconds as a function of n, ranging from 100 to 200. Observe that, up to n = 130, the solution times are less than 2 seconds. At n = 170 and 180, the solution times are slightly more than 10 seconds, and at n = 200, about 53 seconds. Hence, the slope increases rather slowly, showing an almost linear trend, from n = 100 to 180. After n = 180, exponential behavior begins to take over. Even though our algorithm uses reduced memory (O(n) versus O(n^2) in the existing algorithms), this exponential behavior is partly due to the hard disk and memory requirements. However, our computational experiments up to n = 500 indicate that the prohibitive nature of the exponential behavior makes itself felt only at much larger n (n > 300).

Statistics on the performance of the algorithm in terms of solution times, number of nodes, and depth of the tree are given in Table 8 for n ranging from 100 to 200 in increments of 10. A comparison in terms of the relative increase in CPU times indicates that the average solution time of the algorithm of Szwarc and Mukhopadhyay (1996) increases 23.69 times as n increases from 100 to 150, while the average CPU time of BETA increases 4.86 times over the same range. Potts and van Wassenhove (1982) do not report any computational times for this range of n, but their reported results indicate that the CPU time increases 35.61 times as n goes from 50 to 100.

Szwarc and Mukhopadhyay (1996) report that, for n = 150, the average number of branches and the average depth of the tree are 9005 and 10.7, respectively. For the algorithm BETA and for n = 150, the average number of nodes is 811 (634) and the average depth of the tree is 8.2 (7.6) if LEFT (LEFT–RIGHT) is used, which indicates a reduction of about 11 (14) times in the average number of branches and a reduction of about 2.5 (3) levels in the average depth of the tree, depending on which of LEFT or LEFT–RIGHT is used.

Fig. 3. Solution times of BETA.


Szwarc and Mukhopadhyay (1996) also report that, for n = 150, the maximum number of nodes and the maximum depth of the tree are 455 233 and 36, respectively. For BETA with n = 150, the maximum number of nodes is 15 178 and the maximum depth of the tree is 31 if LEFT is used; the corresponding numbers are 11 206 and 31 if LEFT–RIGHT is used. The significant difference in worst case performance may come from the use of different test sets. For this reason, we generated 50 additional random instances for the same n for the most difficult category (R, T) = (0.2, 0.6). This caused the maximum number of branches and the maximum depth of the tree to go up to 49 360 and 38, respectively. The main reason for the reduction in the number of branches is the combined use of various tools (the optimality check; exact, key position, and branch decomposition with Rule 5c; the b-bound; and found-solved), whose effects are magnified by the continuous reformation of the data at the nodes of the search tree. A particularly effective tool among these seems to be found-solved, as we observed in many runs that the same set of subproblems was encountered in different branches of the tree.

Based on these comparisons, it is reasonable to conclude that BETA performs considerably better than the previous state-of-the-art algorithms in terms of the average CPU times, the average rate of increase in CPU times, the average number of branches, and the average depth of the tree, as well as in the maximum number of nodes and the maximum depth of the tree.

8. Beta-based heuristics

In this section, we give two new constructive heuristics based on the algorithm BETA. The first heuristic, HEURBETA, first computes the b-sequence using LEFT–RIGHT. If the optimality check fails, it looks for exact or key position decomposition (Theorems 2 and 3). If no job qualifies for exact decomposition, it decomposes the problem by assigning the longest job to the first (smallest) position which qualifies for branch decomposition (Theorem 4). The resulting subproblems are handled in the same way. HEURBETA is an O(n^3) procedure since, in each pass, at least one job is fixed, followed by the b-sequence computation, which takes O(n^2). The second heuristic, INITBETA, simply computes the b-sequence using LEFT–RIGHT and takes it as the solution. INITBETA is an O(n^2) procedure.

In Table 9, we give a comparison of these heuristics, in terms of the average and maximum observed deviations from optimality, with three well-known heuristics: the PSK heuristic of Panwalkar et al. (1993), the ATC (Apparent Tardiness Cost) heuristic of Rachamadugu and Morton (1981), and the MDD (Modified Due Date) heuristic of Baker and Bertrand (1982).

Table 8. Average and worst case statistics for n values ranging from 100 to 200

      Solution time (LEFT)   Nodes (LEFT)      Depth (LEFT)   Nodes (LEFT–RIGHT)   Depth (LEFT–RIGHT)
n     Avg      Max           Avg      Max      Avg    Max     Avg      Max          Avg    Max
100   0.65     11.80         257      4281     6.4    23      229      4001         6.02   22
110   0.76     11.48         332.3    4540     6.5    25      294      3773         6.3    25
120   1.29     19.78         438.4    5915     6.97   25      369.7    4972         6.6    26
130   1.72     32.83         539.9    8560     7.4    27      462.7    6559         7.01   27
140   2.51     40.60         727.5    13 327   7.7    28      588.9    7015         7.3    28
150   3.16     66.72         810.8    15 178   8.2    31      634      11 206       7.60   31
160   5.05     65.65         1092.3   13 105   8.98   29      873.3    7601         8.3    29
170   10.39    252.27        1966.7   37 491   9.3    33      1382.8   21 498       8.69   33
180   10.13    188.50        1971.8   29 236   9.7    35      1547.8   19 954       9.0    35
190   34.10    668.02        4267.2   74 246   10.5   37      2896     40 707       9.9    37
200   52.80    1301.82       6173.9   105 904  11.5   42      4019     56 139       10.7   42

Table 9. Average (maximum) percent deviations of heuristics from optimality

Heuristic   n = 100      n = 200      n = 300     n = 400      n = 500      Overall
HEURBETA    1.6 (11)     1.6 (11)     1.3 (12)    1.2 (12)     0.9 (5)      1.32 (12)
PSK         7.7 (72)     6.1 (∞)      3.3 (16)    5.8 (222)    9.8 (585)    6.54 (∞)
INITBETA    22.4 (88)    21 (86)      19.1 (92)   20.6 (104)   17.4 (93)    20.1 (104)
ATC         45.7 (110)   46.9 (107)   47.3 (96)   47.6 (104)   41.2 (100)   45.7 (110)
MDD         47.8 (110)   48.7 (110)   49.3 (99)   49.2 (104)   42.5 (104)   47.5 (110)


The statistics are based on the same set of 970 test problems that are solved to optimality by the algorithm BETA for n = 100, 200, 300, 400, 500. Even though there are various other heuristics in the single machine literature (Wilkerson and Irwin, 1971; Fry et al., 1989; Holsenback and Russell, 1992), we choose the above three heuristics for comparison because PSK is the most successful one reported in the literature for 1 || ∑ T_i, while MDD and ATC are the two most frequently used heuristics in the general scheduling literature.

In Table 9, the deviations are computed via the formula

    ((Z_H − Z) / Z) × 100%,

where Z_H and Z are the total tardiness values of the heuristic and the optimal solutions, respectively. In terms of the average deviation from optimality, HEURBETA shows the best performance with an average deviation of about 1.3%, while the closest competitor, PSK, yields an average deviation of about 6.5%. The superiority of HEURBETA is much more pronounced in terms of maximum deviations. The maximum observed deviation of HEURBETA from optimality is 12%, whereas PSK yields a maximum deviation of 585% (barring one instance which yielded ∞ due to the optimal objective value being zero). INITBETA is more primitive than HEURBETA, but it performs reasonably well with an average deviation of about 20%, which is poorer than PSK but significantly better than ATC and MDD, whose average deviations are close to 50%. In terms of maximum deviations, INITBETA, ATC and MDD perform nearly the same, with a maximum deviation of about 100%, which is significantly better than that of PSK.

In terms of CPU seconds, the average running time of INITBETA is essentially the same as that of the other heuristics, while the average running time of HEURBETA is about five or six times the others. Since all of these heuristics run in the order of seconds (e.g., a few seconds for n = 500 for HEURBETA), the difference in average CPU seconds seems insignificant relative to the substantial improvement obtained by HEURBETA in terms of the average and maximum deviations from optimality. We note also that all of these heuristics require O(n) storage. The remarkable success of HEURBETA in terms of the maximum deviation is particularly noteworthy. This heuristic provides a new practical tool for solving very large problems in a few seconds with a maximum expected deviation of only 12%.

9. Concluding remarks

This paper presents a new algorithm for 1 || ∑ T_i. The proposed algorithm is based on an integration and enhancement of the existing theory. Its main components consist of a simple optimality check, new decomposition theory, a new lower bound, a check for presolved subproblems, and modified versions of existing theorems that are enhanced through the use of an equivalence concept which permits a repeated modification of the data. The use of these techniques provides substantial savings in bookkeeping and solution time. The combined effect is the ability to solve much larger problems (e.g., n = 500) than previously possible in the literature. Extensive computational tests with the new exact algorithm BETA and the heuristic algorithm HEURBETA indicate a significant performance superiority over the existing exact and heuristic algorithms.

References

Baker, K.R. and Bertrand, J.W.M. (1982) A dynamic priority rule for scheduling against due-dates. Journal of Operations Management, 3, 37–42.

Chang, S., Lu, Q., Tang, G. and Yu, W. (1995) On decomposition of the total tardiness problem. Operations Research Letters, 17, 221–229.

Du, J. and Leung, J.Y.-T. (1990) Minimizing total tardiness on one machine is NP-hard. Mathematics of Operations Research, 15, 483–495.

Emmons, H. (1969) One machine sequencing to minimize certain functions of job tardiness. Operations Research, 17, 701–715.

Fisher, M.L. (1976) A dual algorithm for the one machine scheduling problem. Mathematical Programming, 11, 229–251.

Fry, T.D., Vicens, L., Macleod, K. and Fernandez, S. (1989) A heuristic solution procedure to minimize T̄ on a single machine. Journal of the Operational Research Society, 40, 293–297.

Holsenback, J.E. and Russell, R.M. (1992) A heuristic algorithm for sequencing on one machine to minimize total tardiness. Journal of the Operational Research Society, 43, 53–62.

Lawler, E.L. (1977) A 'pseudo polynomial' algorithm for sequencing jobs to minimize total tardiness. Annals of Discrete Mathematics, 1, 331–342.

Lenstra, J.K., Rinnooy Kan, A.H.G. and Brucker, P. (1977) Complexity of machine scheduling problems. Annals of Discrete Mathematics, 1, 343–362.

McNaughton, R. (1959) Scheduling with deadlines and loss functions. Management Science, 6, 1–12.

Panwalkar, S.S., Smith, M.L. and Koulamas, C.P. (1993) A heuristic for the single machine tardiness problem. European Journal of Operational Research, 70, 304–310.

Picard, J.C. and Queyranne, M. (1978) The time dependent traveling salesman problem and its application to the tardiness problem in one machine scheduling. Operations Research, 26, 86–110.

Potts, C.N. and van Wassenhove, L.N. (1982) A decomposition algorithm for the single machine total tardiness problem. Operations Research Letters, 1, 177–181.

Potts, C.N. and van Wassenhove, L.N. (1985) A branch and bound algorithm for the total weighted tardiness problem. Operations Research, 33, 363–377.

Potts, C.N. and van Wassenhove, L.N. (1987) Dynamic programming and decomposition approaches for the single machine total tardiness problem. European Journal of Operational Research, 32, 405–414.

Rachamadugu, R.V. and Morton, T.E. (1981) Myopic heuristics for the single machine weighted tardiness problem. Working Paper 28-81-82, Graduate School of Industrial Administration, Carnegie Mellon University, Pittsburgh, PA.


Rinnooy Kan, A.H.G., Lageweg, B.J. and Lenstra, J.K. (1975) Minimizing total costs in one machine scheduling. Operations Research, 23, 908–927.

Schrage, L. and Baker, K.R. (1978) Dynamic programming solution of sequencing problems with precedence constraints. Operations Research, 26, 444–449.

Schwimer, J. (1972) On the n job one machine sequence independent scheduling with tardiness penalties: a branch and bound solution. Management Science, 18, 301–313.

Sen, T.T., Austin, L.M. and Ghandforoush, P. (1983) An algorithm for the single machine sequencing problem to minimize total tardiness. IIE Transactions, 15, 363–366.

Sen, T.T. and Borah, B.N. (1991) On the single machine scheduling problem with tardiness penalties. Journal of the Operational Research Society, 42, 695–702.

Srinivasan, V. (1971) A hybrid algorithm for the one machine sequencing problem to minimize total tardiness. Naval Research Logistics Quarterly, 18, 317–327.

Szwarc, W. and Mukhopadhyay, S.K. (1996) Decomposition of the single machine total tardiness problem. Operations Research Letters, 19, 243–250.

Tansel, B.C. and Sabuncuoglu, I. (1997) New insights on the single machine total tardiness problem. Journal of the Operational Research Society, 48, 82–89.

Wilkerson, J.I. and Irwin, J.D. (1971) An improved method for scheduling independent tasks. AIIE Transactions, 3, 239–245.

Appendix

The example below illustrates the steps of the algorithm BETA. Suppose we have 10 jobs to schedule, corresponding to the data plotted in Fig. A1. According to the steps of the algorithm, we first apply LEFT to obtain the b-sequence (5, 1, 7, 8, 3, 4, 2, 6, 9, 10) shown in Fig. A2. Jobs 5, 1, 7, 8, 3, 4 pass the b-test, but job 2 fails. Since at least one job fails, we cannot conclude that the b-sequence is optimal. We then consider exact decomposition and try to decompose if possible. From Fig. A1 we see that RD_7 = ∅, so the first condition of the exact decomposition theorem is satisfied. Since LU_7 = {2, 3, 4, 6}, the second condition to check is the inequality of the theorem relating p_i and b_i to L_7 for every i ∈ LU_7, and we find that it holds for all jobs in LU_7. Thus the problem decomposes with job 7 at position 3, forming two subproblems: one with only the two jobs {1, 5} and the other with the seven jobs {2, 3, 4, 6, 8, 9, 10}.

Fig. A1. Data plot of the example.

Fig. A2. Graph of b-values.

For the two-job case we can easily find the optimum sequence; it happens to be (5, 1). For the second subproblem we apply the algorithm once more. The b-sequence constructed for this subproblem is (8, 3, 4, 2, 6, 9, 10). We try the b-test first; it fails, so we try exact decomposition. Note that the right-down and left-up sets of job 9 are empty. Hence, the problem decomposes with job 9 at position 6, with job 10 in the up-set and the other jobs in the down-set. After the decomposition with job 9, the down-set {2, 3, 4, 6, 8} is solved next; it decomposes with job 6 in the last position. At this stage, jobs 7, 9 and 6 are already in fixed positions (these are the jobs that qualified for exact decomposition).
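Small subproblems such as the two-job set {1, 5} are solved directly in the paper; purely as an illustration (not the authors' code), the Python sketch below enumerates all orders of a small job set and keeps the one with minimum total tardiness. The processing times and due dates are hypothetical placeholders, since the actual data appear only in Fig. A1.

# Illustrative sketch with hypothetical data: brute-force solution of a small
# subproblem such as the two-job set {1, 5}.
from itertools import permutations

def total_tardiness(seq, p, d, start=0):
    # Jobs run back to back from time `start`; tardiness is max(0, completion - due date).
    t, total = start, 0
    for j in seq:
        t += p[j]
        total += max(0, t - d[j])
    return total

def solve_small_subproblem(jobs, p, d, start=0):
    # Feasible only for small job sets, since all |jobs|! orders are tried.
    return min(permutations(jobs), key=lambda s: total_tardiness(s, p, d, start))

p = {1: 4, 5: 3}   # hypothetical processing times
d = {1: 6, 5: 5}   # hypothetical due dates
print(solve_small_subproblem([1, 5], p, d))   # with this data: (5, 1)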

With the fixed jobs 7, 6 and 9 written outside the brackets, we have the following partial sequence, where jobs in brackets define independent subproblems:

{5 1} 7 {2, 3, 4, 8} 6 9 {10}.

The only non-trivial subproblem is the one corresponding to the job population {2, 3, 4, 8} in the middle. Thus the problem is now to schedule the four jobs {2, 3, 4, 8}. For these jobs, the b-sequence is (8, 3, 4, 2), and the b-test, the exact decomposition, and the key position decomposition all fail. Hence, we apply branch decomposition using the longest job (job 8). According to Theorem 4 and Rule 5c, we find two alternative decompositions for job 8: the position it occupies in the b-sequence (from (i) of Theorem 4, since the second job in the b-sequence gets a strict pass in the b-test) and the last position (from (iii) of Theorem 4, since the last job fails the b-test). So we continue with two branches. In the first branch we fix job 8 at the first position and then schedule the set {2, 3, 4}; in this branch the b-sequence is (3, 2, 4), and it happens to be the optimum sequence for this job population. In the second branch, job 8 is placed in the last position and the other jobs 2, 3, 4 are scheduled before it. The second branch yields an optimum for its subproblem which is no better than that of the first branch. Hence, the final optimum sequence is (5, 1, 7, 8, 3, 2, 4, 6, 9, 10).
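To make the final assembly concrete, the sketch below (again illustrative, with hypothetical processing times and due dates standing in for the data of Fig. A1) concatenates the fixed jobs and the solved subproblems in the order given by the decomposition and evaluates the total tardiness of the resulting sequence.

# Illustrative sketch with hypothetical data: assembling the final schedule
# from the decomposition {5 1} 7 {2, 3, 4, 8} 6 9 {10}.
def total_tardiness(seq, p, d):
    t, total = 0, 0
    for j in seq:
        t += p[j]
        total += max(0, t - d[j])
    return total

# Hypothetical processing times and due dates for the ten jobs (Fig. A1's
# actual values are not reproduced here).
p = {1: 4, 2: 6, 3: 2, 4: 5, 5: 3, 6: 7, 7: 1, 8: 8, 9: 4, 10: 2}
d = {1: 6, 2: 30, 3: 20, 4: 28, 5: 5, 6: 38, 7: 8, 8: 24, 9: 40, 10: 44}

# Subproblem optima found in the example, concatenated around the fixed jobs 7, 6, 9.
final_sequence = [5, 1] + [7] + [8, 3, 2, 4] + [6] + [9] + [10]
print(final_sequence, total_tardiness(final_sequence, p, d))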

Biographies

Barbaros C. Tansel is the Chairman of the Department of Industrial Engineering at Bilkent University, Ankara, Turkey. Dr. Tansel earned his Ph.D. degree in 1979 from the University of Florida, Department of Industrial and Systems Engineering. Prior to his appointment at Bilkent University in 1991, he was a visiting faculty member at the Georgia Institute of Technology and a faculty member at the University of Southern California. Dr. Tansel's primary research interests are in location theory, combinatorial optimization, and optimization with imprecise data. He has published in various journals including Operations Research, Management Science, Transportation Science, European Journal of Operational Research, Journal of the Operational Research Society, International Journal of Production Research, and Journal of Manufacturing Systems.

Bahar Y. Kara received her Ph.D. degree in 1999 from Bilkent University, Department of Industrial Engineering. Since then, Dr. Kara has worked as a Post-Doctoral Research Fellow at McGill University, Faculty of Management. Her primary research interests are in hub location, hazardous materials transportation, discrete optimization, and bilevel programming.

Ihsan Sabuncuoglu is an Associate Professor of Industrial Engineering at Bilkent University. He received B.S. and M.S. degrees in Industrial Engineering from the Middle East Technical University and a Ph.D. degree in Industrial Engineering from Wichita State University. Dr. Sabuncuoglu teaches and conducts research in the areas of scheduling, production management, simulation, and manufacturing systems. He has published papers in IIE Transactions, Decision Sciences, Simulation, International Journal of Production Research, International Journal of Flexible Manufacturing Systems, International Journal of Computer Integrated Manufacturing, European Journal of Operational Research, Production Planning and Control, Journal of the Operational Research Society, Computers and Operations Research, Computers and Industrial Engineering, OMEGA, Journal of Intelligent Manufacturing, and International Journal of Production Economics. He is on the editorial boards of the International Journal of Operations and Quantitative Management and the Journal of Operations Management. He is an associate member of the Institute of Industrial Engineering, the Institute of Simulation, and the Institute of Operations Research and Management Science.

Contributed by the Scheduling Department
