Multi-Project Scheduling using Competent Genetic Algorithms
Ali A. Yassine+ Christoph Meier Tyson R. Browning Department of Industrial &
Enterprise Systems Engineering University of Illinois at Urbana
Urbana, IL 61801 USA [email protected]
Institute of Astronautics Technische Universität München
85748 Garching, Germany [email protected]
Neeley School of Business Texas Christian University
Fort Worth, TX 76129 USA [email protected]
University of Illinois
Department of Industrial & Enterprise Systems Engineering (IESE)
Working Paper
Current version: Feb. 1, 2007
+ Corresponding author. The second author is grateful for support from the Bavarian Science Foundation.
Nomenclature (Project Scheduling)
m: Number of projects
n: Total number of activities
nh: Number of activities in project h
h: Index of project
Vh: Set of activities {a(1)…a(nh)} in project h
di: Processing time for activity i
i→j: Predecessor relationship
Pred(i): Set of predecessors for activity i
Rk: Set of renewable resources of type k
rihk: Per-period usage by activity i of resource k in project h
Kh: Number of types of resources used by project h
θ: Set of feasible schedules
θT: Set of precedent-feasible schedules
θR: Set of resource-feasible schedules
Cmax: Vector of task completion times
ARLF: Average Resource Loading Factor
AUF: Average Utilization Factor
CPh: Non-resource-constrained critical path duration of project h
Xiht: Boolean variable, true (equal to 1) if activity i of project h is active at time t
Ziht: Equal to -1 if activity i of project h is active at time t ≤ CPh/2; otherwise equal to 1
S: Number of time intervals spanning a problem

Nomenclature (Genetic Algorithm)
SGA: Simple Genetic Algorithm
CGA: Competent Genetic Algorithm
BB: Building block
c: Constant factor
kBB: Order or size of BBs
l: Current chromosome/problem length
mBB: Number of BBs
npop: Population size
pcut: Cut probability
pκ: Bitwise cut probability
pm: Mutation probability
ps: Splice probability
s: Size of tournament used in the Tournament Selection phase
μ: Calibration coefficient
Φp: Phenotypic search space
Φg: Genotypic search space
ABSTRACT
In a multi-project environment, many projects must be completed using a common pool of
scarce resources. In addition to the resource constraints, precedence relationships exist among the
activities of individual projects. This project scheduling problem is NP-hard, and most practical solutions that can
handle large problem instances rely on priority-rule heuristics and meta-heuristics rather than optimal
solution procedures. In this paper, a Competent Genetic Algorithm (CGA), hybridized with a local search
strategy, is proposed to minimize the overall duration or makespan of the resource constrained multi-
project scheduling problem (RCMPSP) without violating inter-project resource constraints or intra-
project precedence constraints. The proposed Genetic Algorithm (GA) with several varied parameters is
tested on sample scheduling problems generated according to two popular multi-project summary
measures, Average Utilization Factor (AUF) and Average Resource Load Factor (ARLF). The
superiority of the proposed CGA over simple GAs and well-known heuristics is demonstrated.
Keywords: Multi-Project Scheduling, Resource Constraints, Competent Genetic Algorithm, Heuristic
Priority Rules
1. INTRODUCTION
Due to increasingly impatient customers and competitive threats, improvements in the efficiency
with which projects are completed and new products are brought to market have become increasingly
important. To complicate matters further, many organizations face the challenge of managing
the simultaneous execution of a portfolio of projects under tight time and resource
constraints. In such an environment, project management and scheduling skills become critical to
the organization. “Multi-project environments seem to be quite common in project scheduling practice….
It has been suggested [65,75] that up to 90%, by value, of all projects are carried out in the multi-project
context, and thus the impact of even a small improvement in their management on the project
management field could be enormous” [34].
In this paper, we address the case of a portfolio of simultaneous projects with identical start times.
Each project consists of precedence-constrained activities that draw from common pools of resources,
which are usually not large enough for all of the activities to work concurrently. In such cases, which
activities should get priority? The goal is to prioritize them so as to optimize an objective function, such
as minimizing the delay of each project or of the whole portfolio. Such is the basic resource-constrained
multi-project scheduling problem (RCMPSP).
In a RCMPSP environment, a company has m concurrent projects P1…Pm, each comprised of a set
of activities Vh = {a(1)…a(nh)}, where nh specifies the total number of activities in project Ph. In addition,
any activity i has several associated attributes, such as its duration di and the types and amounts of
resources required. Each project will have a corresponding precedence network whose structure is often
depicted by an activity-on-node network. The predecessor relationship between two activities i and j is
denoted by i→j or (i,j), and the entire set of predecessors of activity j is denoted by the term Pred(j).
Although projects may be unrelated by precedence constraints, they depend on a common pool of
resources and are therefore related by resource constraints. We consider a set of renewable resources,
where the per-period usage by activity i of resource k in project h is written as rihk. Rk denotes
the constant amount of resource k available during every time period. We assume that a resource must be
devoted to an activity until it is completed before beginning another activity (i.e., no preemption is
allowed). Moreover, we assume the single-mode case, where a single resource type is assigned for
performing a particular activity, and the processing time di and the resources required rihk for any
activity i are fixed. When considering the set of feasible project schedules θ = θT ∩ θR, where θT denotes
the set of precedent-feasible schedules and θR denotes the set of resource-feasible schedules, there exist
many possible θs and many potential objectives for choosing between them. If n defines the total number
of activities in all m projects, popular objectives include minimizing the maximum project makespan
(i.e., minimizing Cmax = max{d1…dn}), maximizing the net present value (NPV), maximizing resource
leveling, or minimizing project costs [8,46].
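As a concrete illustration of these definitions, the non-resource-constrained critical path duration CPh of a project is the longest path through its activity-on-node network. The following sketch is our own illustration, not from the paper; the function name and data layout are hypothetical:

```python
# Compute CPh: the longest precedence-constrained path through a project,
# ignoring resource limits entirely.

def critical_path_length(durations, preds):
    """durations: {activity: di}; preds: {activity: set of predecessors}.
    Returns the length of the longest path through the network."""
    finish = {}

    def finish_time(i):
        # Earliest finish of i = di + latest finish among its predecessors.
        if i not in finish:
            finish[i] = durations[i] + max(
                (finish_time(p) for p in preds.get(i, ())), default=0)
        return finish[i]

    return max(finish_time(i) for i in durations)

# Toy project: activities 1 and 2 both precede activity 3.
cp = critical_path_length({1: 2, 2: 4, 3: 3}, {3: {1, 2}})  # cp == 7
```

With resource constraints added, schedules can only be as short as or longer than this bound, which is what the lateness measures below compare against.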
This paper utilizes a new genetic algorithm (GA) approach to solve the RCMPSP; however, in
contrast to previous research, we use a state-of-the-art competent GA (CGA) design. This design differs
from simple GA (SGA) strategies as it strives to identify highly fit hyperplanes in the search space, called
building blocks (BBs), which are subsequently combined in a sophisticated manner. The resulting project
schedules are expected to be superior compared to those produced by priority rule heuristics or SGAs. To
cope with the precedence and resource constraints in the RCMPSP, we introduce an efficient repair
mechanism that ensures feasibility throughout the search. We further enhance the performance of the
CGA with a local search strategy tailored for the RCMPSP. The performance of this approach is
thoroughly tested on 77 test problems, carefully constructed according to two popular multi-project
summary measures, the Average Resource Load Factor (ARLF) and the Average Utilization Factor
(AUF), proposed by Kurtulus and Davis [48]. We find that, in combination with the proposed CGA, the
parallel schedule generation scheme (SGS) outperforms the self-adapting SGS.
Furthermore, the proposed CGA outperforms many well-known priority-rule-based heuristics in 78% of
the problems.
The paper proceeds as follows. Section 2 provides background on activity scheduling and SGAs,
after which §3 presents the design and benefits of a CGA for combinatorial problems and §4 describes its
tailoring to RCMPSPs. §5 evaluates the performance of the proposed CGA compared with a SGA on the
RCMPSP test bank. In §6, we present comparative results of the CGA with 20 popular heuristic priority
rules used in the literature. The paper concludes in §7 with a brief summary of the work completed and
possible extensions for future work.
2. BACKGROUND
Project scheduling is of great practical importance and its general model can be used for applications
in product development, as well as production planning and a wide variety of scheduling applications.
Early efforts in project scheduling focused on minimizing the overall project duration (makespan)
assuming unlimited resources. Well-known techniques include the Critical Path Method (CPM) [40] and
the Project Evaluation and Review Technique (PERT) [56]. Scheduling problems have been studied
extensively for many years by attempting to determine exact solutions using methods from the field of
operations research [46].
It has been shown that the scheduling problem subject to precedence and resource constraints is
NP-hard [49], which means that exact methods are too time-consuming and inefficient for solving the
large problems found in real-world applications. There exist benchmark instances with as few as 60
activities that have not been solved to optimality [31]. Kolisch [46] surveyed a number of techniques
developed for resource-constrained project scheduling, including dynamic programming, zero-one
programming, and implicit enumeration with branch and bound. Some examples of exact solution
methods can be seen in [8,15,16,72]. Among them, the branch and bound approach is the most widely
applied. However, its depth-first or breadth-first searches cannot exhaustively explore a large-scale
project scheduling problem. Simulation modeling provides another angle on the RCMPSP: a
simulation model has been proposed for multi-project resource allocation that interprets it as multi-channel queuing
[20]. The inherent drawbacks of simulation are its time and cost, as well as its reliance on a particular
simulation language, which can hinder its dissemination. Finally, many different heuristic approaches have been
developed to solve intractable problems quickly, efficiently, and fairly satisfactorily. A survey of
heuristic approaches can be found in [8,45].
GAs, first proposed in [35], are adaptation procedures based on the mechanics of genetics and
natural selection. These algorithms are designed to act as problem-independent algorithms, which is at
times contrary to the field of Operations Research, where algorithms are often matched to problems. In
brief, a simple GA works as follows. A feasible instance of the underlying problem is encoded as a so-
called chromosome, while multiple chromosomes form a GA population. By selecting the fittest
chromosomes (i.e., the ones with the highest value according to an objective function) and applying the
Figure 1: Simple GA Flowchart
ordinary genetic operators—selection, crossover, and mutation—the population is expected to improve
over time. The GA proceeds until a predefined convergence criterion is reached. Figure 1 depicts this
simple circular flow.
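The loop just described can be sketched in a few lines. This is a minimal, illustrative SGA, our own sketch rather than any published implementation; the function name and parameter values are assumptions, and the convergence criterion is simplified to a fixed generation count. It is shown here on the one-max problem (maximize the number of ones in a bit string):

```python
import random

def simple_ga(fitness, length, pop_size=30, p_mut=0.05, generations=50, seed=0):
    """Minimal SGA: tournament selection, one-point crossover, bitwise
    mutation, repeated for a fixed number of generations. pop_size must be even."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter of two randomly drawn chromosomes.
        parents = [max(rng.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        nxt = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = rng.randrange(1, length)            # one-point crossover
            nxt += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        for chrom in nxt:                             # bitwise mutation
            for i in range(length):
                if rng.random() < p_mut:
                    chrom[i] ^= 1
        pop = nxt
    return max(pop, key=fitness)

best = simple_ga(sum, 20)   # one-max: fitness is the number of ones
```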
In terms of the scheduling problem, GAs were first used by Davis [13]. Since then, a vast literature
on the application of GAs to various scheduling problems has emerged [10,17,19,37,67,78]. With a
special focus on a single-project problem1, Hartmann developed a GA with permutation-based encoding
[31] and introduced a self-adapting representation scheme which automatically determines the best
schedule decoding procedure [32]. Gonçalves et al. [27] used a SGA approach for the RCMPSP based on
a random key chromosome encoding and a schedule generation procedure which creates so-called
parameterized active schedules. Recently, Valls et al. [79] proposed a hybrid GA tailored to the RCPSP
with specific crossover and local search operators. Despite slight modifications of the GA flow
illustrated in Figure 1, past and recent publications on the application of GAs to the RCMPSP/RCPSP
nevertheless have the SGA design in common. We will explain in the subsequent sections why a SGA
design, even with improvements through local search operators or specific crossover and mutation
operators, is eventually inferior to a state-of-the-art CGA design.
1 This problem is known as the resource constrained project scheduling problem (RCPSP).
3. DESIGN AND IMPLEMENTATION OF A COMPETENT GA
Past research on GA applications to scheduling problems is characterized by using a SGA design as
illustrated in Figure 1. This design can be enhanced by several techniques. Some of these improvements
include niching [55], parallelization [9], or the hybridization of the GA, which is oriented more towards
global searches, with an efficient local search strategy [60]. Despite all of the well-known enhancement
techniques, a SGA consistently fails to provide adequate solutions as problem difficulty increases [23].
A mathematical explanation for this statement is the dimensional analysis of the building block (BB)
mixing process in SGAs [74]. BBs are useful hyperplanes in the search space, called schemata, which
can be understood as portions of a solution contributing much to the global optimum [22]. Goldberg [22]
claims that the constant juxtaposition and selection of BBs forms better and better solutions over time,
leading to a global optimum in a search space. In a study of the BB mixing process, Thierens [74]
pointed out the exponential scale-up in computational expense with linearly increasing problem
difficulty. He defined the following equation for the population size necessary for a successful GA:
ln(npop) + ln(ln(npop)) + μ > (kBB · mBB · ln 2) / c + 2.5 · ln(2 · c · s · p · m)   (1)
where kBB denotes the BB order/size, mBB corresponds to the number of BBs, npop is the population size, μ
can be regarded as a calibration coefficient, and c is a constant. Since problem difficulty can be expressed
in terms of the length and order of the BBs, equation (1) demonstrates its negative effects in terms of
computational resources.
To tackle the mixing problem in a SGA, several so-called CGAs were developed. Three different
approaches to CGAs can be distinguished: (a) Perturbation techniques, (b) Linkage adaptation
techniques, and (c) Probabilistic model-based techniques. Examples of the first approach include the fast
messy GA (FMGA) [26], the ordering messy GA (OmeGA) [41], and the gene expression messy GA
(GEMGA) [39]. An example of the second approach is the linkage learning GA, introduced by Harik
[28]. The third approach includes the compact genetic algorithm [29] and the Bayesian optimization
algorithm (BOA) [66]. At present, OmeGA is the only CGA constructed for combinatorial problems like
scheduling or the quadratic assignment problem. Essentially, OmeGA combines the FMGA with random
keys to represent permutations. Empirical tests by Knjazew [41] on artificial test functions show a
promising sub-quadratic scale-up behavior of resources with problem length, O(l^1.4), where l = kBB·mBB.
Therefore, in the remainder of this section, we present the OmeGA and its application to the RCMPSP.
3.1. Components of the OmeGA
3.1.1. Data Structure
Using any GA as an optimization technique requires the existence of a proper data structure for
manipulation. Each instance of this structure represents one point in the space of all possible solutions. In
the context of GAs, this data structure is usually called a chromosome, which is a juxtaposition of genes.
Genes occur at different locations or loci of the chromosome and have values which are called alleles.
While the term genotype refers to the specific genetic makeup of an individual in natural systems and
corresponds to the encoded structure in a GA, the term phenotype corresponds to the decoded structure,
which can be regarded as one point in the search space. To facilitate linkage learning by permitting genes
to move around the genotype, the OmeGA uses a different representation technique than ordinary GAs:
each gene is tagged with its location via the pair representation <locus, allele>. For example, the two
messy chromosomes shown in Figure 2 both represent the permutation (1-34-89-15-13-19). Such
permutations constitute a schedule priority list for the RCPSP.
Figure 2: Illustration of a Messy Chromosome
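Because each gene carries its own locus, decoding a fully specified messy chromosome is simply a sort on the locus tags. A minimal sketch (the helper name is our own), using the two equivalent chromosomes of Figure 2:

```python
# Sorting (locus, allele) pairs by locus recovers the permutation regardless
# of the order in which the genes appear on the chromosome.

def decode_messy(genes):
    return [allele for locus, allele in sorted(genes)]

# The two equivalent chromosomes of Figure 2:
a = [(5, 13), (1, 1), (4, 15), (2, 34), (6, 19), (3, 89)]
b = [(3, 89), (1, 1), (6, 19), (5, 13), (4, 15), (2, 34)]
# Both decode to the permutation (1-34-89-15-13-19).
```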
Messy chromosomes may also have a variable length; they can be overspecified or underspecified.
As an example, consider the chromosomes in Figure 3. The problem length is 6 in this example, so the
first chromosome is overspecified since it contains an extra gene. The second chromosome is
underspecified because it contains only three genes. To handle overspecification, Goldberg [26] proposes
a gene expression operator that employs a first-come-first-served rule on a left-to-right scan. In the
example of Figure 3, the gene assigned to locus 1 occurs twice in chromosome A. Thus, the left-to-right-
scan drops the second instance, obtaining the valid permutation (1-34-89-15-13-19) instead of (99-34-89-
15-13-19). In the case of underspecification, the unspecified genes are filled in using a competitive
template, which is a fully specified chromosome from which any missing genes are directly inherited. At
the start of the OmeGA, the genes of the competitive template are randomly chosen in consideration of
feasibility issues. For example, using the competitive template shown in Figure 4, the underspecified
chromosome B is completed by inheriting genes 4 through 6 from the competitive template.
Figure 3: Messy Chromosomes May Have Variable Length
Figure 4: Use of a Competitive Template on Underspecified Chromosomes
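The first-come-first-served scan and the competitive template can be combined into one small expression step. The sketch below is our own illustration; since Figures 3 and 4 are not reproduced in this transcript, the gene values in the examples are assumed:

```python
def express(genes, template, length):
    """Gene expression for messy chromosomes: a left-to-right scan keeps the
    first allele seen for each locus (first-come-first-served), and any locus
    still unspecified inherits its allele from the competitive template."""
    expressed = {}
    for locus, allele in genes:
        expressed.setdefault(locus, allele)   # later duplicates are dropped
    for locus in range(1, length + 1):
        expressed.setdefault(locus, template[locus - 1])
    return [expressed[locus] for locus in range(1, length + 1)]

# Overspecified: locus 1 appears twice; the first occurrence (allele 1) wins,
# yielding (1-34-89-15-13-19) rather than (99-34-89-15-13-19).
over = [(1, 1), (2, 34), (1, 99), (3, 89), (4, 15), (5, 13), (6, 19)]
# Underspecified: loci 4 through 6 are inherited from the template.
under = [(1, 1), (2, 34), (3, 89)]
template = [7, 7, 7, 15, 13, 19]
```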
SGAs for combinatorial problems typically utilize an integer encoding for the chromosomes,
representing alleles as ordinary integer values. Various GA operators have therefore been developed to
prevent duplicate alleles in the population when using integer encoding [22,68].2
In contrast, the OmeGA uses a binary coding representation where messy chromosomes are encoded
through random keys [2], as demonstrated in Figure 5. Each gene on the chromosome is assigned a
number ri∈[0,1]. Then, the permutation sequence is determined by sorting the genes in ascending order
of their associated random keys. This encoding has the advantage that any crossover operator can be
used, since random keys always produce duplicate-free solutions for combinatorial problems. Moreover,
information about partial relative ordering is preserved in crossover [41]. Floating point numbers are
typically used as ascending sorting keys to encode a permutation. These numbers are initially determined
randomly and change only under the influence of mutation and crossover. Accordingly, a permutation of
length l consists of a vector r = {r1, r2, ..., rl}.
2 These operators do not ensure predecessor- or resource-feasibility.
Figure 5: Demonstration of Random Keys
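Random-key decoding itself is a single sort. A minimal sketch (hypothetical function name; key values in the example are arbitrary):

```python
def permutation_from_keys(keys):
    """Random-key decoding: activity indices (1-based) sorted in ascending
    order of their keys. Distinct keys always yield a duplicate-free
    permutation, so any crossover on the keys keeps solutions valid."""
    return sorted(range(1, len(keys) + 1), key=lambda i: keys[i - 1])

perm = permutation_from_keys([0.46, 0.91, 0.33, 0.75, 0.51])  # [3, 1, 5, 4, 2]
```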
3.1.2. Fitness Function
The second component of any GA is the fitness function. Every optimization technique must be able
to assign a measure of quality to each structure in the search space to distinguish good and bad results.
For this purpose, GAs use a fitness function to assign each individual chromosome a fitness value.
Generally, an optimization problem can be decomposed into a genotype-phenotype mapping fg and a
phenotype-fitness mapping fp [52]. Assuming a genotypic search space Φg, which can be either discrete
or continuous, the fitness function f assigns each element in Φg a real value: f(xg): Φg → R.
According to this decomposition, the genotype-phenotype mapping occurs first, where the
genotype elements are mapped to elements in the phenotypic search space Φp: fg(xg): Φg → Φp.
Subsequently, the phenotype-fitness mapping is applied: fp(xp): Φp → R. Thus, the fitness
function can be regarded as the composition of both mappings: f = fp ∘ fg, i.e., f(xg) = fp(fg(xg)). Due to the
frequent fitness function evaluations, its efficient implementation is crucial for gaining adequate
computational processing times. One opportunity for speeding up this operation is the use of
parallelization techniques [9], which can be applied to some extent in the OmeGA.
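The decomposition f = fp(fg(xg)) can be expressed directly as function composition. In the sketch below, a toy example of our own, the genotype is a random-key vector, the phenotype is the (0-based) permutation it encodes, and the fitness simply counts adjacent ordered pairs:

```python
def compose_fitness(f_g, f_p):
    """Fitness as the composition f(xg) = fp(fg(xg)) of the genotype-
    phenotype mapping fg and the phenotype-fitness mapping fp."""
    return lambda x_g: f_p(f_g(x_g))

# Toy instantiation: genotype = random keys, phenotype = the permutation
# they encode, fitness = number of adjacent ordered pairs in the phenotype.
f_g = lambda keys: sorted(range(len(keys)), key=lambda i: keys[i])
f_p = lambda perm: sum(a < b for a, b in zip(perm, perm[1:]))
f = compose_fitness(f_g, f_p)
value = f([0.1, 0.2, 0.9, 0.4])   # phenotype [0, 1, 3, 2] -> fitness 2
```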
In their comprehensive survey of the project scheduling literature, Kolisch and Padman [46] provide
an overview of objectives for scheduling problems, including traditional ones such as makespan and cost
minimization but also more recent ones like maximization of project quality. We use total project
lateness as the RCMPSP performance measure to be minimized, so we use the following fitness function
for the OmeGA:
min Σ(z=1..m) (DRz − Dz)   (2)

where DRz corresponds to the duration of project z under consideration of the resources available and Dz
denotes the duration of project z neglecting any resource constraints.
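A direct computation of the total project lateness objective of equation (2) might look as follows (a sketch with hypothetical names; each project's duration is a scalar here):

```python
def total_lateness(resource_constrained, unconstrained):
    """Sum over the m projects of DRz - Dz, where DRz is the resource-
    constrained duration of project z and Dz its unconstrained duration."""
    return sum(d_r - d for d_r, d in zip(resource_constrained, unconstrained))

# Three projects; the first and third are delayed by resource contention.
lateness = total_lateness([12, 9, 20], [10, 9, 15])   # 2 + 0 + 5 == 7
```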
3.2. Mechanics of the OmeGA
CGAs operate in a fundamentally different way from SGAs (shown in Figure 1). Figure 6 shows the
overall flow of the OmeGA, which iterates over epochs, each of which contains two loops, an inner and
an outer. The outer loop is called an era and corresponds to one BB level.3 The inner loop consists of
3 A BB “level kBB” denotes the processing of BBs of maximum size kBB.
three stages: (1) Initialization Phase, (2) BB Filtering / Primordial Phase, and (3) Juxtapositional Phase.
An important property of any good optimization technique is the ability to avoid local optima. For this
purpose, the OmeGA uses level-wise processing. After each era (level), the best solution found so far is
stored and used as the competitive template for the next era. We will now explain the three phases of
each era in more detail.
3.2.1. Initialization Phase
In the original study of the messy GA, the authors claimed that a population size with a single copy
of all substrings of order/length kBB (ensuring the presence of all BBs) is necessary to detect solutions
near the global optimum [24]. But the population size needed to guarantee such a copy is extremely high:
npop = 2^kBB · (l choose kBB) = 2^kBB · l! / (kBB! (l − kBB)!)   (3)
This equation results from the number of ways to choose BBs of order kBB from a string of length l,
multiplied by the number of different BB combinations (assuming a binary alphabet).
This exponential demand of resources would cause a serious problem, known as an initialization
bottleneck, but Goldberg et al. [26] found a way to overcome it. They predicted the theoretical population
size for a probabilistically complete initialization and found only a linear scale-up with BB number and
size. They verified this formula empirically on artificial functions. Unfortunately, the BB structure (size
and number of BBs) of the problem is usually unknown. Hence, in practice, population size is normally
determined by empirical tests. Nevertheless, we can estimate the required population size for a successful
convergence of the OmeGA, since its design is identical to that of the fast messy GA except for its
encoding scheme. During the Initialization Phase, the length of the chromosomes can be chosen
Figure 6: The Flow of the OmeGA
arbitrarily in the interval between kBB and l. Note that in each era a completely new population is
initialized. The size of the population at level (or era) t, assuming a starting level of 1, is popt, and the
overall population size is the sum of the population sizes for each era: Σ(i=1..t) popi.
3.2.2. Building Block Filtering Phase
A common characteristic of every CGA is a way of identifying the BBs. According to the BB
Hypothesis [22], these specific hyperplanes in the search space can be subsequently combined to form
solutions near the global optimum. In the OmeGA the BB Filtering Phase performs this essential task
through repeated selection and random deletion of genes.
At the beginning of the BB Filtering Phase, which is depicted in Figure 7, chromosomes arrive from
the Initialization Phase with a length almost equal to the problem length. First, a Selection Segment
probabilistically filters highly fit chromosomes from less fit ones. The current competitive template is
used to complete the genotype of under- or overspecified chromosomes. Then, in a Length Reduction
Segment, random deletion cuts chromosomes down to a length equal to the current BB level. Assuming
the Selection segment provides sufficiently good genes, the random deletion is not expected to destroy all
the BBs. The mathematical equations calculating the number of selections and deletions to be performed
can be found in [24,41]. As a selection scheme, we use Tournament Selection without replacement and a
tournament size of 4, for the reasons given in [54,64].
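Tournament selection without replacement can be sketched as follows (our own illustration, not the paper's code): the population is shuffled and partitioned into disjoint tournaments of size s, and each tournament contributes its fittest member. One pass yields len(pop)/s winners, so in practice the pass is repeated s times to refill the population:

```python
import random

def tournament_without_replacement(pop, fitness, s=4, rng=random):
    """One pass: shuffle, partition into disjoint tournaments of size s,
    keep each tournament's fittest member."""
    order = list(pop)
    rng.shuffle(order)
    return [max(order[i:i + s], key=fitness)
            for i in range(0, len(order) - s + 1, s)]

winners = tournament_without_replacement(list(range(16)), lambda x: x,
                                         s=4, rng=random.Random(7))
# 16 individuals form 4 tournaments of 4; the best individual always survives.
```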
3.2.3. Juxtapositional Phase
In addition to identification, BBs must be properly recombined. For this purpose, the OmeGA
incorporates a Juxtapositional Phase, which corresponds to the crossover stage in SGAs. Instead of
traditional crossover operators like uniform crossover [73], the OmeGA uses cut and splice operators
[24], as demonstrated in Figure 8. The cut operator divides a chromosome into two parts with cut
probability pcut = pκ (l – 1), where l is the current length of a chromosome and pκ is a specified bitwise cut
probability. Knjazew [41] suggests keeping l ≤ 2n, giving a pκ set to the reciprocal of half of the
problem length, pκ = 2 / n. In contrast to the cut operator, the splice operator connects two chromosomes
with probability ps, where ps is usually chosen rather high. The OmeGA thereby combines the identified
BBs to find the optimum in exactly the same manner suggested by the BB Hypothesis. Note that the
population size for each Juxtapositional Phase is held constant.
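A sketch of the cut and splice operators (hypothetical function names; a chromosome is any sequence of <locus, allele> genes, and we do not model the Juxtapositional Phase's bookkeeping):

```python
import random

def cut(chrom, p_kappa, rng=random):
    """Cut a chromosome of current length l into two parts with probability
    p_cut = p_kappa * (l - 1); the cut point is chosen uniformly."""
    l = len(chrom)
    if l > 1 and rng.random() < p_kappa * (l - 1):
        point = rng.randrange(1, l)
        return chrom[:point], chrom[point:]
    return (chrom,)

def splice(a, b, p_s, rng=random):
    """Splice: concatenate two chromosomes with (usually high) probability p_s."""
    return a + b if rng.random() < p_s else a

parts = cut([(1, 7), (2, 3), (3, 9), (4, 1)], p_kappa=1.0)  # p_cut = 3: always cuts
joined = splice([(1, 7)], [(2, 3)], p_s=1.0)                # always splices
```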
Figure 7: Illustration of the BB Filtering Phase (panels: chromosomes with their fitness values entering from the Initialization Phase; chromosomes after the Selection Segment; chromosomes after the Length Reduction Segment)
4. APPLYING THE OmeGA TO PROJECT SCHEDULING PROBLEMS
We have found that it is important to be careful in applying any GA to a RCMPSP, because slight
nuances in the GA’s attributes can have important implications for solution quality and efficiency.
Figure 8: Examples of Cut and Splice Operators
Therefore, we use this section to provide some details of the OmeGA’s application to the RCMPSP.
4.1. Preserving Predecessor Feasibility
Applying the OmeGA to the RCMPSP requires addressing the issue of predecessor feasibility. The
outcome of the OmeGA is a permutation of the priority list for all activities to be scheduled. Based on
this list, we explain in the next section how a so-called schedule generation scheme (SGS) builds a real
schedule with the start and finish times for each activity. However, a prerequisite for any SGS is a
precedent-feasible schedule list. Since the mechanics4 and the random key encoding in the OmeGA
merely ensure the prevention of duplicate alleles in chromosomes, an additional, efficient repair
mechanism is required to transform any schedule list into a precedent-feasible one.
In the case of predecessor violation, a straightforward repair strategy is to iterate the processing
step which caused the violation (e.g., the Juxtapositional Phase) until all constraints are satisfied.
Considering the great discrepancy between the immense number of possible permutations (n! for n
activities) and the number of feasible solutions, it becomes obvious that such a brute force method would
be too time consuming. Therefore, we handle predecessor constraints in another deterministic and
efficient way, using a repair mechanism prior to schedule construction and fitness function assignment.
Predecessor conflicts cannot occur between activities which can be executed concurrently—i.e., activities
which do not rely on predecessor information at the same point in time Ti. Assuming we start with an
empty schedule list at time T0, we can calculate the set of parallel activities at T0 and subsequently pick
an activity out of this set according to a deterministic strategy. The chosen activity is then appended at
the end of the current priority list and all its relationships within the network of activities are deleted.
Repeating this procedure until all activities have been assigned to a spot in the schedule list, we will
never violate any predecessor constraints.
The pseudo-code shown in Figure 9 describes the repair mechanism in more detail. As input, the
algorithm needs the permutation to be mapped, q, the number of projects, m, the number of activities, n,
the three-dimensional array of all projects DSM[m][n][n] (explained below), and two auxiliary variables,
i and j. The output is a precedent-feasible schedule list, s. The algorithm identifies the first activity of q
without precedent activities and assigns it to spot i in s. Then, all dependencies on the selected activity
4 In particular, the crossover operators that preserve predecessor feasibility in SGA designs, such as Union Crossover [9], cannot be applied in the OmeGA and are replaced by cut and splice operators.
are deleted from the design structure matrix (DSM).5 This simple algorithm scales up in complexity
O(n^2).
As an example, consider two projects, each with four activities, modeled by the two DSMs in Figure
10(a) and a permutation representing a schedule list in Figure 10(b). The DSMs indicate the precedence
relationships between the activities—e.g., activity 1 precedes activity 4 and activity 2 precedes activity 3.
The permutation q = {3-7-1-8-5-4-2-6} does not yield a precedent-feasible schedule since, for instance,
activity 3 is scheduled prior to activity 2. Applying the algorithm in Figure 9 leads to the following
results. Activities 1, 2, 5, and 6 do not depend on any other activities in the set and thus comprise the
initial set of parallel activities. The earliest value in q which also occurs in this set is 1. Thus, the first
value of the feasible schedule list s must be 1: s[0] = 1. After deleting the row and column for activity 1
in DSM 1, the next iteration of the algorithm begins, detecting a new set of parallel activities: {2,4,5,6}.
In this set, activity 5 is the earliest one in q, and consequently s[1] = 5 holds. The row and column for
activity 5 are then deleted and a new loop starts. Repeating all steps of the algorithm until convergence,
we obtain the precedent-feasible schedule list in Figure 10(c), s = {1-5-7-8-4-2-3-6}.
5 A DSM is an efficient and commonly used method of showing the relationships among the activities in a project [5]. Essentially, it can be understood as the precedence matrix representation of an activity-on-node network. Given a set of n activities in a project, the corresponding DSM is an n × n matrix where the activities are the diagonal elements and the off-diagonal elements indicate the precedence relationships.
Input:  Integer i, j, m, n; ScheduleList q[n]; Array DSM[m][n][n]
Output: Feasible schedule list s[n]

i ← 0; j ← 0; s ← new ScheduleList[n];
WHILE i < n
    FOR j ← 0 TO n-1
        IF q[j].numberOfPredecessors = 0 AND q[j].isScheduled = false THEN
            s[i] ← q[j];
            BREAK;
        ENDIF
    ENDFOR
    FOR j ← 0 TO DSM[s[i].projectID].length-1
        DSM[s[i].projectID][j][s[i].columnID] ← 0;
    ENDFOR
    q[s[i]].isScheduled ← true;
    i ← i+1;
ENDWHILE
Figure 9: Pseudo-Code for Mapping any Permutation to a Feasible Schedule List
4.2. Schedule Generation Schemes
To assign a fitness value (according to a predefined objective function) to a permutation (schedule
list) in the OmeGA, a schedule generation scheme (SGS) is necessary for building a real schedule out of
a schedule list. Boctor [4] distinguishes between a “serial” and a “parallel” SGS. In a serial SGS, each
activity’s priority is calculated once, at the beginning of the SGS algorithm, whereas in a parallel SGS an
activity’s priority is re-determined as necessary at each time step.
The serial SGS proceeds as follows. First, the overall problem duration is broken down into N
stages, where N is the total number of activities to be scheduled. This SGS separates the activities into
two disjoint sets: the scheduled set, S (already scheduled activities), and the
decision set, D (unstarted activities that depend only on activities in S). In each stage, one activity is
selected from D and scheduled at its earliest precedence- and resource-feasible start time [40], which
moves it to set S.
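A minimal serial SGS can be sketched in Python as follows. This is an illustrative simplification (a single renewable resource of capacity `cap`; all names are ours, not from [40]):

```python
from collections import defaultdict

def serial_sgs(acts, preds, schedule_list, cap):
    """Serial SGS sketch for one renewable resource of capacity cap.

    acts: {id: (duration, resource_use)}; preds: {id: set of predecessors};
    schedule_list: activity ids, highest priority first.
    Returns {id: finish time}.
    """
    finish, scheduled = {}, set()
    usage = defaultdict(int)               # time unit -> resource units in use
    while len(scheduled) < len(acts):
        # decision set D: unscheduled activities with all predecessors scheduled;
        # pick its highest-priority member
        a = next(x for x in schedule_list
                 if x not in scheduled and preds[x] <= scheduled)
        d, r = acts[a]
        t = max((finish[p] for p in preds[a]), default=0)
        # push the start right until capacity suffices over the whole duration
        while any(usage[u] + r > cap for u in range(t, t + d)):
            t += 1
        for u in range(t, t + d):
            usage[u] += r
        finish[a] = t + d
        scheduled.add(a)
    return finish

acts = {"A": (2, 2), "B": (2, 2), "C": (1, 2)}
preds = {"A": set(), "B": set(), "C": set()}
print(serial_sgs(acts, preds, ["A", "B", "C"], cap=3))
# finishes: A at 2, B at 4, C at 5 (makespan 5)
```

Each activity is placed at its earliest precedence- and resource-feasible start, one activity per stage, exactly as described above.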
The parallel SGS proceeds as follows. First, the overall problem duration is broken down into time
steps. At each time step, the algorithm separates the activities into four disjoint
sets: the complete set, C (finished activities), the active set, A (ongoing, “already scheduled” activities),
the decision set, D (unstarted activities that depend only on activities in C), and the ineligible set, I
(activities which depend on activities in A or D). Since preemption is not allowed, the SGS automatically
assigns resources to activities in A. If the remaining resources are sufficient to perform the activities in D,
then the algorithm adds these to A. If not, then it uses a priority rule to rank the activities in D. The
highest-ranking activities are added to A as resources allow. The time step ends when the shortest activity
(or activities) in A finishes. Finished activities are moved to C, and activities in I are checked for
potential transfer to D. The schedule is complete (i.e., the project duration is known) when all activities
are in C.
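A corresponding parallel SGS sketch, under the same single-resource simplification and with a shortest-operation-first rule standing in for the priority rule, might look like this (again an illustration, not the authors' implementation):

```python
def parallel_sgs(acts, preds, rank, cap):
    """Parallel SGS sketch for one renewable resource of capacity cap.

    acts: {id: (duration, resource_use)}; preds: {id: set of predecessors};
    rank: priority-rule key function (lower value = higher priority).
    """
    t, finish = 0, {}
    done, active = set(), {}               # active: id -> finish time
    while len(done) < len(acts):
        free = cap - sum(acts[a][1] for a in active)
        # decision set D: unstarted activities whose predecessors are complete
        D = sorted((a for a in acts
                    if a not in done and a not in active and preds[a] <= done),
                   key=rank)
        for a in D:                        # start by priority while capacity lasts
            d, r = acts[a]
            if r <= free:
                active[a] = t + d
                free -= r
        t = min(active.values())           # time step ends at the next completion
        for a in [x for x, f in active.items() if f == t]:
            done.add(a)
            finish[a] = active.pop(a)
    return finish

acts = {"A": (2, 2), "B": (2, 2), "C": (1, 2)}
preds = {"A": set(), "B": set(), "C": set()}
sof = lambda a: acts[a][0]                 # shortest-operation-first stand-in rule
print(parallel_sgs(acts, preds, sof, cap=3))  # C finishes at 1, A at 3, B at 5
```

Note that, unlike the serial scheme, the decision set is re-ranked at every time step, so the same schedule list can yield a different schedule.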
Figure 10: Two Projects (a) Modeled by DSMs with (b) an Unmapped Schedule Priority List and (c) a Precedent-Feasible Schedule List after Executing the Proposed Repair Mechanism
Unfortunately, it is impossible to predict in advance which SGS will perform best for an arbitrary
RCMPSP. As the serial and parallel SGSs exhibit two different behaviors, thus potentially resulting in
two unequal schedules for an identical schedule priority list [32], determining which scheme is best
becomes an optimization problem in itself. To address this dilemma, Hartmann [32] proposed a GA-
based heuristic, the self-adapting GA, to help determine if the serial or the parallel SGS is better suited
for the underlying problem. Instead of selecting a SGS in advance, the self-adapting GA allows
chromosomes to be evaluated via the parallel or the serial SGS. While the self-adapting GA proceeds,
more fit chromosomes will prevail, not only with respect to their schedule list but also in terms of their
SGS. For this purpose, the self-adapting GA maintains an additional gene for each chromosome that
uniquely specifies its SGS. The modifications necessary to permit such flexibility in the choice of the
SGS are described in [32] and were easily incorporated into the OmeGA.
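The mechanism of the extra SGS gene can be shown in a few lines of Python. This is a deliberately artificial sketch: the two "decoders" are stand-ins in which the parallel SGS is strictly better by construction, purely to show how ordinary selection lets the better-suited gene take over; they are not real schedule builders.

```python
import random

random.seed(1)

def fitness(chrom):                    # lower is better (makespan-like)
    perm, sgs_gene = chrom
    base = sum(perm)                   # constant (28) for any permutation of range(8)
    penalty = 2 if sgs_gene == "serial" else 0   # artificial decoder difference
    return base + penalty

# each chromosome = (schedule list, SGS gene)
pop = [(random.sample(range(8), 8), random.choice(["serial", "parallel"]))
       for _ in range(20)]

for _ in range(10):                    # binary tournament selection only
    pop = [min(random.sample(pop, 2), key=fitness) for _ in range(20)]

genes = [g for _, g in pop]
print(genes.count("parallel"))         # the better-suited SGS gene dominates
```

After a few generations of tournaments, nearly the whole population carries the gene of the decoder that produces fitter schedules, which is the self-adapting effect described above.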
4.3. Hybridization with the 2-opt Heuristic
In general, a GA tends to explore the search space via its crossover operator rather than extensively
exploiting specific regions through mutation. This is mainly due to the high crossover probability and the
low mutation probability, both of which are necessary for a successful GA.6 However, an exploitation of
interesting regions within the search space can often be accomplished more efficiently by a dedicated local search. Moreover, sometimes local and global optima can be identified only by a fine-grained local searcher, since
single peaks would be too difficult to detect for a coarsely-grained crossover operator. Both the SGA and
the OmeGA can be extended with an efficient local search strategy to combine the positive traits of both
approaches. In SGAs, local search is typically applied after crossover and mutation. (Another interesting
opportunity is to incorporate the local search into the fitness function itself.)
Aside from its appropriate placement in the GA flow, the choice of a suitable local search strategy
for the underlying problem is crucial. A potential approach would comprise the application of a well-
known priority rule heuristic (see section 5) for the RCMPSP to one chromosome in the initial GA
population and to the competitive template after each era. In this case, the initial GA population would
quickly increase its average fitness and the final outcome of the GA would never be worse than the
outcome of the priority rule itself. Although this local search approach sounds promising, it would in fact
be a poor choice. The initial chromosome produced by a priority rule heuristic would too quickly dominate the entire GA population, resulting in a premature convergence of the GA.7 Typically, this phenomenon arises if the chosen selection scheme provides too much “selection pressure” or intensity. In other words, by placing a highly-fit chromosome into the initial population, the GA is precluded from properly seeking interesting search regions that might contain the global optimum.

6 For more information on a reasonable adjustment of crossover and mutation probability, the reader may refer to [22,58,61,74].
Instead of a local searcher that strives to push chromosomes to the optimum quickly, thereby risking getting stuck in a local optimum, we favor a strategy which increases the average fitness of the
GA population at a more moderate pace, thereby exploring more of the search space. Therefore, we
decided to combine the OmeGA with a tailored 2-opt heuristic for the RCMPSP. In the RCMPSP, the 2-
opt neighborhood is defined as the set of all feasible solutions that can be reached by a swap of two
elements in the priority list. Since the OmeGA initializes a new population in each era and does not
embody separate crossover and mutation phases, but a Juxtapositional Phase instead, we invoke our local
search approach at the end of each era. Furthermore, to strike a good balance between computational time
and effectiveness, we apply the local search only to the competitive template. Although we execute the 2-
opt heuristic just once per era, we do it for the most promising chromosome, which serves as a template
for upcoming chromosomes in the next era.
Tailoring the fundamental concept of the 2-opt heuristic to the RCMPSP can be accomplished as
follows. Assuming a precedent-feasible schedule list, s, the DSM, and several auxiliary variables, as
noted in Figure 11, s must be decomposed into its independent sets of parallel activities (a vector of
vectors) followed by a 2-opt search in each of these disjoint sets (see Figure 12). For example, consider
the DSMs and s in Figure 10. Due to the predecessor constraints, it is obvious that a traditional swap of
two alleles is not generally possible, as it would often lead to a predecessor violation—e.g., exchanging
activities 5 and 8 must not be allowed. However, if we determine the three disjoint sets of parallel
activities in s—namely {1,2,5,6}, {3,4,7} and {8}—we can modify the order of activities within these
sets as they appear in s without violating any predecessor constraints. In particular, we can apply a swap
between two activities in every parallel set, leading to a new schedule list which is subsequently
evaluated by a SGS. Figure 12 demonstrates the application of this algorithm, which produces nine
feasible schedule lists that differ in exactly two positions from s. Generally, the total number of exchanges for a schedule list with z parallel sets of activities, where the i-th set contains z_i activities, is:

$\sum_{i=1}^{z} \frac{z_i^2 - z_i}{2}$    (4)

For the example above, with set sizes 4, 3, and 1, this gives 6 + 3 + 0 = 9 exchanges.

7 At this time only mutation can produce slightly new chromosomes. However, the mutation probability is typically set very low. Hence, the chance to generate new genotypes exists but is unlikely.
Figure 11: Pseudo-code for 2-opt Heuristic in the RCMPSP8
8 The algorithm in Figure 11 describes the 2-opt heuristic only for one chromosome.
Input: Integer h, i, j, m, n, projectID, swapID_1, swapID_2; ScheduleList s[n], t[n]; Vector<Vector> parallel; Array DSM[m][n][n]
Output: Performs 2-opt heuristic for the RCMPSP

i ← 0
WHILE i < n
    Vector<Integer> band ← new Vector()
    FOR j ← 0 TO n-1
        IF s[j].numberOfPredecessors = 0 AND s[j].isScheduled = false THEN
            s[j].isScheduled ← true
            band.add(s[j])
            i ← i+1
        ENDIF
    ENDFOR
    FOR j ← 0 TO band.size()-1
        projectID ← band.elementAt(j).projectID
        FOR h ← 0 TO DSM[projectID].length-1
            DSM[projectID][h][band.elementAt(j).columnID] ← 0
        ENDFOR
    ENDFOR
    parallel.add(band)
ENDWHILE
FOR i ← 0 TO parallel.size()-1
    FOR j ← 0 TO parallel.elementAt(i).size()-2
        swapID_1 ← parallel.elementAt(i).elementAt(j)
        FOR h ← j+1 TO parallel.elementAt(i).size()-1
            swapID_2 ← parallel.elementAt(i).elementAt(h)
            t ← s
            t[swapID_1.spotInList] ← swapID_2
            t[swapID_2.spotInList] ← swapID_1
            CALL SGS with t
        ENDFOR
    ENDFOR
ENDFOR
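The two phases of Figure 11 (band decomposition, then pairwise swaps within each band) can be sketched compactly in Python; this is an illustrative re-implementation using predecessor sets in place of the DSM, and each generated neighbor would then be evaluated by an SGS:

```python
from itertools import combinations

def parallel_bands(acts, preds):
    """Decompose activities into successive sets of 'parallel' activities,
    as in the first phase of the algorithm in Figure 11."""
    remaining, bands = set(acts), []
    while remaining:
        band = {a for a in remaining if not (preds[a] & remaining)}
        bands.append(band)
        remaining -= band
    return bands

def two_opt_neighbors(s, preds):
    """All schedule lists reachable from s by swapping two activities
    that belong to the same parallel set (the 2-opt neighborhood)."""
    pos = {a: i for i, a in enumerate(s)}
    neighbors = []
    for band in parallel_bands(s, preds):
        for a, b in combinations(sorted(band), 2):
            t = list(s)
            t[pos[a]], t[pos[b]] = t[pos[b]], t[pos[a]]
            neighbors.append(t)
    return neighbors

# Example of Figure 10 (project-2 precedences 5->7, 7->8 inferred from the text)
preds = {1: set(), 2: set(), 3: {2}, 4: {1},
         5: set(), 6: set(), 7: {5}, 8: {7}}
s = [1, 5, 7, 8, 4, 2, 3, 6]
print(parallel_bands(s, preds))        # [{1, 2, 5, 6}, {3, 4, 7}, {8}]
print(len(two_opt_neighbors(s, preds)))  # 9, matching equation (4)
```

The nine neighbors produced for this example are exactly the lists shown in Figure 12.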
Precedent-feasible schedule list s:
1 5 7 8 4 2 3 6

Vector parallel of parallel activity sets:
{1, 2, 5, 6} {3, 4, 7} {8}

Precedent-feasible schedule lists t after local search:
2 5 7 8 4 1 3 6
5 1 7 8 4 2 3 6
6 5 7 8 4 2 3 1
1 2 7 8 4 5 3 6
1 5 7 8 4 6 3 2
1 6 7 8 4 2 3 5
1 5 7 8 3 2 4 6
1 5 3 8 4 2 7 6
1 5 4 8 7 2 3 6
5. COMPUTATIONAL RESULTS
5.1. OmeGA (CGA) vs. SGA
To illustrate the superior performance of CGAs compared to SGAs, we tested both designs, without local search extensions, on artificial functions that allow the level of difficulty to be scaled. While a search space can exhibit many problem characteristics, such as epistasis [14], noise [59], or symmetry [36], the GA community mainly examines GA performance using so-called deceptive functions [21]. Deceptive functions attempt to mislead the GA toward local optima by assigning low fitness values to solutions near the global optimum and high fitness values to solutions far from the optimum. The global optimum is thereby isolated and surrounded by poor solutions, as in the “needle in a haystack” problem.
Kargupta et al. [38] introduced two deceptive functions for combinatorial problems which spawn an extremely difficult search space for a GA. Therein, k deceptive sub-functions (which can be regarded as BBs) of order m are combined into one deceptive function of length l in such a way that only one global optimum exists among l!/(k!)^m local optima. The global optimum can be detected only if the GA
subsequently identifies all k sub-functions—i.e., all sub-problems of the overall problem. Problem
difficulty can be scaled not only by the number and length of deceptive sub-functions but also by their
corresponding encoding. In short, two encoding schemes exist, tight and loose, with the loose encoding causing more difficulties for a GA.9

Figure 12: Demonstration of Local Search in the RCMPSP
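For intuition, the classical binary trap function illustrates deception in its simplest form; note that this is the well-known bitstring analogue, not the ordering functions of Kargupta et al. themselves:

```python
def trap(block):
    """Order-k deceptive sub-function: optimal at all ones, but the
    fitness gradient everywhere else points toward all zeros."""
    u, k = sum(block), len(block)
    return k if u == k else k - 1 - u

def deceptive(bits, k=4):
    """Concatenation of len(bits)/k independent trap sub-functions (BBs)."""
    return sum(trap(bits[i:i + k]) for i in range(0, len(bits), k))

print(deceptive([1] * 8))                 # global optimum: 4 + 4 = 8
print(deceptive([0] * 8))                 # deceptive attractor: 3 + 3 = 6
print(deceptive([0, 1, 1, 1] + [1] * 4))  # near-optimal string scores only 0 + 4 = 4
```

A hill-climber following the gradient converges to all zeros, while only a GA that assembles whole order-k BBs reaches the isolated optimum at all ones.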
For tests on deceptive functions, we used the parameters listed in Table 1. The reader may refer to
[58] for detailed information on how to set these parameters. Due to prior knowledge of the BB structure
of the problem (kBB equals the order of the deceptive function and l equals the length of the deceptive
function) we roughly calculated population size using the equation proposed by Harik [30] and
convergence time according to Goldberg [23]. Figures 13 and 14 display average fitness results for 10
independent test runs. In our initial tests, we demonstrate the advantage of the OmeGA (a CGA) on a
deceptive function of order 4 and length 32 with a single global optimum (fitness value of 32): it
performs equally well for easy (tight encoding) and hard (loose encoding) problems. While the SGA requires fewer function calls than the OmeGA to find the global optimum in the case of tightly-encoded deceptive functions, it clearly struggles as the search space becomes more difficult, and it is unable to detect
the global optimum even after many fitness evaluations. Further tests on deceptive functions of varying
length, depicted in Figure 14, exhibit the inability of SGAs to deal with difficult (loose encoding)
problems. The computational resources necessary to reliably identify the global optimum scale up
exponentially with problem difficulty (as predicted by equation 1), which is particularly problematic if
problem size exceeds a value of about 16. Note that the population size depicted on the y-axis of Figure
14 had to be sufficient to detect the global optimum in 9 out of the 10 runs.
SGA:
  pc: 1.0
  pm: 1/4n
  Selection scheme: TWOR with/without continuously updated sharing and tournament size 4 [58,70]
  Crossover operator: Position-based Crossover, Version 2 [62]
  Mutation operator: Shift mutation [62]
  Number of generations per epoch: 60

OmeGA (CGA):
  pcut: 2/npop
  ps: 1.0
  pm: 1/4npop
  Selection scheme: TWOR with tournament size 4 [70]
  Number of eras per epoch: 4
  Ratio of population size in eras 1-4: 1:1:2:6

Table 1: Test Parameters
9 The encoding scheme influences the so-called defining length [22] of the BBs or sub-functions—the greater the defining length, the greater problem difficulty.
(a) SGA (b) OmeGA
5.2. Performance of the Hybrid OmeGA on Generated Test Problems
Due to its problem-independent performance (as demonstrated in the previous section), the CGA design is much better suited to real-world problems than the SGA design, since problem difficulty is unknown in advance of optimization. To achieve the best optimization results, we extended the OmeGA with the local search strategy explained in section 4.3 and tested it on 77 test problems. These
problems were constructed based on two popular multi-project measures: the Average Resource Load
Factor (ARLF) and the Average Utilization Factor (AUF) [48,80]. The ARLF identifies whether the bulk
of a project’s total resource requirements are in the front or back half of its critical path duration10 and the
relative size of the disparity. For project h, it is defined as:
10 Based on scheduling each activity at its early start time.
Figure 13: Test Results for a Tight- and Loose-Encoded Deceptive Function of Length 32
Figure 14: Comparative Performance of the OmeGA and the SGA on Various Deceptive Functions
$ARLF_h = \frac{1}{CP_h}\sum_{t=1}^{CP_h}\sum_{i=1}^{n_h}\sum_{k=1}^{K_{ih}} Z_{iht}\,X_{iht}\left(\frac{r_{ihk}}{n_h K_{ih}}\right)$    (5)

where

$Z_{iht} = \begin{cases} -1 & t \le CP_h/2 \\ \;\;\,1 & t > CP_h/2 \end{cases}$    (6)

$X_{iht} = \begin{cases} 1 & \text{if activity } i \text{ of project } h \text{ is active at time } t \\ 0 & \text{otherwise} \end{cases}$    (7)
Z_iht X_iht ∈ {-1, 0, 1}, n_h is the number of activities in project h, K_ih is the number of types of resources required by activity i in project h, and r_ihk is the amount of resource type k required by activity i in project h. Projects with negative ARLF are “front loaded” in their resource requirements, while projects with positive ARLF are “back loaded.” The ARLF for a problem is simply the average of the ARLFs of its constituent projects.
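A toy computation of equation (5) for a single project may clarify the measure; the numbers below are invented purely for illustration:

```python
# One project: CP = 4 periods, n = 2 activities, each using K = 1 resource type.
# Activity 1 (r = 6) is active in periods 1-2 (first half, Z = -1);
# activity 2 (r = 2) is active in periods 3-4 (second half, Z = +1).
CP, n, K = 4, 2, 1
active = {1: [1, 2], 2: [3, 4]}        # activity -> periods in which X = 1
r = {1: 6, 2: 2}                       # per-period resource usage

arlf = 0.0
for i, periods in active.items():
    for t in periods:
        z = -1 if t <= CP / 2 else 1
        arlf += z * r[i] / (n * K)
arlf /= CP
print(arlf)   # -1.0: negative, i.e., the project is front-loaded
```

Heavier usage in the first half of the critical path drives the sum negative, matching the “front loaded” interpretation above.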
The AUF indicates the average tightness of the constraints on (i.e., the average amount of contention
for) each resource type:
$AUF_k = \frac{1}{S}\sum_{s=1}^{S}\frac{W_{sk}}{R_k S_s}$    (8)

where $R_k$ is the (renewable) amount of resource type k available at each interval, and S is the number of
time intervals in the problem. Using S = 3 intervals, for example, once the projects have been sorted from
shortest to longest, such that CP1 ≤ CP2 ≤ CP3, then S1 = CP1, S2 = CP2 – CP1, and S3 = CP3 – CP2. The
total amount of resource k required over any interval s is given by:
$W_{sk} = \sum_{t=a}^{b}\sum_{h=1}^{m}\sum_{i=1}^{n_h} r_{ihk}\,X_{iht}$    (9)

where $a = CP_{s-1} + 1$, $b = CP_s$, and $r_{ihk}$ and $X_{iht}$ are defined as above. Since the AUF is essentially a ratio of
resources required to resources available, averaged across intervals of problem time, AUFk > 1 indicates
that resource type k is, on average, constrained over the course of a problem. To get the AUF for a
problem involving K types of resources:
AUF = Max(AUF1, AUF2, …, AUFK) (10)
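A small worked example of equations (8)-(10) for one resource type, with invented numbers, runs as follows:

```python
# Toy AUF computation for a single resource type k (numbers are illustrative).
# S = 2 intervals of lengths S_s = [2, 2]; R_k = 4 units available per period;
# W[s] = total units of resource k required over interval s (equation 9).
S_len = [2, 2]
R_k = 4
W = [12, 4]

S = len(S_len)
auf_k = sum(W[s] / (R_k * S_len[s]) for s in range(S)) / S
print(auf_k)   # 1.0: resource k is, on average, exactly fully contended
```

With several resource types, the problem-level AUF is then the maximum of the per-type values, as in equation (10).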
For the test problems, we generated (using the generator described in [7]) seven random problems,
each composed of three projects (i.e., 21 total networks), and with 20 activities per project. Each of the
seven problems has a different ARLF setting, varied in integer increments over -3 ≤ ARLF ≤ 3. For each
problem, we adjusted the number of resources available at 11 levels, thereby varying the AUF in 0.1
increments over 0.6 ≤ AUF ≤ 1.6.11 This approach, originally taken in [48], yielded 77 test problems.
First, we compared the serial and parallel SGS results, averaged over all 77 problems. To enable
drawing reliable conclusions, we invoked the OmeGA 50 independent times on each problem for each
SGS. The most important CGA parameters, population size and convergence time, were set to 2000
individuals and four epochs, respectively. The remaining parameters were defined as in Table 1. An
unknown property of each problem is its underlying BB structure—the size, scaling, and number of BBs.
Hence, for the tests we assumed a “conservative” problem difficulty with BB size 4, even though tests
with different values for the size of BBs might lead to better results.12
Table 2 depicts the best average fitness value out of the three different SGSs (serial, parallel, and
self-adapting) for each of the 77 problems. In some cases the OmeGA produced identical average fitness
values for more than one SGS. Interestingly, the self-adapting SGS “won” in only 21 out of 77 cases
(27.3%). The serial SGS won in 11 out of 77 problems (14.3%). Meanwhile, the parallel SGS won in 39
out of 77 cases (51%). When we sum the average and best fitness values and the standard deviations (out
of all 50 independent runs) for each SGS on all 77 problems, as shown in Table 3, we conclude that the
hybrid OmeGA performs best in combination with the parallel SGS. While this conclusion contradicts
the insights in [32], this is not necessarily surprising given the different GA design and problem
instances.
AUF \ ARLF     -3       -2       -1        0        1        2        3
0.6            1.76     4.32     12.78     2.00     3.00     3.00     0.00
0.7            4.96     16.40    24.00     5.00     11.16    9.60     2.30
0.8            13.72    29.28    38.92     11.00    25.82    21.32    7.14
0.9            19.62    46.92    59.82     23.94    36.40    34.88    16.90
1.0            30.00    57.24    69.54     34.84    60.96    52.94    22.06
1.1            39.54    74.38    90.50     51.66    71.90    67.32    34.34
1.2            50.98    88.12    112.06    68.08    105.84   75.00    43.16
1.3            59.96    109.50   134.82    79.40    121.64   93.52    50.26
1.4            70.28    125.02   148.50    105.68   142.04   111.46   65.22
1.5            82.36    140.14   171.00    119.72   172.44   131.84   78.68
1.6            94.10    152.66   194.28    142.78   233.10   148.60   91.48

(Cell shading in the original marks the winning strategy for each problem: Parallel SGS, Serial SGS, Self-Adapting, All Strategies, or Parallel/Self-Adapting; the shading is not reproduced here.)

Table 2: Comparison of Different SGS Strategies for the Hybrid OmeGA (average fitness of 50 independent runs)
11 We found that we could not adapt standard single-problem generators and test sets such as ProGen/PSPLIB [42] to create multi-project problems to our specifications.
12 Currently, there is no way to determine a “good” BB size except by testing all sizes. However, one point is clear: the larger the selected BB size, the more difficult it is to identify these BBs. Thus, we think a BB size of 4 is a conservative and good choice, although it might not be the best for some problems.
                 Sum of Best Results   Sum of Average Results   Sum of Standard Deviations
Serial SGS       5105                  5281.86                  92.29
Parallel SGS     5020                  5166.44                  77.82
Self-adapting    5021                  5175.00                  83.07

Table 3: Aggregate Performance of Each SGS in Terms of Total Project Lateness (TPL)
6. COMPARISON TO POPULAR PRIORITY-BASED HEURISTICS
We also compared the performance of the OmeGA to 20 popular priority rule heuristics found in the
project scheduling literature, as summarized in Table 4. Some of these rules were developed specifically for a multi-project environment, while others have been reported to be successful in a single-project environment. To increase comparability, we standardized the tie-breaker for all rules to FCFS.
Priority Rule (* = multi-project) Formula Comments
1. FCFS—First Come First Served Min(ESil), where ESil is the early start time of the ith activity from the lth project
Best in study by Bock and Patterson [3]
2. SOF—Shortest Operation First Min(dil), where dil is the duration of the ith activity from the lth project
Best in study by Patterson [64]
3. MOF—Maximum (longest) Operation First
Max(dil)
4. MINSLK*—Minimum Slack Min(SLKil), where SLKil = LSil – Max(ESil, t), LSil is the late start time of the ith activity from the lth project, and t is the current time step13
Best in studies by Davis and Patterson [12], Boctor [4], and
Bock and Patterson [3]
5. MAXSLK*—Maximum Slack Max(SLKil)
6. SASP*—Shortest Activity from Shortest Project
Min(fil), where fil = CPl + dil and CPl is the critical path duration of the lth project without resource constraints
Best in studies by Kurtulus and Davis [48] and Maroto et al. [57]
7. LALP*—Longest Activity from Longest Project
Max(fil)
8. MINTWK*—Minimum Total Work content: $Min\left(\sum_{k=1}^{K} d_{il}\,r_{ilk} + \sum_{i \in AS_l}\sum_{k=1}^{K} d_{il}\,r_{ilk}\right)$, where AS_l is the set of activities already scheduled (i.e., in work) in project l
9. MAXTWK*—Maximum Total Work content: $Max\left(\sum_{k=1}^{K} d_{il}\,r_{ilk} + \sum_{i \in AS_l}\sum_{k=1}^{K} d_{il}\,r_{ilk}\right)$, with AS_l as in MINTWK. Best in studies by Maroto et al. [58] and Lova and Tormos [54]
10. RAN—Random Activities selected randomly Best in study by Akpan [1]
11. EDDF—Earliest Due Date First Min(LSil)
12. LCFS—Last Come First Served Max(ESil)
13. MAXSP—Maximum Schedule Pressure: $Max\left(\frac{t - LF_{il}}{d_{il}\,W_{il}}\right)$, where W_il is the percentage of the activity remaining to be done at time t. Also known as the “critical ratio”
14. MINLFT—Minimum Late Finish time
Min(LFil) Equivalent to MINSLK in serial scheduling case (Kolisch [43])
15. MINWCS*—Minimum Worst Case Slack
Min(LSi – Max[E(i,j) | (i,j) ∈ APt]), where E(i,j) is the earliest time to schedule activity j if activity i is started at time t, and APt is the set of all feasible pairs of eligible, un-started activities at time t
Best in study by (Kolisch [43]); without resource constraints,
reduces to MINSLK
13 t is relevant only when using the parallel SGS, where an activity’s slack will diminish the longer it is delayed.
16. WACRU*—Weighted Activity Criticality & Resource Utilization: $Max\left(\alpha\sum_{q=1}^{N_i} w\,(1 + SLK_{iq})^{-1} + (1-\alpha)\sum_{k=1}^{K}\frac{r_{ik}}{R_k^{Max}}\right)$, where N_i is the number of immediate successors of the ith activity, w is the weight associated with N_i (0 ≤ w ≤ 1), SLK_iq is the slack in the qth immediate successor of the ith activity, and α is a weight parameter. Best in study by Thomas and Salhi [76]; we use w = 0.5 and α = 0.5
17. TWK-LST*—MAXTWK & earliest Late Start time (2-phase rule)
Prioritize first by MAXTWK (without FCFS tie-breaker) and then by Min(LSil)
(Lova and Tormos [54]); min. late start time (MINLST) was best in the study by Davis and Patterson [12]
18. TWK-EST*—MAXTWK & earliest Early Start time (2-phase rule)
Prioritize first by MAXTWK (without FCFS tie-breaker) and then by Min(ESil)
(Lova and Tormos [54])
19. MS—Maximum Total Successors Max(TSil), where TSil is the total number of successors of the ith activity in the lth project
Best in study by (Kolisch [43])
20. MCS—Maximum Critical Successors
Max(CSil), where CSil is the number of critical successors of the ith activity in the lth project; CSil ∈ TSil
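Priority rules of this kind reduce to key functions over the decision set. A sketch of three of the simpler rules from Table 4 (FCFS, SOF, MINSLK) with the standardized FCFS tie-breaker, using an illustrative activity record of our own design, might look like:

```python
# act: dict with early start "es", late start "ls", and duration "d"
def fcfs(act, t=0):
    return act["es"]                       # rule 1: first come, first served

def sof(act, t=0):
    return act["d"]                        # rule 2: shortest operation first

def minslk(act, t=0):
    return act["ls"] - max(act["es"], t)   # rule 4: slack shrinks as t advances

def rank(decision_set, rule, t=0):
    # sort by the rule's key, breaking ties by FCFS (early start time)
    return sorted(decision_set, key=lambda a: (rule(a, t), fcfs(a)))

D = [{"id": "a", "es": 0, "ls": 3, "d": 5},
     {"id": "b", "es": 1, "ls": 2, "d": 5},
     {"id": "c", "es": 0, "ls": 6, "d": 2}]
print([a["id"] for a in rank(D, sof)])     # ['c', 'a', 'b']: FCFS breaks the a/b tie
print([a["id"] for a in rank(D, minslk)])  # ['b', 'a', 'c']: smallest slack first
```

Each rule simply supplies a different sort key; the surrounding SGS then draws activities from the ranked decision set as resources allow.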
To compare results of the non-deterministic CGA with deterministic priority rules, we used the
average fitness values of the CGA instead of absolute best values out of 50 runs.14 We compared these
average values with the best result from any priority rule. Nevertheless, the OmeGA outperformed the
priority-rule based heuristics in 60 of the 77 problems (77.9%) in terms of solution quality (Table 5 and
Table 6). Interestingly, we note that Tables 5 and 6 show the priority-rules outperforming the OmeGA at
low AUF values. When AUF = 0.6, the resources are relatively unconstrained and the consequential
delays are small. In these cases, many of the priority rules gave good solutions, while the OmeGA did
not. We therefore conclude that using the OmeGA in the context of nominal resource constraints is
probably not worth the effort. It is also interesting that the priority rules performed well when AUFs were
very high—i.e., when resources were very highly constrained.
With respect to computational time, the priority rules have a clear advantage: the OmeGA required
an average of 211 seconds15, while each priority rule required less than one second. However,
optimization of an RCMPSP does not necessarily constitute a real-time application, and the time required by the OmeGA should be acceptable for many purposes. Nevertheless, priority rules remain the most practical option for very large problems.
14 Since GAs constitute a non-deterministic optimization technique, it is fairer to compare average values than absolute best fitness values.
15 Tests were performed on a PC with a 3.0 GHz Intel Pentium IV processor, 1024 MB RAM, and the Windows XP operating system.
Table 4: Overview of Popular Priority Rules Used for the RCMPSP (Adapted from [6])
AUF \ ARLF     -3       -2       -1        0        1        2        3
0.6            1.86     3.00     11.00     1.00     3.00     3.00     0.00
0.7            4.96     16.40    23.00     4.00     11.16    9.88     2.32
0.8            13.72    29.28    38.92     11.00    26.00    21.72    7.14
0.9            19.62    46.92    59.94     23.94    36.40    34.88    16.92
1.0            30.12    54.00    69.82     34.84    61.12    53.34    22.06
1.1            39.54    74.38    90.66     51.66    72.46    67.32    34.58
1.2            51.00    88.12    112.06    68.08    106.22   75.14    43.22
1.3            59.96    97.00    135.28    79.40    122.06   93.00    50.26
1.4            70.44    111.00   148.50    105.68   142.04   102.00   65.58
1.5            82.38    129.00   168.00    119.72   170.00   131.84   78.68
1.6            94.22    152.66   194.28    142.78   223.00   149.08   91.50

(Cell shading in the original marks the “winner” on each problem as OmeGA, Priority Rule, or Both; the shading is not reproduced here.)
AUF \ ARLF     -3        -2        -1        0         1         2         3
0.6            -7.00%    45.33%    16.18%    100.00%   33.33%    15.33%    0.00%
0.7            -17.33%   -13.68%   4.35%     25.00%    -20.29%   -17.67%   -42.00%
0.8            -8.53%    -11.27%   -15.39%   -21.43%   -18.75%   -19.56%   -40.50%
0.9            -18.25%   -9.77%    -10.54%   -11.33%   -28.63%   -8.21%    -10.95%
1.0            -18.59%   6.00%     -10.49%   -18.98%   -8.78%    -8.03%    -15.15%
1.1            -14.04%   -8.17%    -11.98%   -16.68%   -15.74%   -3.83%    -6.54%
1.2            -16.39%   -9.15%    -9.63%    -5.44%    -4.31%    -11.60%   -9.96%
1.3            -14.34%   12.89%    -4.06%    -6.59%    -4.64%    1.14%     -8.62%
1.4            -14.10%   12.63%    -2.30%    -4.79%    -3.37%    9.61%     -10.16%
1.5            -9.47%    8.88%     2.27%     -4.98%    1.64%     -0.12%    -1.65%
1.6            -12.76%   -0.22%    -0.88%    -8.47%    4.53%     -6.82%    -4.69%

(Cell shading in the original marks the “winner” on each problem as OmeGA, Priority Rule, or Both; the shading is not reproduced here.)
7. CONCLUSION
In this paper, we present the first results of applying a CGA, the OmeGA, to the RCMPSP. Because
slight differences in GA settings can have a large influence on its efficacy and performance, we carefully
explored these settings in relation to the RCMPSP. While traditional SGAs scale up exponentially in
computational resources (i.e., fitness function evaluations and necessary population size) with increasing
problem difficulty, CGAs exhibit sub-exponential scale-up behavior due to their ability to identify BBs
of the solution. Furthermore, we extended the OmeGA with a local search strategy tailored to the
RCMPSP. As a basis for tests, we constructed 77 test problems according to the traditional ARLF and
AUF measures. The test results were twofold. First, we found that the parallel SGS—not the self-
adapting SGS, as stated in [32]—performed best in combination with the OmeGA. Second, we compared
Table 5: Comparative Performance of OmeGA and the Best-Performing Priority Rule on Each Test Problem (Average Total Project Lateness)
Table 6: Percent Difference between the OmeGA and the Best-Performing Priority Rule on Each Test Problem
the OmeGA with many well known priority-rule heuristics, concluding that the OmeGA outperforms the
rules in 78% of problem instances. Moreover, the 22% of instances where the rules performed best were
focused on the very low and very high AUF values, where resources are very slightly or very highly
constrained. Thus, we have been able to provide some reasonable guidance on where the OmeGA would
be best applied. As problem size increases, we expect even better performance from the OmeGA
compared to the rules (which makes it useful for practical applications), even though this performance
comes at additional computational expense.16
For future work, we suggest research on even more effective local search strategies for the
RCMPSP. With CGAs already being able to identify BBs, and thus interesting regions of the search
space, new local search strategies that extensively exploit the BB neighborhood should be beneficial.
Furthermore, it would be very worthwhile to develop strategies for accurately estimating the BB structure
of the underlying problem.
REFERENCES
[1] E.O.P. Akpan, Priority Rules in Project Scheduling: A Case for Random Activity Selection, Production Planning & Control 11(2) (2000) 165-170.
[2] J.C. Bean, Genetic algorithms and random keys for sequencing and optimization, ORSA Journal On Computing 6(2) (1994) 154-160.
[3] D. Bock, J. Patterson, A Comparison of Due Date Setting, Resource Assignment, and Job Preemption Heuristics for the Multi-Project Scheduling Problem, Decision Science 21(3) (1990) 387-402.
[4] F.F. Boctor, Some Efficient Multi-heuristic Procedures for Resource-constrained Project Scheduling, European Journal of Operational Research 49 (1990) 3-13.
[5] T.R. Browning, Applying the Design Structure Matrix to System Decomposition and Integration Problems: A Review and New Directions, IEEE Transactions on Engineering Management, 48(3) (2001) 292-306.
[6] T.R. Browning, A.A. Yassine, Resource-Constrained Multi-Project Scheduling: Priority Rule Performance Revisited, TCU M.J. Neeley School of Business, Working Paper, 2006a.
[7] T.R. Browning, A.A. Yassine, A Random Generator for Resource-Constrained Multi-Project Scheduling Problems, TCU M.J. Neeley School of Business, Working Paper, 2006b.
[8] P. Brucker, A. Drexl, R. H. Möhring, K. Neumann, E. Pesch, Resource-constrained project scheduling: Notation, classification, models, and methods, European Journal of Operational Research 112(1) (1999) 3-41.
[9] E. Cantú-Paz, Efficient and accurate parallel genetic algorithms, Kluwer, Norwell, 2000. [10] R. Cheng, M. Gen, Y. Tsujimura, A tutorial survey of job-shop scheduling problems using genetic
algorithms, part II: hybrid genetic search strategies, Computers and Industrial Engineering 36(3) (1999) 343-364.
[11] E.W. Davis, Project Network Summary Measures and Constrained Resource Scheduling, AIIE
16 The larger the problem instance, the larger (usually) the search space. Certainly this depends on the constraints, too. Since heuristics explore only small subsets of the overall search space, the chance to find the best solution should decrease with increasing search space size.
27
Transactions 7(2) (1975) 132-142. [12] E.W. Davis, J.H. Patterson, A Comparison of Heuristic and Optimum Solutions in Resource-
Constrained Project Scheduling, Management Science 21(8) (1975) 944-955. [13] L. Davis, Job shop scheduling with genetic algorithms, in: Proceedings of the 1st international
conference on genetic algorithms, 1985. [14] Y. Davidor, Epistasis variance: A viewpoint on GA-hardness, in: Foundations of Genetic
Algorithms, 1991. [15] E. Demeulemeester, W. Herroelen, A branch-and-bound procedure for the resource constrained
project scheduling problem, Management Science 38(12) (1992) 1803-1818. [16] E. Demeulemeester, W. Herroelen, An efficient optimal solution procedure for resource
constrained project scheduling problem, European Journal of Operational Research 90(2) (1996) 334-348.
[17] H. Fang, P. Ross, D. Corne, A Promising Hybrid GA/Heuristic Approach for Open-Shop Scheduling Problem, in: 11th European Conference on Artificial Intelligence (1994) 590-594.
[18] M.R. Garey, D.S. Johnson, Computers and intractability: A guide to the theory of NP-completeness, 1979.
[19] M. Gen, R. Cheng, Genetic Algorithms and Engineering Design, Wiley, New York, 1997.
[20] S. Ghomi, B. Ashjari, A simulation model for multi-project resource allocation, International Journal of Project Management 20(2) (2002) 127-130.
[21] D.E. Goldberg, Simple genetic algorithms and the minimal, deceptive problem, in: Genetic Algorithms and Simulated Annealing (1987) 74-88.
[22] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[23] D.E. Goldberg, The Design of Innovation, Kluwer, Norwell, 2002.
[24] D.E. Goldberg, B. Korb, K. Deb, Messy genetic algorithms: Motivation, analysis, and first results, Complex Systems 3(5) (1989) 493-530.
[25] D.E. Goldberg, K. Deb, A comparative analysis of selection schemes used in genetic algorithms, in: Foundations of Genetic Algorithms, 1991.
[26] D.E. Goldberg, K. Deb, H. Kargupta, G. Harik, Rapid, accurate optimization of difficult problems using fast messy genetic algorithms, in: Proceedings of the Fifth International Conference on Genetic Algorithms, 1993.
[27] J.F. Gonçalves, J.J. de Magalhaes Mendes, M.G.C. Resende, A Genetic Algorithm for the Resource Constrained Multi-Project Scheduling Problem, AT&T Labs Technical Report TD-668LM4 (2004).
[28] G. Harik, D.E. Goldberg, Learning linkage, in: Foundations of Genetic Algorithms IV, 1997.
[29] G. Harik, F. Lobo, D.E. Goldberg, The compact genetic algorithm, in: Proceedings of the IEEE International Conference on Evolutionary Computation 3(4) (1998) 523-528.
[30] G. Harik, E. Cantú-Paz, D.E. Goldberg, B.L. Miller, The gambler's ruin problem, genetic algorithms, and the sizing of populations, Evolutionary Computation 7(3) (1999) 231-253.
[31] S. Hartmann, A competitive genetic algorithm for resource-constrained project scheduling, Naval Research Logistics 45 (1998) 733-750.
[32] S. Hartmann, A self-adapting genetic algorithm for project scheduling under resource constraints, Naval Research Logistics 49(5) (2002) 433-448.
[33] S. Hartmann, R. Kolisch, Experimental investigation of state-of-the-art heuristics for the resource-constrained project scheduling problem, European Journal of Operational Research 127 (2000) 394-407.
[34] W.S. Herroelen, Project Scheduling - Theory and Practice, Production and Operations Management, 4(4) (2005) 413-432.
[35] J.H. Holland, Adaptation in natural and artificial systems, The University of Michigan Press, Ann Arbor, 1975.
[36] C. van Hoyweghen, B. Naudts, D.E. Goldberg, Spin-flip symmetry and synchronization, Evolutionary Computation 10(4) (2002) 317-344.
[37] W.H. Ip, Y. Li, K.F. Man, K.S. Tang, Multi-product planning and scheduling using a Genetic Algorithm approach, Computers & Industrial Engineering 38 (2000) 283-296.
[38] H. Kargupta, K. Deb, D.E. Goldberg, Ordering genetic algorithms and deception, in: Parallel Problem Solving from Nature II (1992) 47-56.
[39] H. Kargupta, The gene expression messy genetic algorithm, in: Proceedings of the International Conference on Evolutionary Computation (1996) 814-819.
[40] J.E. Kelley Jr., Critical-Path Planning and Scheduling: Mathematical Basis, Operations Research 9(3) (1961) 296-320.
[41] D. Knjazew, OmeGA: A Competent Genetic Algorithm for Solving Permutation and Scheduling Problems, Kluwer, Norwell, 2002.
[42] R. Kolisch, A. Sprecher, A. Drexl, Characterization and Generation of a General Class of Resource-Constrained Project Scheduling Problems, Management Science, 41(10) (1995) 1693-1703.
[43] R. Kolisch, Efficient Priority Rules for the Resource-Constrained Project Scheduling Problem, Journal of Operations Management 14(3) (1996a) 179-192.
[44] R. Kolisch, Serial and Parallel Resource-Constrained Project Scheduling Methods Revisited: Theory and Computation, European Journal of Operational Research 90 (1996b) 320-333.
[45] R. Kolisch, S. Hartmann, Heuristic algorithms for solving the resource constrained project scheduling problem: classification and computational analysis, Handbook on Recent Advances in Project Scheduling, Kluwer, Boston, 1998.
[46] R. Kolisch, R. Padman, An integrated survey of deterministic project scheduling, International Journal of Management Science 29(3) (2001) 249-272.
[47] R. Kolisch, R. Padman, An integrated survey of deterministic project scheduling, OMEGA 29 (2001) 249-272.
[48] I. Kurtulus, E.W. Davis, Multi-Project Scheduling: Categorization of Heuristic Rules Performance, Management Science 28(2) (1982) 161-172.
[49] J.K. Lenstra, A.H.G. Rinnooy Kan, Complexity of Scheduling under Precedence Constraints, Operations Research 26(1) (1978) 22-35.
[50] M.J. Liberatore, B. Pollack-Johnson, Factors Influencing the Usage and Selection of Project Management Software, IEEE Transactions on Engineering Management 50(2) (2003) 164-174.
[51] F.S.C. Lam, B.C. Lin, C. Sriskandarajah, H. Yan, Scheduling to minimize product design time using a genetic algorithm, International Journal of Production Research 37(6) (1999) 1369-1386.
[52] G.E. Liepins, M.D. Vose, Representational issues in genetic optimization, Journal of Experimental and Theoretical Artificial Intelligence 2(2) (1990) 4-30.
[53] S. Leu, C. Yang, A GA-based multicriteria optimal model for construction scheduling, Journal of Construction Engineering and Management 125(6) (1999) 420-427.
[54] A. Lova, P. Tormos, Analysis of Scheduling Schemes and Heuristic Rules Performance in Resource-Constrained Multi-project Scheduling, Annals of Operations Research 102 (2001) 263-286.
[55] S.W. Mahfoud, Niching methods for genetic algorithms, Doctoral dissertation, University of Illinois at Urbana-Champaign, 1995.
[56] D.G. Malcolm, Application of a Technique for Research and Development Program Evaluation, Operations Research 7(5) (1959) 646-669.
[57] C. Maroto, P. Tormos, A. Lova, The Evolution of Software Quality in Project Scheduling, in: Project Scheduling: Recent Models, Algorithms and Applications, Kluwer, Boston, 1999.
[58] C. Meier, A.A. Yassine, T.R. Browning, Design Process Sequencing with Competent Genetic Algorithms, ASME Journal of Mechanical Design (Forthcoming) (2006).
[59] B.L. Miller, Noise, sampling, and efficient genetic algorithms, Doctoral dissertation, University of Illinois at Urbana-Champaign, 1997.
[60] M. Mitchell, An Introduction to Genetic Algorithms, MIT Press, Cambridge, 1996.
[61] H. Mühlenbein, How genetic algorithms really work: Mutation and hillclimbing, in: Parallel Problem Solving from Nature II (1992) 15-26.
[62] T. Murata, H. Ishibuchi, Performance evaluation of genetic algorithms for flow shop scheduling problems, in: Proceedings of the First IEEE Conference on Genetic Algorithms and their Applications (1994) 812-817.
[63] I.M. Oliver, D.J. Smith, J.R.C. Holland, A study of permutation crossover operators on the traveling salesman problem, in: Genetic Algorithms and Their Applications (1987) 227-230.
[64] J.H. Patterson, Alternative Methods of Project Scheduling with Limited Resources, Naval Research Logistics Quarterly 20(4) (1973) 767-784.
[65] J.H. Payne, Management of Multiple Simultaneous Projects: A State-of-the-Art Review, International Journal of Project Management, 13(3) (1995) 163-168.
[66] M. Pelikan, D.E. Goldberg, E. Cantú-Paz, BOA: The Bayesian Optimization Algorithm, in: Proceedings of the Genetic and Evolutionary Computation Conference (1999) 525-532.
[67] P. Pongcharoen, C. Hicks, P. M. Braiden, The development of genetic algorithms for the capacity scheduling of complex products, with multiple levels of product structure, European Journal of Operational Research 152 (2004) 215-225.
[68] P.W. Poon, J.N. Carter, Genetic algorithm crossover operators for ordering applications, Computers & Operations Research 22(1) (1995) 135-147.
[69] K. Sastry, D.E. Goldberg, Modeling tournament selection with replacement using apparent added noise, in: Proceedings of the Genetic and Evolutionary Computation Conference 11 (2001) 129-134.
[70] K. Sastry, D.E. Goldberg, Modeling tournament selection with replacement using apparent added noise, Intelligent Engineering Systems Through Artificial Neural Networks 11 (2001) 129-134.
[71] M. Spinner, Improving Project Management Skills and Techniques, Prentice-Hall, Englewood Cliffs, 1989.
[72] A. Sprecher, Solving the RCPSP efficiently at modest memory requirements, Manuskripte aus den Instituten für Betriebswirtschaftslehre. No. 425, University of Kiel, 1996.
[73] G. Syswerda, Uniform crossover in genetic algorithms, in: Proceedings of the Third International Conference on Genetic Algorithms (1989) 2-9.
[74] D. Thierens, Mixing in genetic algorithms, Doctoral dissertation, Katholieke Universiteit Leuven, 1995.
[75] J.R. Turner, The Handbook of Project-Based Management, McGraw-Hill, United Kingdom, 1993.
[76] P.R. Thomas, S. Salhi, An Investigation into the Relationship of Heuristic Performance with Network-Resource Characteristics, Journal of the Operational Research Society 48 (1997) 34-43.
[77] A.A. Yassine, D. Braha, Four Complex Problems in Concurrent Engineering and the Design Structure Matrix Method, Concurrent Engineering Research & Applications 11(3) (2003) 165-176.
[78] S. Kumanan, J. Jegan Jose, K. Raja, Multi-project scheduling using an heuristic and a genetic algorithm, International Journal of Advanced Manufacturing Technology 31 (2006) 360-366.
[79] V. Valls, F. Ballestín, S. Quintanilla, A Hybrid Genetic Algorithm for the Resource-Constrained Project Scheduling Problem, European Journal of Operational Research (2007), forthcoming.
[80] S. Tsubakitani, R. Deckro, A heuristic for multi-project scheduling with limited resources in the housing industry, European Journal of Operational Research 49 (1990) 80-91.