Multi-Project Scheduling using Competent Genetic Algorithms
Ali A. Yassine+ Christoph Meier Tyson R. Browning Department of Industrial &
Enterprise Systems Engineering University of Illinois at Urbana
Urbana, IL 61801 USA [email protected]
Institute of Astronautics Technische Universität München
85748 Garching, Germany [email protected]
Neeley School of Business Texas Christian University
Fort Worth, TX 76129 USA [email protected]
University of Illinois
Department of Industrial & Enterprise Systems Engineering (IESE)
Working Paper
Current version: Feb. 1, 2007
+ Corresponding author. The second author is grateful for support from the Bavarian Science Foundation.
Nomenclature (Project Scheduling)
m: Number of projects
n: Total number of activities
nh: Number of activities in project h
h: Index of project
Vh: Set of activities {a(1)…a(nh)} in project h
di: Processing time for activity i
i→j: Predecessor relationship
Pred(i): Set of predecessors for activity i
Rk: Set of renewable resources of type k
rihk: Per-period usage by activity i of resource k in project h
Kh: Number of types of resources used by project h
θ: Set of feasible schedules
θT: Set of precedent-feasible schedules
θR: Set of resource-feasible schedules
Cmax: Vector of task completion times
ARLF: Average Resource Loading Factor
AUF: Average Utilization Factor
CPh: Non-resource-constrained critical path duration of project h
Xiht: Boolean variable, true (equal to 1) if activity i of project h is active at time t
Ziht: Equal to -1 if activity i of project h is active at time t ≤ CPh/2; otherwise equal to 1
S: Number of time intervals spanning a problem

Nomenclature (Genetic Algorithm)
SGA: Simple Genetic Algorithm
CGA: Competent Genetic Algorithm
BB: Building block
c: Constant factor
kBB: Order or size of BBs
l: Current chromosome/problem length
mBB: Number of BBs
npop: Population size
pcut: Cut probability
pκ: Bitwise cut probability
pm: Mutation probability
ps: Splice probability
s: Size of tournament used in the Tournament Selection phase
μ: Calibration coefficient
Φp: Phenotypic search space
Φg: Genotypic search space
ABSTRACT
In a multi-project environment, many projects must be completed using a common pool of
scarce resources. In addition to the resource constraints, precedence relationships exist among the
activities of individual projects. This project scheduling problem is NP-hard, and most practical solutions that can
handle large problem instances rely on priority-rule heuristics and meta-heuristics rather than optimal
solution procedures. In this paper, a Competent Genetic Algorithm (CGA), hybridized with a local search
strategy, is proposed to minimize the overall duration or makespan of the resource constrained multi-
project scheduling problem (RCMPSP) without violating inter-project resource constraints or intra-
project precedence constraints. The proposed Genetic Algorithm (GA) with several varied parameters is
tested on sample scheduling problems generated according to two popular multi-project summary
measures, Average Utilization Factor (AUF) and Average Resource Load Factor (ARLF). The
superiority of the proposed CGA over simple GAs and well-known heuristics is demonstrated.
Keywords: Multi-Project Scheduling, Resource Constraints, Competent Genetic Algorithm, Heuristic
Priority Rules
1. INTRODUCTION
Due to increasingly impatient customers and competitive threats, improvements in the efficiency
with which projects are completed and new products are brought to market have become increasingly
important. To complicate matters further, many organizations face the challenge of managing
the simultaneous execution of a portfolio of projects under tight time and resource
constraints. In such an environment, project management and scheduling skills become critical to
the organization. “Multi-project environments seem to be quite common in project scheduling practice….
It has been suggested [65,75] that up to 90%, by value, of all projects are carried out in the multi-project
context, and thus the impact of even a small improvement in their management on the project
management field could be enormous” [34].
In this paper, we address the case of a portfolio of simultaneous projects with identical start times.
Each project consists of precedence-constrained activities that draw from common pools of resources,
which are usually not large enough for all of the activities to work concurrently. In such cases, which
activities should get priority? The goal is to prioritize them so as to optimize an objective function, such
as minimizing the delay of each project or of the whole portfolio. Such is the basic resource-constrained
multi-project scheduling problem (RCMPSP).
In a RCMPSP environment, a company has m concurrent projects P1…Pm, each comprised of a set
of activities Vh = {a(1)…a(nh)}, where nh specifies the total number of activities in project Ph. In addition,
any activity i has several associated attributes, such as its duration di and the types and amounts of
resources required. Each project will have a corresponding precedence network whose structure is often
depicted by an activity-on-node network. The predecessor relationship between two activities i and j is
denoted by i→j or (i,j), and the entire set of predecessors of activity j is denoted by the term Pred(j).
Although projects may be unrelated by precedence constraints, they depend on a common pool of
resources and are therefore related by resource constraints. We consider a set of renewable resources,
where the per-period usage by activity i of resource k in project h is written as rihk. Rk denotes
the constant amount of resource k available during every time period. We assume that a resource must be
devoted to an activity until it is completed before beginning another activity (i.e., no preemption is
allowed). Moreover, we assume the single-mode case, where a single resource type is assigned for
performing a particular activity, and the processing time di and the resources required rihk for any
activity i are fixed. When considering the set of feasible project schedules θ = θT ∩ θR, where θT denotes
the set of precedent-feasible schedules and θR denotes the set of resource-feasible schedules, there exist
many possible θs and many potential objectives for choosing between them. If n defines the total number
of activities in all m projects, popular objectives include minimizing the maximum project makespan
(i.e., minimizing Cmax = max{d1…dn}), maximizing the net present value (NPV), maximizing resource
leveling, or minimizing project costs [8,46].
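As a concrete illustration of these definitions, the non-resource-constrained critical path duration CPh of a project is the longest path through its activity-on-node network. The following sketch is our own illustration, not from the paper; the function name and data layout are hypothetical:

```python
# Compute CPh: the longest precedence-constrained path through a project,
# ignoring resource limits entirely.

def critical_path_length(durations, preds):
    """durations: {activity: di}; preds: {activity: set of predecessors}.
    Returns the length of the longest path through the network."""
    finish = {}

    def finish_time(i):
        # Earliest finish of i = di + latest finish among its predecessors.
        if i not in finish:
            finish[i] = durations[i] + max(
                (finish_time(p) for p in preds.get(i, ())), default=0)
        return finish[i]

    return max(finish_time(i) for i in durations)

# Toy project: activities 1 and 2 both precede activity 3.
cp = critical_path_length({1: 2, 2: 4, 3: 3}, {3: {1, 2}})  # cp == 7
```

With resource constraints added, schedules can only be as short as or longer than this bound, which is what the lateness measures below compare against.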
This paper utilizes a new genetic algorithm (GA) approach to solve the RCMPSP; however, in
contrast to previous research, we use a state-of-the-art competent GA (CGA) design. This design differs
from simple GA (SGA) strategies as it strives to identify highly fit hyperplanes in the search space, called
building blocks (BBs), which are subsequently combined in a sophisticated manner. The resulting project
schedules are expected to be superior compared to those produced by priority rule heuristics or SGAs. To
cope with the precedence and resource constraints in the RCMPSP, we introduce an efficient repair
mechanism that ensures feasibility throughout the search. We further enhance the performance of the
CGA with a local search strategy tailored for the RCMPSP. The performance of this approach is
thoroughly tested on 77 test problems, carefully constructed according to two popular multi-project
summary measures, the Average Resource Load Factor (ARLF) and the Average Utilization Factor
(AUF), proposed by Kurtulus and Davis [48]. We find that, in combination with the proposed CGA, the
parallel schedule generation scheme (SGS) outperforms the self-adapting SGS.
Furthermore, the proposed CGA outperforms many well-known priority-rule-based heuristics in 78% of
the problems.
The paper proceeds as follows. Section 2 provides background on activity scheduling and SGAs,
after which §3 presents the design and benefits of a CGA for combinatorial problems and §4 describes its
tailoring to RCMPSPs. §5 evaluates the performance of the proposed CGA compared with a SGA on the
RCMPSP test bank. In §6, we present comparative results of the CGA with 20 popular heuristic priority
rules used in the literature. The paper concludes in §7 with a brief summary of the work completed and
possible extensions for future work.
2. BACKGROUND
Project scheduling is of great practical importance and its general model can be used for applications
in product development, as well as production planning and a wide variety of scheduling applications.
Early efforts in project scheduling focused on minimizing the overall project duration (makespan)
assuming unlimited resources. Well-known techniques include the Critical Path Method (CPM) [40] and
the Project Evaluation and Review Technique (PERT) [56]. Scheduling problems have been studied
extensively for many years by attempting to determine exact solutions using methods from the field of
operations research [46].
It has been shown that the scheduling problem subject to precedence and resource constraints is
NP-hard [49], which means that exact methods are too time-consuming and inefficient for solving the
large problems found in real-world applications. There exist benchmark instances with as few as 60
activities that have not been solved to optimality [31]. Kolisch [46] surveyed a number of techniques
developed for resource-constrained project scheduling, including dynamic programming, zero-one
programming, and implicit enumeration with branch and bound. Some examples of exact solution
methods can be seen in [8,15,16,72]. Among them, the branch and bound approach is the most widely
applied. However, its depth-first or breadth-first searches cannot exhaustively explore a large-scale
project scheduling problem. Simulation modeling provides another angle on the RCMPSP: a
simulation model has been proposed for multi-project resource allocation that interprets it as multi-channel queuing
[20]. The inherent drawbacks of simulation are its time and cost, as well as its reliance on a particular
simulation language, which can hinder its dissemination. Finally, many different heuristic approaches have been
developed to solve intractable problems quickly, efficiently, and fairly satisfactorily. A survey of
heuristic approaches can be found in [8,45].
GAs, first proposed in [35], are adaptation procedures based on the mechanics of genetics and
natural selection. These algorithms are designed to act as problem-independent algorithms, which is at
times contrary to the field of Operations Research, where algorithms are often matched to problems. In
brief, a simple GA works as follows. A feasible instance of the underlying problem is encoded as a so-
called chromosome, while multiple chromosomes form a GA population. By selecting the fittest
chromosomes (i.e., the ones with the highest value according to an objective function) and applying the
Figure 1: Simple GA Flowchart
ordinary genetic operators—selection, crossover, and mutation—the population is expected to improve
over time. The GA proceeds until a predefined convergence criterion is reached. Figure 1 depicts this
simple circular flow.
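The loop just described can be sketched in a few lines. This is a minimal, illustrative SGA, our own sketch rather than any published implementation; the function name and parameter values are assumptions, and the convergence criterion is simplified to a fixed generation count. It is shown here on the one-max problem (maximize the number of ones in a bit string):

```python
import random

def simple_ga(fitness, length, pop_size=30, p_mut=0.05, generations=50, seed=0):
    """Minimal SGA: tournament selection, one-point crossover, bitwise
    mutation, repeated for a fixed number of generations. pop_size must be even."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter of two randomly drawn chromosomes.
        parents = [max(rng.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        nxt = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = rng.randrange(1, length)            # one-point crossover
            nxt += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        for chrom in nxt:                             # bitwise mutation
            for i in range(length):
                if rng.random() < p_mut:
                    chrom[i] ^= 1
        pop = nxt
    return max(pop, key=fitness)

best = simple_ga(sum, 20)   # one-max: fitness is the number of ones
```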
In terms of the scheduling problem, GAs were first used by Davis [13]. Since then, a vast literature
on the application of GAs to various scheduling problems has emerged [10,17,19,37,67,78]. With a
special focus on a single-project problem1, Hartmann developed a GA with permutation-based encoding
[31] and introduced a self-adapting representation scheme which automatically determines the best
schedule decoding procedure [32]. Gonçalves et al. [27] used a SGA approach for the RCMPSP based on
a random key chromosome encoding and a schedule generation procedure which creates so-called
parameterized active schedules. Recently, Valls et al. [79] proposed a hybrid GA tailored to the RCPSP
with specific crossover and local search operators. Despite slight modifications of the GA flow
illustrated in Figure 1, past and recent publications on the application of GAs to the RCMPSP/RCPSP
nevertheless have the SGA design in common. We will explain in the subsequent sections why a SGA
design, even with improvements through local search operators or specific crossover and mutation
operators, is eventually inferior to a state-of-the-art CGA design.
1 This problem is known as the resource constrained project scheduling problem (RCPSP).
3. DESIGN AND IMPLEMENTATION OF A COMPETENT GA
Past research on GA applications to scheduling problems is characterized by using a SGA design as
illustrated in Figure 1. This design can be enhanced by several techniques. Some of these improvements
include niching [55], parallelization [9], or the hybridization of the GA, which is oriented more towards
global searches, with an efficient local search strategy [60]. Despite all of the well-known enhancement
techniques, a SGA consistently fails to provide adequate solutions as problem difficulty increases [23].
A mathematical explanation for this statement is the dimensional analysis of the building block (BB)
mixing process in SGAs [74]. BBs are useful hyperplanes in the search space, called schemata, which
can be understood as portions of a solution contributing much to the global optimum [22]. Goldberg [22]
claims that the constant juxtaposition and selection of BBs forms better and better solutions over time,
leading to a global optimum in a search space. In a study of the BB mixing process, Thierens [74]
pointed out the exponential scale-up in computational expense with linearly increasing problem
difficulty. He defined the following equation for the population size necessary for a successful GA:
ln(npop) + ln(ln(npop)) + μ > (kBB · mBB · ln 2) / c + 2.5 · ln(2 · c · s · p · m)   (1)
where kBB denotes the BB order/size, mBB corresponds to the number of BBs, npop is the population size, μ
can be regarded as a calibration coefficient, and c is a constant. Since problem difficulty can be expressed
in terms of the length and order of the BBs, equation (1) demonstrates its negative effects in terms of
computational resources.
To tackle the mixing problem in a SGA, several so-called CGAs were developed. Three different
approaches to CGAs can be distinguished: (a) Perturbation techniques, (b) Linkage adaptation
techniques, and (c) Probabilistic model-based techniques. Examples of the first approach include the fast
messy GA (FMGA) [26], the ordering messy GA (OmeGA) [41], and the gene expression messy GA
(GEMGA) [39]. An example of the second approach is the linkage learning GA, introduced by Harik
[28]. The third approach includes the compact genetic algorithm [29] and the Bayesian optimization
algorithm (BOA) [66]. At present, OmeGA is the only CGA constructed for combinatorial problems like
scheduling or the quadratic assignment problem. Essentially, OmeGA combines the FMGA with random
keys to represent permutations. Empirical tests by Knjazew [41] on artificial test functions show a
promising sub-quadratic scale-up behavior of resources with problem length, O(l^1.4), where l = kBB·mBB.
Therefore, in the remainder of this section, we present the OmeGA and its application to the RCMPSP.
3.1. Components of the OmeGA
3.1.1. Data Structure
Using any GA as an optimization technique requires the existence of a proper data structure for
manipulation. Each instance of this structure represents one point in the space of all possible solutions. In
the context of GAs, this data structure is usually called a chromosome, which is a juxtaposition of genes.
Genes occur at different locations or loci of the chromosome and have values which are called alleles.
While the term genotype refers to the specific genetic makeup of an individual in natural systems and
corresponds to the encoded structure in a GA, the term phenotype corresponds to the decoded structure,
which can be regarded as one point in the search space. To facilitate linkage learning by permitting genes
to move around the genotype, the OmeGA uses a different representation technique than ordinary GAs:
each gene is tagged with its location via the pair representation <locus, allele>. For example, the two
messy chromosomes shown in Figure 2 both represent the permutation (1-34-89-15-13-19). Such
permutations constitute a schedule priority list for the RCPSP.
Figure 2: Illustration of a Messy Chromosome
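Because each gene carries its own locus, decoding a fully specified messy chromosome is simply a sort on the locus tags. A minimal sketch (the helper name is our own), using the two equivalent chromosomes of Figure 2:

```python
# Sorting (locus, allele) pairs by locus recovers the permutation regardless
# of the order in which the genes appear on the chromosome.

def decode_messy(genes):
    return [allele for locus, allele in sorted(genes)]

# The two equivalent chromosomes of Figure 2:
a = [(5, 13), (1, 1), (4, 15), (2, 34), (6, 19), (3, 89)]
b = [(3, 89), (1, 1), (6, 19), (5, 13), (4, 15), (2, 34)]
# Both decode to the permutation (1-34-89-15-13-19).
```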
Messy chromosomes may also have a variable length; they can be overspecified or underspecified.
As an example, consider the chromosomes in Figure 3. The problem length is 6 in this example, so the
first chromosome is overspecified since it contains an extra gene. The second chromosome is
underspecified because it contains only three genes. To handle overspecification, Goldberg [26] proposes
a gene expression operator that employs a first-come-first-served rule on a left-to-right scan. In the
example of Figure 3, the gene assigned to locus 1 occurs twice in chromosome A. Thus, the left-to-right-
scan drops the second instance, obtaining the valid permutation (1-34-89-15-13-19) instead of (99-34-89-
15-13-19). In the case of underspecification, the unspecified genes are filled in using a competitive
template, which is a fully specified chromosome from which any missing genes are directly inherited. At
the start of the OmeGA, the genes of the competitive template are randomly chosen in consideration of
feasibility issues. For example, using the competitive template shown in Figure 4, the underspecified
chromosome B is completed by inheriting genes 4 through 6 from the competitive template.
Figure 3: Messy Chromosomes May Have Variable Length
Figure 4: Use of a Competitive Template on Underspecified Chromosomes
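The first-come-first-served scan and the competitive template can be combined into one small expression step. The sketch below is our own illustration; since Figures 3 and 4 are not reproduced in this transcript, the gene values in the examples are assumed:

```python
def express(genes, template, length):
    """Gene expression for messy chromosomes: a left-to-right scan keeps the
    first allele seen for each locus (first-come-first-served), and any locus
    still unspecified inherits its allele from the competitive template."""
    expressed = {}
    for locus, allele in genes:
        expressed.setdefault(locus, allele)   # later duplicates are dropped
    for locus in range(1, length + 1):
        expressed.setdefault(locus, template[locus - 1])
    return [expressed[locus] for locus in range(1, length + 1)]

# Overspecified: locus 1 appears twice; the first occurrence (allele 1) wins,
# yielding (1-34-89-15-13-19) rather than (99-34-89-15-13-19).
over = [(1, 1), (2, 34), (1, 99), (3, 89), (4, 15), (5, 13), (6, 19)]
# Underspecified: loci 4 through 6 are inherited from the template.
under = [(1, 1), (2, 34), (3, 89)]
template = [7, 7, 7, 15, 13, 19]
```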
SGAs for combinatorial problems typically utilize an integer encoding for the chromosomes,
representing alleles as ordinary integer values. Various GA operators have therefore been developed to
prevent duplicate alleles in the population when using integer encoding [22,68].2
In contrast, the OmeGA uses a binary coding representation where messy chromosomes are encoded
through random keys [2], as demonstrated in Figure 5. Each gene on the chromosome is assigned a
number ri∈[0,1]. Then, the permutation sequence is determined by sorting the genes in ascending order
of their associated random keys. This encoding has the advantage that any crossover operator can be
used, since random keys always produce duplicate-free solutions for combinatorial problems. Moreover,
information about partial relative ordering is preserved in crossover [41]. Floating point numbers are
typically used as ascending sorting keys to encode a permutation. These numbers are initially determined
randomly and change only under the influence of mutation and crossover. Accordingly, a permutation of
length l consists of a vector r = {r1, r2, ..., rl}.
2 These operators do not ensure predecessor- or resource-feasibility.
Figure 5: Demonstration of Random Keys
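Random-key decoding itself is a single sort. A minimal sketch (hypothetical function name; key values in the example are arbitrary):

```python
def permutation_from_keys(keys):
    """Random-key decoding: activity indices (1-based) sorted in ascending
    order of their keys. Distinct keys always yield a duplicate-free
    permutation, so any crossover on the keys keeps solutions valid."""
    return sorted(range(1, len(keys) + 1), key=lambda i: keys[i - 1])

perm = permutation_from_keys([0.46, 0.91, 0.33, 0.75, 0.51])  # [3, 1, 5, 4, 2]
```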
3.1.2. Fitness Function
The second component of any GA is the fitness function. Every optimization technique must be able
to assign a measure of quality to each structure in the search space to distinguish good and bad results.
For this purpose, GAs use a fitness function to assign each individual chromosome a fitness value.
Generally, an optimization problem can be decomposed into a genotype-phenotype mapping fg and a
phenotype-fitness mapping fp [52]. Assuming a genotypic search space Φg, which can be either discrete
or continuous, the fitness function f assigns each element in Φg a real value: f(xg): Φg → R.
According to this decomposition, the genotype-phenotype mapping occurs first, where the
genotype elements are mapped to elements in the phenotypic search space Φp: fg(xg): Φg → Φp.
Subsequently, the phenotype-fitness mapping is applied: fp(xp): Φp → R. Thus, the fitness
function can be regarded as the composition of both mappings: f = fp ∘ fg, i.e., f(xg) = fp(fg(xg)). Due to the
frequent fitness function evaluations, its efficient implementation is crucial for gaining adequate
computational processing times. One opportunity for speeding up this operation is the use of
parallelization techniques [9], which can be applied to some extent in the OmeGA.
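The decomposition f = fp(fg(xg)) can be expressed directly as function composition. In the sketch below, a toy example of our own, the genotype is a random-key vector, the phenotype is the (0-based) permutation it encodes, and the fitness simply counts adjacent ordered pairs:

```python
def compose_fitness(f_g, f_p):
    """Fitness as the composition f(xg) = fp(fg(xg)) of the genotype-
    phenotype mapping fg and the phenotype-fitness mapping fp."""
    return lambda x_g: f_p(f_g(x_g))

# Toy instantiation: genotype = random keys, phenotype = the permutation
# they encode, fitness = number of adjacent ordered pairs in the phenotype.
f_g = lambda keys: sorted(range(len(keys)), key=lambda i: keys[i])
f_p = lambda perm: sum(a < b for a, b in zip(perm, perm[1:]))
f = compose_fitness(f_g, f_p)
value = f([0.1, 0.2, 0.9, 0.4])   # phenotype [0, 1, 3, 2] -> fitness 2
```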
In their comprehensive survey of the project scheduling literature, Kolisch and Padman [46] provide
an overview of objectives for scheduling problems, including traditional ones such as makespan and cost
minimization but also more recent ones like maximization of project quality. We use total project
lateness as the RCMPSP performance measure to be minimized, so we use the following fitness function
for the OmeGA:
min Σ(z=1..m) (DRz − Dz)   (2)

where DRz corresponds to the duration of project z under consideration of the resources available and Dz
denotes the duration of project z neglecting any resource constraints.
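A direct computation of the total project lateness objective of equation (2) might look as follows (a sketch with hypothetical names; each project's duration is a scalar here):

```python
def total_lateness(resource_constrained, unconstrained):
    """Sum over the m projects of DRz - Dz, where DRz is the resource-
    constrained duration of project z and Dz its unconstrained duration."""
    return sum(d_r - d for d_r, d in zip(resource_constrained, unconstrained))

# Three projects; the first and third are delayed by resource contention.
lateness = total_lateness([12, 9, 20], [10, 9, 15])   # 2 + 0 + 5 == 7
```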
3.2. Mechanics of the OmeGA
CGAs operate in a fundamentally different way from SGAs (shown in Figure 1). Figure 6 shows the
overall flow of the OmeGA, which iterates over epochs, each of which contains two loops, an inner and
an outer. The outer loop is called an era and corresponds to one BB level.3 The inner loop consists of
3 A BB “level kBB” denotes the processing of BBs of maximum size kBB.
three stages: (1) Initialization Phase, (2) BB Filtering / Primordial Phase, and (3) Juxtapositional Phase.
An important property of any good optimization technique is the ability to avoid local optima. For this
purpose, the OmeGA uses level-wise processing. After each era (level), the best solution found so far is
stored and used as the competitive template for the next era. We will now explain the three phases of
each era in more detail.
3.2.1. Initialization Phase
In the original study of the messy GA, the authors claimed that a population size with a single copy
of all substrings of order/length kBB (ensuring the presence of all BBs) is necessary to detect solutions
near the global optimum [24]. But the population size needed to guarantee such a copy is extremely high:
npop = 2^kBB · (l choose kBB) = 2^kBB · l! / (kBB! (l − kBB)!)   (3)
This equation results from the number of ways to choose BBs of order kBB from a string of length l,
multiplied by the number of different BB combinations (assuming a binary alphabet).
This exponential demand of resources would cause a serious problem, known as an initialization
bottleneck, but Goldberg et al. [26] found a way to overcome it. They predicted the theoretical population
size for a probabilistically complete initialization and found only a linear scale-up with BB number and
size. They verified this formula empirically on artificial functions. Unfortunately, the BB structure (size
and number of BBs) of the problem is usually unknown. Hence, in practice, population size is normally
determined by empirical tests. Nevertheless, we can estimate the required population size for a successful
convergence of the OmeGA, since its design is identical to that of the fast messy GA except for its
encoding scheme. During the Initialization Phase, the length of the chromosomes can be chosen
Figure 6: The Flow of the OmeGA
arbitrarily in the interval between kBB and l. Note that in each era a completely new population is
initialized. The size of the population at level (or era) t, assuming a starting level of 1, is popt, and the
overall population size is the sum of the population sizes for each era: Σ(i=1..t) popi.
3.2.2. Building Block Filtering Phase
A common characteristic of every CGA is a way of identifying the BBs. According to the BB
Hypothesis [22], these specific hyperplanes in the search space can be subsequently combined to form
solutions near the global optimum. In the OmeGA the BB Filtering Phase performs this essential task
through repeated selection and random deletion of genes.
At the beginning of the BB Filtering Phase, which is depicted in Figure 7, chromosomes arrive from
the Initialization Phase with a length almost equal to the problem length. First, a Selection Segment
probabilistically filters highly fit chromosomes from less fit ones. The current competitive template is
used to complete the genotype of under- or overspecified chromosomes. Then, in a Length Reduction
Segment, random deletion cuts chromosomes down to a length equal to the current BB level. Assuming
the Selection segment provides sufficiently good genes, the random deletion is not expected to destroy all
the BBs. The mathematical equations calculating the number of selections and deletions to be performed
can be found in [24,41]. As a selection scheme, we use Tournament Selection without replacement and a
tournament size of 4, for the reasons given in [54,64].
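Tournament selection without replacement can be sketched as follows (our own illustration, not the paper's code): the population is shuffled and partitioned into disjoint tournaments of size s, and each tournament contributes its fittest member. One pass yields len(pop)/s winners, so in practice the pass is repeated s times to refill the population:

```python
import random

def tournament_without_replacement(pop, fitness, s=4, rng=random):
    """One pass: shuffle, partition into disjoint tournaments of size s,
    keep each tournament's fittest member."""
    order = list(pop)
    rng.shuffle(order)
    return [max(order[i:i + s], key=fitness)
            for i in range(0, len(order) - s + 1, s)]

winners = tournament_without_replacement(list(range(16)), lambda x: x,
                                         s=4, rng=random.Random(7))
# 16 individuals form 4 tournaments of 4; the best individual always survives.
```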
3.2.3. Juxtapositional Phase
In addition to identification, BBs must be properly recombined. For this purpose, the OmeGA
incorporates a Juxtapositional Phase, which corresponds to the crossover stage in SGAs. Instead of
traditional crossover operators like uniform crossover [73], the OmeGA uses cut and splice operators
[24], as demonstrated in Figure 8. The cut operator divides a chromosome into two parts with cut
probability pcut = pκ (l – 1), where l is the current length of a chromosome and pκ is a specified bitwise cut
probability. Knjazew [41] suggests keeping l ≤ 2n, giving a pκ set to the reciprocal of half of the
problem length, pκ = 2 / n. In contrast to the cut operator, the splice operator connects two chromosomes
with probability ps, where ps is usually chosen rather high. The OmeGA thereby combines the identified
BBs to find the optimum in exactly the same manner suggested by the BB Hypothesis. Note that the
population size for each Juxtapositional Phase is held constant.
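A sketch of the cut and splice operators (hypothetical function names; a chromosome is any sequence of <locus, allele> genes, and we do not model the Juxtapositional Phase's bookkeeping):

```python
import random

def cut(chrom, p_kappa, rng=random):
    """Cut a chromosome of current length l into two parts with probability
    p_cut = p_kappa * (l - 1); the cut point is chosen uniformly."""
    l = len(chrom)
    if l > 1 and rng.random() < p_kappa * (l - 1):
        point = rng.randrange(1, l)
        return chrom[:point], chrom[point:]
    return (chrom,)

def splice(a, b, p_s, rng=random):
    """Splice: concatenate two chromosomes with (usually high) probability p_s."""
    return a + b if rng.random() < p_s else a

parts = cut([(1, 7), (2, 3), (3, 9), (4, 1)], p_kappa=1.0)  # p_cut = 3: always cuts
joined = splice([(1, 7)], [(2, 3)], p_s=1.0)                # always splices
```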
Figure 7: Illustration of the BB Filtering Phase (panels: chromosomes with their fitness values entering from the Initialization Phase; chromosomes after the Selection Segment; chromosomes after the Length Reduction Segment)
4. APPLYING THE OmeGA TO PROJECT SCHEDULING PROBLEMS
We have found that it is important to be careful in applying any GA to a RCMPSP, because slight
nuances in the GA’s attributes can have important implications for solution quality and efficiency.
Figure 8: Examples of Cut and Splice Operators
Therefore, we use this section to provide some details of the OmeGA’s application to the RCMPSP.
4.1. Preserving Predecessor Feasibility
Applying the OmeGA to the RCMPSP requires addressing the issue of predecessor feasibility. The
outcome of the OmeGA is a permutation of the priority list for all activities to be scheduled. Based on
this list, we explain in the next section how a so-called schedule generation scheme (SGS) builds a real
schedule with the start and finish times for each activity. However, a prerequisite for any SGS is a
precedent-feasible schedule list. Since the mechanics4 and the random key encoding in the OmeGA
merely ensure the prevention of duplicate alleles in chromosomes, an additional, efficient repair
mechanism is required to transform any schedule list into a precedent-feasible one.
In the case of predecessor violation, a straightforward repair strategy is to iterate the processing
step which caused the violation (e.g., the Juxtapositional Phase) until all constraints are satisfied.
Considering the great discrepancy between the immense number of possible permutations (n! for n
activities) and the number of feasible solutions, it becomes obvious that such a brute force method would
be too time consuming. Therefore, we handle predecessor constraints in another deterministic and
efficient way, using a repair mechanism prior to schedule construction and fitness function assignment.
Predecessor conflicts cannot occur between activities which can be executed concurrently—i.e., activities
which do not rely on predecessor information at the same point in time Ti. Assuming we start with an
empty schedule list at time T0, we can calculate the set of parallel activities at T0 and subsequently pick
an activity out of this set according to a deterministic strategy. The chosen activity is then appended at
the end of the current priority list and all its relationships within the network of activities are deleted.
Repeating this procedure until all activities have been assigned to a spot in the schedule list, we will
never violate any predecessor constraints.
The pseudo-code shown in Figure 9 describes the repair mechanism in more detail. As input, the
algorithm needs the permutation to be mapped, q, the number of projects, m, the number of activities, n,
the three-dimensional array of all projects DSM[m][n][n] (explained below), and two auxiliary variables,
i and j. The output is a precedent-feasible schedule list, s. The algorithm identifies the first activity of q
without precedent activities and assigns it to spot i in s. Then, all dependencies on the selected activity
4 In particular, the crossover operators that preserve predecessor feasibility in SGA designs, such as Union Crossover [9], cannot be applied in the OmeGA and are replaced by cut and splice operators.
are deleted from the design structure matrix (DSM).5 This simple algorithm scales up in complexity
O(n^2).
As an example, consider two projects, each with four activities, modeled by the two DSMs in Figure
10(a) and a permutation representing a schedule list in Figure 10(b). The DSMs indicate the precedence
relationships between the activities—e.g., activity 1 precedes activity 4 and activity 2 precedes activity 3.
The permutation q = {3-7-1-8-5-4-2-6} does not yield a precedent-feasible schedule since, for instance,
activity 3 is scheduled prior to activity 2. Applying the algorithm in Figure 9 leads to the following
results. Activities 1, 2, 5, and 6 do not depend on any other activities in the set and thus comprise the
initial set of parallel activities. The earliest value in q which also occurs in this set is 1. Thus, the first
value of the feasible schedule list s must be 1: s[0] = 1. After deleting the row and column for activity 1
in DSM 1, the next iteration of the algorithm begins, detecting a new set of parallel activities: {2,4,5,6}.
In this set, activity 5 is the earliest one in q, and consequently s[1] = 5 holds. The row and column for
activity 5 are then deleted and a new loop starts. Repeating all steps of the algorithm until convergence,
we obtain the precedent-feasible schedule list in Figure 10(c), s = {1-5-7-8-4-2-3-6}.
5 A DSM is an efficient and commonly used method of showing the relationships among the activities in a project [5]. Essentially, it can be understood as the precedence matrix representation of an activity-on-node network. Given a set of n activities in a project, the corresponding DSM is an n × n matrix where the activities are the diagonal elements and the off-diagonal elements indicate the precedence relationships.
Input:  Integer i, j, m, n; ScheduleList q[n]; Array DSM[m][n][n]
Output: Feasible schedule list s[n]

i ← 0; j ← 0; s ← new ScheduleList[n];
WHILE i < n
    FOR j ← 0 TO n-1
        IF q[j].numberOfPredecessors = 0 AND q[j].isScheduled = false THEN
            s[i] ← q[j];
            BREAK;
        ENDIF
    ENDFOR
    FOR j ← 0 TO DSM[s[i].projectID].length-1
        DSM[s[i].projectID][j][s[i].columnID] ← 0;
    ENDFOR
    q[s[i]].isScheduled ← true;
    i ← i+1;
ENDWHILE
Figure 9: Pseudo-Code for Mapping any Permutation to a Feasible Schedule List
4.2. Schedule Generation Schemes
To assign a fitness value (according to a predefined objective function) to a permutation (schedule
list) in the OmeGA, a schedule generation scheme (SGS) is necessary for building a real schedule out of
a schedule list. Boctor [4] distinguishes between a “serial” and a “parallel” SGS. In a serial SGS, each
activity’s priority is calculated once, at the beginning of the SGS algorithm, whereas in a parallel SGS an
activity’s priority is re-determined as necessary at each time step.
The serial SGS proceeds as follows. First, the overall problem duration is broken down into N
stages, where N is the total number of activities to be scheduled. This SGS separates the activities into
two disjoint sets: the scheduled set, S (already scheduled activities), and the
decision set, D (unstarted activities that depend only on activities in S). In each stage, one activity is
selected from D and scheduled at its earliest precedence- and resource-feasible start time [40], which
moves it to set S.
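A minimal serial SGS can be sketched in Python as follows. This is an illustrative simplification (a single renewable resource of capacity `cap`; all names are ours, not from [40]):

```python
from collections import defaultdict

def serial_sgs(acts, preds, schedule_list, cap):
    """Serial SGS sketch for one renewable resource of capacity cap.

    acts: {id: (duration, resource_use)}; preds: {id: set of predecessors};
    schedule_list: activity ids, highest priority first.
    Returns {id: finish time}.
    """
    finish, scheduled = {}, set()
    usage = defaultdict(int)               # time unit -> resource units in use
    while len(scheduled) < len(acts):
        # decision set D: unscheduled activities with all predecessors scheduled;
        # pick its highest-priority member
        a = next(x for x in schedule_list
                 if x not in scheduled and preds[x] <= scheduled)
        d, r = acts[a]
        t = max((finish[p] for p in preds[a]), default=0)
        # push the start right until capacity suffices over the whole duration
        while any(usage[u] + r > cap for u in range(t, t + d)):
            t += 1
        for u in range(t, t + d):
            usage[u] += r
        finish[a] = t + d
        scheduled.add(a)
    return finish

acts = {"A": (2, 2), "B": (2, 2), "C": (1, 2)}
preds = {"A": set(), "B": set(), "C": set()}
print(serial_sgs(acts, preds, ["A", "B", "C"], cap=3))
# finishes: A at 2, B at 4, C at 5 (makespan 5)
```

Each activity is placed at its earliest precedence- and resource-feasible start, one activity per stage, exactly as described above.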
The parallel SGS proceeds as follows. First, the overall problem duration is broken down into time
steps. At each time step, the algorithm separates the activities into four disjoint
sets: the complete set, C (finished activities), the active set, A (ongoing, “already scheduled” activities),
the decision set, D (unstarted activities that depend only on activities in C), and the ineligible set, I
(activities which depend on activities in A or D). Since preemption is not allowed, the SGS automatically
assigns resources to activities in A. If the remaining resources are sufficient to perform the activities in D,
then the algorithm adds these to A. If not, then it uses a priority rule to rank the activities in D. The
highest-ranking activities are added to A as resources allow. The time step ends when the shortest activity
(or activities) in A finishes. Finished activities are moved to C, and activities in I are checked for
potential transfer to D. The schedule is complete (i.e., the project duration is known) when all activities
are in C.
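A corresponding parallel SGS sketch, under the same single-resource simplification and with a shortest-operation-first rule standing in for the priority rule, might look like this (again an illustration, not the authors' implementation):

```python
def parallel_sgs(acts, preds, rank, cap):
    """Parallel SGS sketch for one renewable resource of capacity cap.

    acts: {id: (duration, resource_use)}; preds: {id: set of predecessors};
    rank: priority-rule key function (lower value = higher priority).
    """
    t, finish = 0, {}
    done, active = set(), {}               # active: id -> finish time
    while len(done) < len(acts):
        free = cap - sum(acts[a][1] for a in active)
        # decision set D: unstarted activities whose predecessors are complete
        D = sorted((a for a in acts
                    if a not in done and a not in active and preds[a] <= done),
                   key=rank)
        for a in D:                        # start by priority while capacity lasts
            d, r = acts[a]
            if r <= free:
                active[a] = t + d
                free -= r
        t = min(active.values())           # time step ends at the next completion
        for a in [x for x, f in active.items() if f == t]:
            done.add(a)
            finish[a] = active.pop(a)
    return finish

acts = {"A": (2, 2), "B": (2, 2), "C": (1, 2)}
preds = {"A": set(), "B": set(), "C": set()}
sof = lambda a: acts[a][0]                 # shortest-operation-first stand-in rule
print(parallel_sgs(acts, preds, sof, cap=3))  # C finishes at 1, A at 3, B at 5
```

Note that, unlike the serial scheme, the decision set is re-ranked at every time step, so the same schedule list can yield a different schedule.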
Figure 10: Two Projects (a) Modeled by DSMs with (b) an Unmapped Schedule Priority List and (c) a Precedent-Feasible Schedule List after Executing the Proposed Repair Mechanism
Unfortunately, it is impossible to predict in advance which SGS will perform best for an arbitrary
RCMPSP. As the serial and parallel SGSs exhibit two different behaviors, thus potentially resulting in
two unequal schedules for an identical schedule priority list [32], determining which scheme is best
becomes an optimization problem in itself. To address this dilemma, Hartmann [32] proposed a GA-
based heuristic, the self-adapting GA, to help determine if the serial or the parallel SGS is better suited
for the underlying problem. Instead of selecting a SGS in advance, the self-adapting GA allows
chromosomes to be evaluated via the parallel or the serial SGS. While the self-adapting GA proceeds,
more fit chromosomes will prevail, not only with respect to their schedule list but also in terms of their
SGS. For this purpose, the self-adapting GA maintains an additional gene for each chromosome that
uniquely specifies its SGS. The modifications necessary to permit such flexibility in the choice of the
SGS are described in [32] and were easily incorporated into the OmeGA.
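The mechanism of the extra SGS gene can be shown in a few lines of Python. This is a deliberately artificial sketch: the two "decoders" are stand-ins in which the parallel SGS is strictly better by construction, purely to show how ordinary selection lets the better-suited gene take over; they are not real schedule builders.

```python
import random

random.seed(1)

def fitness(chrom):                    # lower is better (makespan-like)
    perm, sgs_gene = chrom
    base = sum(perm)                   # constant (28) for any permutation of range(8)
    penalty = 2 if sgs_gene == "serial" else 0   # artificial decoder difference
    return base + penalty

# each chromosome = (schedule list, SGS gene)
pop = [(random.sample(range(8), 8), random.choice(["serial", "parallel"]))
       for _ in range(20)]

for _ in range(10):                    # binary tournament selection only
    pop = [min(random.sample(pop, 2), key=fitness) for _ in range(20)]

genes = [g for _, g in pop]
print(genes.count("parallel"))         # the better-suited SGS gene dominates
```

After a few generations of tournaments, nearly the whole population carries the gene of the decoder that produces fitter schedules, which is the self-adapting effect described above.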
4.3. Hybridization with the 2-opt Heuristic
In general, a GA tends to explore the search space via its crossover operator rather than extensively
exploiting specific regions through mutation. This is mainly due to the high crossover probability and the
low mutation probability, both of which are necessary for a successful GA.6 However, an exploitation of
interesting regions within the search space can often be accomplished more efficiently by a dedicated local search. Moreover, sometimes local and global optima can be identified only by a fine-grained local searcher, since
single peaks would be too difficult to detect for a coarsely-grained crossover operator. Both the SGA and
the OmeGA can be extended with an efficient local search strategy to combine the positive traits of both
approaches. In SGAs, local search is typically applied after crossover and mutation. (Another interesting
opportunity is to incorporate the local search into the fitness function itself.)
Aside from its appropriate placement in the GA flow, the choice of a suitable local search strategy
for the underlying problem is crucial. A potential approach would comprise the application of a well-
known priority rule heuristic (see section 5) for the RCMPSP to one chromosome in the initial GA
population and to the competitive template after each era. In this case, the initial GA population would
quickly increase its average fitness and the final outcome of the GA would never be worse than the
outcome of the priority rule itself. Although this local search approach sounds promising, it would in fact
be a poor choice. The initial chromosome produced by a priority rule heuristic would too quickly dominate the entire GA population, resulting in a premature convergence of the GA.7 Typically, this phenomenon arises if the chosen selection scheme provides too much “selection pressure” or intensity. In other words, by placing a highly-fit chromosome into the initial population, the GA is precluded from properly seeking interesting search regions that might contain the global optimum.

6 For more information on a reasonable adjustment of crossover and mutation probability, the reader may refer to [22,58,61,74].
Instead of a local searcher that strives to push chromosomes to the optimum quickly, thereby risking getting stuck in a local optimum, we favor a strategy which increases the average fitness of the
GA population at a more moderate pace, thereby exploring more of the search space. Therefore, we
decided to combine the OmeGA with a tailored 2-opt heuristic for the RCMPSP. In the RCMPSP, the 2-
opt neighborhood is defined as the set of all feasible solutions that can be reached by a swap of two
elements in the priority list. Since the OmeGA initializes a new population in each era and does not
embody separate crossover and mutation phases, but a Juxtapositional Phase instead, we invoke our local
search approach at the end of each era. Furthermore, to strike a good balance between computational time
and effectiveness, we apply the local search only to the competitive template. Although we execute the 2-
opt heuristic just once per era, we do it for the most promising chromosome, which serves as a template
for upcoming chromosomes in the next era.
Tailoring the fundamental concept of the 2-opt heuristic to the RCMPSP can be accomplished as
follows. Assuming a precedent-feasible schedule list, s, the DSM, and several auxiliary variables, as
noted in Figure 11, s must be decomposed into its independent sets of parallel activities (a vector of
vectors) followed by a 2-opt search in each of these disjoint sets (see Figure 12). For example, consider
the DSMs and s in Figure 10. Due to the predecessor constraints, it is obvious that a traditional swap of
two alleles is not generally possible, as it would often lead to a predecessor violation—e.g., exchanging
activities 5 and 8 must not be allowed. However, if we determine the three disjoint sets of parallel
activities in s—namely {1,2,5,6}, {3,4,7} and {8}—we can modify the order of activities within these
sets as they appear in s without violating any predecessor constraints. In particular, we can apply a swap
between two activities in every parallel set, leading to a new schedule list which is subsequently
evaluated by a SGS. Figure 12 demonstrates the application of this algorithm, which produces nine
feasible schedule lists that differ in exactly two positions from s. Generally, the total number of exchanges for a schedule list with z parallel sets of activities, where the i-th set contains z_i activities, is:

$\sum_{i=1}^{z} \frac{z_i^2 - z_i}{2}$    (4)

For the example above, with set sizes 4, 3, and 1, this gives 6 + 3 + 0 = 9 exchanges.

7 At this time only mutation can produce slightly new chromosomes. However, the mutation probability is typically set very low. Hence, the chance to generate new genotypes exists but is unlikely.
Figure 11: Pseudo-code for 2-opt Heuristic in the RCMPSP8
8 The algorithm in Figure 11 describes the 2-opt heuristic only for one chromosome.
Input: Integer h, i, j, m, n, projectID, swapID_1, swapID_2; ScheduleList s[n], t[n]; Vector<Vector> parallel; Array DSM[m][n][n]
Output: Performs 2-opt heuristic for the RCMPSP

i ← 0
WHILE i < n
    Vector<Integer> band ← new Vector()
    FOR j ← 0 TO n-1
        IF s[j].numberOfPredecessors = 0 AND s[j].isScheduled = false THEN
            s[j].isScheduled ← true
            band.add(s[j])
            i ← i+1
        ENDIF
    ENDFOR
    FOR j ← 0 TO band.size()-1
        projectID ← band.elementAt(j).projectID
        FOR h ← 0 TO DSM[projectID].length-1
            DSM[projectID][h][band.elementAt(j).columnID] ← 0
        ENDFOR
    ENDFOR
    parallel.add(band)
ENDWHILE
FOR i ← 0 TO parallel.size()-1
    FOR j ← 0 TO parallel.elementAt(i).size()-2
        swapID_1 ← parallel.elementAt(i).elementAt(j)
        FOR h ← j+1 TO parallel.elementAt(i).size()-1
            swapID_2 ← parallel.elementAt(i).elementAt(h)
            t ← s
            t[swapID_1.spotInList] ← swapID_2
            t[swapID_2.spotInList] ← swapID_1
            CALL SGS with t
        ENDFOR
    ENDFOR
ENDFOR
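The two phases of Figure 11 (band decomposition, then pairwise swaps within each band) can be sketched compactly in Python; this is an illustrative re-implementation using predecessor sets in place of the DSM, and each generated neighbor would then be evaluated by an SGS:

```python
from itertools import combinations

def parallel_bands(acts, preds):
    """Decompose activities into successive sets of 'parallel' activities,
    as in the first phase of the algorithm in Figure 11."""
    remaining, bands = set(acts), []
    while remaining:
        band = {a for a in remaining if not (preds[a] & remaining)}
        bands.append(band)
        remaining -= band
    return bands

def two_opt_neighbors(s, preds):
    """All schedule lists reachable from s by swapping two activities
    that belong to the same parallel set (the 2-opt neighborhood)."""
    pos = {a: i for i, a in enumerate(s)}
    neighbors = []
    for band in parallel_bands(s, preds):
        for a, b in combinations(sorted(band), 2):
            t = list(s)
            t[pos[a]], t[pos[b]] = t[pos[b]], t[pos[a]]
            neighbors.append(t)
    return neighbors

# Example of Figure 10 (project-2 precedences 5->7, 7->8 inferred from the text)
preds = {1: set(), 2: set(), 3: {2}, 4: {1},
         5: set(), 6: set(), 7: {5}, 8: {7}}
s = [1, 5, 7, 8, 4, 2, 3, 6]
print(parallel_bands(s, preds))        # [{1, 2, 5, 6}, {3, 4, 7}, {8}]
print(len(two_opt_neighbors(s, preds)))  # 9, matching equation (4)
```

The nine neighbors produced for this example are exactly the lists shown in Figure 12.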
Precedent-feasible schedule list s:
1 5 7 8 4 2 3 6

Vector parallel of parallel activity sets:
{1, 2, 5, 6} {3, 4, 7} {8}

Precedent-feasible schedule lists t after local search:
2 5 7 8 4 1 3 6
5 1 7 8 4 2 3 6
6 5 7 8 4 2 3 1
1 2 7 8 4 5 3 6
1 5 7 8 4 6 3 2
1 6 7 8 4 2 3 5
1 5 7 8 3 2 4 6
1 5 3 8 4 2 7 6
1 5 4 8 7 2 3 6
5. COMPUTATIONAL RESULTS
5.1. OmeGA (CGA) vs. SGA
To illustrate the superior performance of CGAs compared to SGAs, we tested both designs, without local search extensions, on artificial functions that allow the level of difficulty to be scaled. While a search space can exhibit many problem characteristics, such as epistasis [14], noise [59], or symmetry [36], the GA community mainly examines GA performance using so-called deceptive functions [21]. Deceptive functions attempt to mislead the GA toward local optima by assigning low fitness values to solutions near the global optimum and high fitness values to solutions far from the optimum. The global optimum is thereby isolated and surrounded by poor solutions, as in the “needle in a haystack” problem.
Kargupta et al. [38] introduced two deceptive functions for combinatorial problems which spawn an extremely difficult search space for a GA. Therein, k deceptive sub-functions (which can be regarded as BBs) of order m are combined into one deceptive function of length l in such a way that only one global optimum exists among l!/(k!)^m local optima. The global optimum can be detected only if the GA
subsequently identifies all k sub-functions—i.e., all sub-problems of the overall problem. Problem
difficulty can be scaled not only by the number and length of deceptive sub-functions but also by their
corresponding encoding. In short, two encoding schemes exist, tight and loose, with the loose encoding causing more difficulties for a GA.9

Figure 12: Demonstration of Local Search in the RCMPSP
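For intuition, the classical binary trap function illustrates deception in its simplest form; note that this is the well-known bitstring analogue, not the ordering functions of Kargupta et al. themselves:

```python
def trap(block):
    """Order-k deceptive sub-function: optimal at all ones, but the
    fitness gradient everywhere else points toward all zeros."""
    u, k = sum(block), len(block)
    return k if u == k else k - 1 - u

def deceptive(bits, k=4):
    """Concatenation of len(bits)/k independent trap sub-functions (BBs)."""
    return sum(trap(bits[i:i + k]) for i in range(0, len(bits), k))

print(deceptive([1] * 8))                 # global optimum: 4 + 4 = 8
print(deceptive([0] * 8))                 # deceptive attractor: 3 + 3 = 6
print(deceptive([0, 1, 1, 1] + [1] * 4))  # near-optimal string scores only 0 + 4 = 4
```

A hill-climber following the gradient converges to all zeros, while only a GA that assembles whole order-k BBs reaches the isolated optimum at all ones.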
For tests on deceptive functions, we used the parameters listed in Table 1. The reader may refer to
[58] for detailed information on how to set these parameters. Due to prior knowledge of the BB structure
of the problem (kBB equals the order of the deceptive function and l equals the length of the deceptive
function) we roughly calculated population size using the equation proposed by Harik [30] and
convergence time according to Goldberg [23]. Figures 13 and 14 display average fitness results for 10
independent test runs. In our initial tests, we demonstrate the advantage of the OmeGA (a CGA) on a
deceptive function of order 4 and length 32 with a single global optimum (fitness value of 32): it
performs equally well for easy (tight encoding) and hard (loose encoding) problems. While the SGA requires fewer function calls than the OmeGA to find the global optimum in the case of tightly-encoded deceptive functions, it clearly struggles as the search space becomes more difficult, and it is unable to detect
the global optimum even after many fitness evaluations. Further tests on deceptive functions of varying
length, depicted in Figure 14, exhibit the inability of SGAs to deal with difficult (loose encoding)
problems. The computational resources necessary to reliably identify the global optimum scale up
exponentially with problem difficulty (as predicted by equation 1), which is particularly problematic if
problem size exceeds a value of about 16. Note that the population size depicted on the y-axis of Figure
14 had to be sufficient to detect the global optimum in 9 out of the 10 runs.
SGA:
  pc: 1.0
  pm: 1/4n
  Selection scheme: TWOR with/without continuously updated sharing and tournament size 4 [58,70]
  Crossover operator: Position-based Crossover, Version 2 [62]
  Mutation operator: Shift mutation [62]
  Number of generations per epoch: 60

OmeGA (CGA):
  pcut: 2/npop
  ps: 1.0
  pm: 1/4npop
  Selection scheme: TWOR with tournament size 4 [70]
  Number of eras per epoch: 4
  Ratio of population size in eras 1-4: 1:1:2:6

Table 1: Test Parameters
9 The encoding scheme influences the so-called defining length [22] of the BBs or sub-functions—the greater the defining length, the greater problem difficulty.
(a) SGA (b) OmeGA
5.2. Performance of the Hybrid OmeGA on Generated Test Problems
Due to its problem-independent performance (as demonstrated in the previous section), the CGA design is much better suited to real-world problems than the SGA design, since problem difficulty is unknown in advance of optimization. To achieve the best optimization results, we extended the OmeGA with the local search strategy explained in section 4.3 and tested it on 77 test problems. These
problems were constructed based on two popular multi-project measures: the Average Resource Load
Factor (ARLF) and the Average Utilization Factor (AUF) [48,80]. The ARLF identifies whether the bulk
of a project’s total resource requirements are in the front or back half of its critical path duration10 and the
relative size of the disparity. For project h, it is defined as:
10 Based on scheduling each activity at its early start time.
Figure 13: Test Results for a Tight- and Loose-Encoded Deceptive Function of Length 32
Figure 14: Comparative Performance of the OmeGA and the SGA on Various Deceptive Functions
$ARLF_h = \frac{1}{CP_h}\sum_{t=1}^{CP_h}\sum_{i=1}^{n_h}\sum_{k=1}^{K_{ih}} Z_{iht}\,X_{iht}\left(\frac{r_{ihk}}{n_h K_{ih}}\right)$    (5)

where

$Z_{iht} = \begin{cases} -1 & t \le CP_h/2 \\ \;\;\,1 & t > CP_h/2 \end{cases}$    (6)

$X_{iht} = \begin{cases} 1 & \text{if activity } i \text{ of project } h \text{ is active at time } t \\ 0 & \text{otherwise} \end{cases}$    (7)
Z_iht X_iht ∈ {-1, 0, 1}, n_h is the number of activities in project h, K_ih is the number of types of resources required by activity i in project h, and r_ihk is the amount of resource type k required by activity i in project h. Projects with negative ARLF are “front loaded” in their resource requirements, while projects with positive ARLF are “back loaded.” The ARLF for a problem is simply the average of the ARLFs of its constituent projects.
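A toy computation of equation (5) for a single project may clarify the measure; the numbers below are invented purely for illustration:

```python
# One project: CP = 4 periods, n = 2 activities, each using K = 1 resource type.
# Activity 1 (r = 6) is active in periods 1-2 (first half, Z = -1);
# activity 2 (r = 2) is active in periods 3-4 (second half, Z = +1).
CP, n, K = 4, 2, 1
active = {1: [1, 2], 2: [3, 4]}        # activity -> periods in which X = 1
r = {1: 6, 2: 2}                       # per-period resource usage

arlf = 0.0
for i, periods in active.items():
    for t in periods:
        z = -1 if t <= CP / 2 else 1
        arlf += z * r[i] / (n * K)
arlf /= CP
print(arlf)   # -1.0: negative, i.e., the project is front-loaded
```

Heavier usage in the first half of the critical path drives the sum negative, matching the “front loaded” interpretation above.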
The AUF indicates the average tightness of the constraints on (i.e., the average amount of contention
for) each resource type:
$AUF_k = \frac{1}{S}\sum_{s=1}^{S}\frac{W_{sk}}{R_k S_s}$    (8)

where $R_k$ is the (renewable) amount of resource type k available at each interval, and S is the number of
time intervals in the problem. Using S = 3 intervals, for example, once the projects have been sorted from
shortest to longest, such that CP1 ≤ CP2 ≤ CP3, then S1 = CP1, S2 = CP2 – CP1, and S3 = CP3 – CP2. The
total amount of resource k required over any interval s is given by:
$W_{sk} = \sum_{t=a}^{b}\sum_{h=1}^{m}\sum_{i=1}^{n_h} r_{ihk}\,X_{iht}$    (9)

where $a = CP_{s-1} + 1$, $b = CP_s$, and $r_{ihk}$ and $X_{iht}$ are defined as above. Since the AUF is essentially a ratio of
resources required to resources available, averaged across intervals of problem time, AUFk > 1 indicates
that resource type k is, on average, constrained over the course of a problem. To get the AUF for a
problem involving K types of resources:
AUF = Max(AUF1, AUF2, …, AUFK) (10)
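A small worked example of equations (8)-(10) for one resource type, with invented numbers, runs as follows:

```python
# Toy AUF computation for a single resource type k (numbers are illustrative).
# S = 2 intervals of lengths S_s = [2, 2]; R_k = 4 units available per period;
# W[s] = total units of resource k required over interval s (equation 9).
S_len = [2, 2]
R_k = 4
W = [12, 4]

S = len(S_len)
auf_k = sum(W[s] / (R_k * S_len[s]) for s in range(S)) / S
print(auf_k)   # 1.0: resource k is, on average, exactly fully contended
```

With several resource types, the problem-level AUF is then the maximum of the per-type values, as in equation (10).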
For the test problems, we generated (using the generator described in [7]) seven random problems,
each composed of three projects (i.e., 21 total networks), and with 20 activities per project. Each of the
seven problems has a different ARLF setting, varied in integer increments over -3 ≤ ARLF ≤ 3. For each
problem, we adjusted the number of resources available at 11 levels, thereby varying the AUF in 0.1
increments over 0.6 ≤ AUF ≤ 1.6.11 This approach, originally taken in [48], yielded 77 test problems.
First, we compared the serial and parallel SGS results, averaged over all 77 problems. To enable
drawing reliable conclusions, we invoked the OmeGA 50 independent times on each problem for each
SGS. The most important CGA parameters, population size and convergence time, were set to 2000
individuals and four epochs, respectively. The remaining parameters were defined as in Table 1. An
unknown property of each problem is its underlying BB structure—the size, scaling, and number of BBs.
Hence, for the tests we assumed a “conservative” problem difficulty with BB size 4, even though tests
with different values for the size of BBs might lead to better results.12
Table 2 depicts the best average fitness value out of the three different SGSs (serial, parallel, and
self-adapting) for each of the 77 problems. In some cases the OmeGA produced identical average fitness
values for more than one SGS. Interestingly, the self-adapting SGS “won” in only 21 out of 77 cases
(27.3%). The serial SGS won in 11 out of 77 problems (14.3%). Meanwhile, the parallel SGS won in 39
out of 77 cases (51%). When we sum the average and best fitness values and the standard deviations (out
of all 50 independent runs) for each SGS on all 77 problems, as shown in Table 3, we conclude that the
hybrid OmeGA performs best in combination with the parallel SGS. While this conclusion contradicts
the insights in [32], this is not necessarily surprising given the different GA design and problem
instances.
AUF \ ARLF     -3       -2       -1        0        1        2        3
0.6            1.76     4.32     12.78     2.00     3.00     3.00     0.00
0.7            4.96     16.40    24.00     5.00     11.16    9.60     2.30
0.8            13.72    29.28    38.92     11.00    25.82    21.32    7.14
0.9            19.62    46.92    59.82     23.94    36.40    34.88    16.90
1.0            30.00    57.24    69.54     34.84    60.96    52.94    22.06
1.1            39.54    74.38    90.50     51.66    71.90    67.32    34.34
1.2            50.98    88.12    112.06    68.08    105.84   75.00    43.16
1.3            59.96    109.50   134.82    79.40    121.64   93.52    50.26
1.4            70.28    125.02   148.50    105.68   142.04   111.46   65.22
1.5            82.36    140.14   171.00    119.72   172.44   131.84   78.68
1.6            94.10    152.66   194.28    142.78   233.10   148.60   91.48

(Cell shading in the original marks the winning strategy for each problem: Parallel SGS, Serial SGS, Self-Adapting, All Strategies, or Parallel/Self-Adapting; the shading is not reproduced here.)

Table 2: Comparison of Different SGS Strategies for the Hybrid OmeGA (average fitness of 50 independent runs)
11 We found that we could not adapt standard single-problem generators and test sets such as ProGen/PSPLIB [42] to create multi-project problems to our specifications.
12 Currently, there is no way to determine a “good” BB size except by testing all sizes. However, one point is clear: the larger the selected BB size, the more difficult it is to identify these BBs. Thus, we think a BB size of 4 is a conservative and good choice, although it might not be the best for some problems.
                 Sum of Best Results   Sum of Average Results   Sum of Standard Deviations
Serial SGS       5105                  5281.86                  92.29
Parallel SGS     5020                  5166.44                  77.82
Self-adapting    5021                  5175.00                  83.07

Table 3: Aggregate Performance of Each SGS in Terms of Total Project Lateness (TPL)
6. COMPARISON TO POPULAR PRIORITY-BASED HEURISTICS
We also compared the performance of the OmeGA to 20 popular priority rule heuristics found in the
project scheduling literature, as summarized in Table 4. Some of these rules were developed specifically for a multi-project environment, while others have been reported to be successful in a single-project environment. To increase comparability, we standardized the tie-breaker for all rules to FCFS.
Priority Rule (* = multi-project) Formula Comments
1. FCFS—First Come First Served Min(ESil), where ESil is the early start time of the ith activity from the lth project
Best in study by Bock and Patterson [3]
2. SOF—Shortest Operation First Min(dil), where dil is the duration of the ith activity from the lth project
Best in study by Patterson [64]
3. MOF—Maximum (longest) Operation First
Max(dil)
4. MINSLK*—Minimum Slack Min(SLKil), where SLKil = LSil – Max(ESil, t), LSil is the late start time of the ith activity from the lth project, and t is the current time step13
Best in studies by Davis and Patterson [12], Boctor [4], and
Bock and Patterson [3]
5. MAXSLK*—Maximum Slack Max(SLKil)
6. SASP*—Shortest Activity from Shortest Project
Min(fil), where fil = CPl + dil and CPl is the critical path duration of the lth project without resource constraints
Best in studies by Kurtulus and Davis [48] and Maroto et al. [57]
7. LALP*—Longest Activity from Longest Project
Max(fil)
8. MINTWK*—Minimum Total Work content: $Min\left(\sum_{k=1}^{K} d_{il}\,r_{ilk} + \sum_{i \in AS_l}\sum_{k=1}^{K} d_{il}\,r_{ilk}\right)$, where AS_l is the set of activities already scheduled (i.e., in work) in project l
9. MAXTWK*—Maximum Total Work content: $Max\left(\sum_{k=1}^{K} d_{il}\,r_{ilk} + \sum_{i \in AS_l}\sum_{k=1}^{K} d_{il}\,r_{ilk}\right)$, with AS_l as in MINTWK. Best in studies by Maroto et al. [58] and Lova and Tormos [54]
10. RAN—Random Activities selected randomly Best in study by Akpan [1]
11. EDDF—Earliest Due Date First Min(LSil)
12. LCFS—Last Come First Served Max(ESil)
13. MAXSP—Maximum Schedule Pressure: $Max\left(\frac{t - LF_{il}}{d_{il}\,W_{il}}\right)$, where W_il is the percentage of the activity remaining to be done at time t. Also known as the “critical ratio”
14. MINLFT—Minimum Late Finish time
Min(LFil) Equivalent to MINSLK in serial scheduling case (Kolisch [43])
15. MINWCS*—Minimum Worst Case Slack
Min(LSi – Max[E(i,j) | (i,j) ∈ APt]), where E(i,j) is the earliest time to schedule activity j if activity i is started at time t, and APt is the set of all feasible pairs of eligible, un-started activities at time t
Best in study by (Kolisch [43]); without resource constraints,
reduces to MINSLK
13 t is relevant only when using the parallel SGS, where an activity’s slack will diminish the longer it is delayed.
16. WACRU*—Weighted Activity Criticality & Resource Utilization: $Max\left(\alpha\sum_{q=1}^{N_i} w\,(1 + SLK_{iq})^{-1} + (1-\alpha)\sum_{k=1}^{K}\frac{r_{ik}}{R_k^{Max}}\right)$, where N_i is the number of immediate successors of the ith activity, w is the weight associated with N_i (0 ≤ w ≤ 1), SLK_iq is the slack in the qth immediate successor of the ith activity, and α is a weight parameter. Best in study by Thomas and Salhi [76]; we use w = 0.5 and α = 0.5
17. TWK-LST*—MAXTWK & earliest Late Start time (2-phase rule)
Prioritize first by MAXTWK (without FCFS tie-breaker) and then by Min(LSil)
(Lova and Tormos [54]); min. late start time (MINLST) was best in the study by Davis and Patterson [12]
18. TWK-EST*—MAXTWK & earliest Early Start time (2-phase rule)
Prioritize first by MAXTWK (without FCFS tie-breaker) and then by Min(ESil)
(Lova and Tormos [54])
19. MS—Maximum Total Successors Max(TSil), where TSil is the total number of successors of the ith activity in the lth project
Best in study by (Kolisch [43])
20. MCS—Maximum Critical Successors
Max(CSil), where CSil is the number of critical successors of the ith activity in the lth project; CSil ∈ TSil
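Priority rules of this kind reduce to key functions over the decision set. A sketch of three of the simpler rules from Table 4 (FCFS, SOF, MINSLK) with the standardized FCFS tie-breaker, using an illustrative activity record of our own design, might look like:

```python
# act: dict with early start "es", late start "ls", and duration "d"
def fcfs(act, t=0):
    return act["es"]                       # rule 1: first come, first served

def sof(act, t=0):
    return act["d"]                        # rule 2: shortest operation first

def minslk(act, t=0):
    return act["ls"] - max(act["es"], t)   # rule 4: slack shrinks as t advances

def rank(decision_set, rule, t=0):
    # sort by the rule's key, breaking ties by FCFS (early start time)
    return sorted(decision_set, key=lambda a: (rule(a, t), fcfs(a)))

D = [{"id": "a", "es": 0, "ls": 3, "d": 5},
     {"id": "b", "es": 1, "ls": 2, "d": 5},
     {"id": "c", "es": 0, "ls": 6, "d": 2}]
print([a["id"] for a in rank(D, sof)])     # ['c', 'a', 'b']: FCFS breaks the a/b tie
print([a["id"] for a in rank(D, minslk)])  # ['b', 'a', 'c']: smallest slack first
```

Each rule simply supplies a different sort key; the surrounding SGS then draws activities from the ranked decision set as resources allow.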
To compare results of the non-deterministic CGA with deterministic priority rules, we used the
average fitness values of the CGA instead of absolute best values out of 50 runs.14 We compared these
average values with the best result from any priority rule. Nevertheless, the OmeGA outperformed the
priority-rule based heuristics in 60 of the 77 problems (77.9%) in terms of solution quality (Table 5 and
Table 6). Interestingly, we note that Tables 5 and 6 show the priority-rules outperforming the OmeGA at
low AUF values. When AUF = 0.6, the resources are relatively unconstrained and the consequential
delays are small. In these cases, many of the priority rules gave good solutions, while the OmeGA did
not. We therefore conclude that using the OmeGA in the context of nominal resource constraints is
probably not worth the effort. It is also interesting that the priority rules performed well when AUFs were
very high—i.e., when resources were very highly constrained.
With respect to computational time, the priority rules have a clear advantage: the OmeGA required
an average of 211 seconds15, while each priority rule required less than one second. However,
optimization of an RCMPSP does not necessarily constitute a real-time application, and the time required by the OmeGA should be acceptable for many purposes. Nevertheless, priority rules remain the most practical option for very large problems.
14 Since GAs constitute a non-deterministic optimization technique, it is fairer to compare average values than absolute best fitness values.
15 Tests were performed on a PC with a 3.0 GHz Intel Pentium IV processor, 1024 MB RAM, and the Windows XP operating system.
Table 4: Overview of Popular Priority Rules Used for the RCMPSP (Adapted from [6])
AUF \ ARLF     -3       -2       -1        0        1        2        3
0.6            1.86     3.00     11.00     1.00     3.00     3.00     0.00
0.7            4.96     16.40    23.00     4.00     11.16    9.88     2.32
0.8            13.72    29.28    38.92     11.00    26.00    21.72    7.14
0.9            19.62    46.92    59.94     23.94    36.40    34.88    16.92
1.0            30.12    54.00    69.82     34.84    61.12    53.34    22.06
1.1            39.54    74.38    90.66     51.66    72.46    67.32    34.58
1.2            51.00    88.12    112.06    68.08    106.22   75.14    43.22
1.3            59.96    97.00    135.28    79.40    122.06   93.00    50.26
1.4            70.44    111.00   148.50    105.68   142.04   102.00   65.58
1.5            82.38    129.00   168.00    119.72   170.00   131.84   78.68
1.6            94.22    152.66   194.28    142.78   223.00   149.08   91.50

(Cell shading in the original marks the “winner” on each problem as OmeGA, Priority Rule, or Both; the shading is not reproduced here.)
AUF \ ARLF     -3        -2        -1        0         1         2         3
0.6            -7.00%    45.33%    16.18%    100.00%   33.33%    15.33%    0.00%
0.7            -17.33%   -13.68%   4.35%     25.00%    -20.29%   -17.67%   -42.00%
0.8            -8.53%    -11.27%   -15.39%   -21.43%   -18.75%   -19.56%   -40.50%
0.9            -18.25%   -9.77%    -10.54%   -11.33%   -28.63%   -8.21%    -10.95%
1.0            -18.59%   6.00%     -10.49%   -18.98%   -8.78%    -8.03%    -15.15%
1.1            -14.04%   -8.17%    -11.98%   -16.68%   -15.74%   -3.83%    -6.54%
1.2            -16.39%   -9.15%    -9.63%    -5.44%    -4.31%    -11.60%   -9.96%
1.3            -14.34%   12.89%    -4.06%    -6.59%    -4.64%    1.14%     -8.62%
1.4            -14.10%   12.63%    -2.30%    -4.79%    -3.37%    9.61%     -10.16%
1.5            -9.47%    8.88%     2.27%     -4.98%    1.64%     -0.12%    -1.65%
1.6            -12.76%   -0.22%    -0.88%    -8.47%    4.53%     -6.82%    -4.69%

(Cell shading in the original marks the “winner” on each problem as OmeGA, Priority Rule, or Both; the shading is not reproduced here.)
7. CONCLUSION
In this paper, we present the first results of applying a CGA, the OmeGA, to the RCMPSP. Because
slight differences in GA settings can have a large influence on its efficacy and performance, we carefully
explored these settings in relation to the RCMPSP. While traditional SGAs scale up exponentially in
computational resources (i.e., fitness function evaluations and necessary population size) with increasing
problem difficulty, CGAs exhibit sub-exponential scale-up behavior due to their ability to identify BBs
of the solution. Furthermore, we extended the OmeGA with a local search strategy tailored to the
RCMPSP. As a basis for tests, we constructed 77 test problems according to the traditional ARLF and
AUF measures. The test results were twofold. First, we found that the parallel SGS—not the self-
adapting SGS, as stated in [32]—performed best in combination with the OmeGA. Second, we compared
Table 5: Comparative Performance of OmeGA and the Best-Performing Priority Rule on Each Test Problem (Average Total Project Lateness)
Table 6: Percent Difference between the OmeGA and the Best-Performing Priority Rule on Each Test Problem
the OmeGA with many well known priority-rule heuristics, concluding that the OmeGA outperforms the
rules in 78% of problem instances. Moreover, the 22% of instances where the rules performed best were
focused on the very low and very high AUF values, where resources are very slightly or very highly
constrained. Thus, we have been able to provide some reasonable guidance on where the OmeGA would
be best applied. As problem size increases, we expect even better performance from the OmeGA
compared to the rules (which makes it useful for practical applications), even though this performance
comes at additional computational expense.16
For future work, we suggest research on even more effective local search strategies for the
RCMPSP. With CGAs already being able to identify BBs, and thus interesting regions of the search
space, new local search strategies that extensively exploit the BB neighborhood should be beneficial.
Furthermore, it would be very worthwhile to develop strategies for accurately estimating the BB structure
of the underlying problem.
REFERENCES
[1] E.O.P. Akpan, Priority Rules in Project Scheduling: A Case for Random Activity Selection, Production Planning & Control 11(2) (2000) 165-170.
[2] J.C. Bean, Genetic algorithms and random keys for sequencing and optimization, ORSA Journal On Computing 6(2) (1994) 154-160.
[3] D. Bock, J. Patterson, A Comparison of Due Date Setting, Resource Assignment, and Job Preemption Heuristics for the Multi-Project Scheduling Problem, Decision Science 21(3) (1990) 387-402.
[4] F.F. Boctor, Some Efficient Multi-heuristic Procedures for Resource-constrained Project Scheduling, European Journal of Operational Research 49 (1990) 3-13.
[5] T.R. Browning, Applying the Design Structure Matrix to System Decomposition and Integration Problems: A Review and New Directions, IEEE Transactions on Engineering Management, 48(3) (2001) 292-306.
[6] T.R. Browning, A.A. Yassine, Resource-Constrained Multi-Project Scheduling: Priority Rule Performance Revisited, TCU M.J. Neeley School of Business, Working Paper, 2006a.
[7] T.R. Browning, A.A. Yassine, A Random Generator for Resource-Constrained Multi-Project Scheduling Problems, TCU M.J. Neeley School of Business, Working Paper, 2006b.
[8] P. Brucker, A. Drexl, R. H. Möhring, K. Neumann, E. Pesch, Resource-constrained project scheduling: Notation, classification, models, and methods, European Journal of Operational Research 112(1) (1999) 3-41.
[9] E. Cantú-Paz, Efficient and accurate parallel genetic algorithms, Kluwer, Norwell, 2000. [10] R. Cheng, M. Gen, Y. Tsujimura, A tutorial survey of job-shop scheduling problems using genetic
algorithms, part II: hybrid genetic search strategies, Computers and Industrial Engineering 36(3) (1999) 343-364.
[11] E.W. Davis, Project Network Summary Measures and Constrained Resource Scheduling, AIIE
16 The larger the problem instance, the larger (usually) the search space. Certainly this depends on the constraints, too. Since heuristics explore only small subsets of the overall search space, the chance to find the best solution should decrease with increasing search space size.
27
Transactions 7(2) (1975) 132-142. [12] E.W. Davis, J.H. Patterson, A Comparison of Heuristic and Optimum Solutions in Resource-
Constrained Project Scheduling, Management Science 21(8) (1975) 944-955. [13] L. Davis, Job shop scheduling with genetic algorithms, in: Proceedings of the 1st international
conference on genetic algorithms, 1985. [14] Y. Davidor, Epistasis variance: A viewpoint on GA-hardness, in: Foundations of Genetic
Algorithms, 1991. [15] E. Demeulemeester, W. Herroelen, A branch-and-bound procedure for the resource constrained
project scheduling problem, Management Science 38(12) (1992) 1803-1818. [16] E. Demeulemeester, W. Herroelen, An efficient optimal solution procedure for resource
constrained project scheduling problem, European Journal of Operational Research 90(2) (1996) 334-348.
[17] H. Fang, P. Ross, D. Corne, A Promising Hybrid GA/Heuristic Approach for Open-Shop Scheduling Problem, in: 11th European Conference on Artificial Intelligence (1994) 590-594.
[18] M.R. Garey, D.S. Johnson, Computers and intractability: A guide to the theory of NP-completeness, 1979.
[19] M. Gen, R. Cheng, Genetic Algorithms and Engineering Design, Wiley, New York, 1997.
[20] S. Ghomi, B. Ashjari, A simulation model for multi-project resource allocation, International Journal of Project Management 20(2) (2002) 127-130.
[21] D.E. Goldberg, Simple genetic algorithms and the minimal, deceptive problem, in: Genetic Algorithms and Simulated Annealing (1987) 74-88.
[22] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[23] D.E. Goldberg, The Design of Innovation, Kluwer, Norwell, 2002.
[24] D.E. Goldberg, B. Korb, K. Deb, Messy genetic algorithms: Motivation, analysis, and first results, Complex Systems 3(5) (1989) 493-530.
[25] D.E. Goldberg, K. Deb, A comparative analysis of selection schemes used in genetic algorithms, in: Foundations of Genetic Algorithms, 1991.
[26] D.E. Goldberg, K. Deb, H. Kargupta, G. Harik, Rapid, accurate optimization of difficult problems using fast messy genetic algorithms, in: Proceedings of the Fifth International Conference on Genetic Algorithms, 1993.
[27] J.F. Gonçalves, J.J. de Magalhaes Mendes, M.G.C. Resende, A Genetic Algorithm for the Resource Constrained Multi-Project Scheduling Problem, AT&T Labs Technical Report TD-668LM4 (2004).
[28] G. Harik, D.E. Goldberg, Learning linkage, in: Foundations of Genetic Algorithms IV, 1997.
[29] G. Harik, F. Lobo, D.E. Goldberg, The compact genetic algorithm, in: Proceedings of the IEEE International Conference on Evolutionary Computation 3(4) (1998) 523-528.
[30] G. Harik, E. Cantú-Paz, D.E. Goldberg, B.L. Miller, The gambler's ruin problem, genetic algorithms, and the sizing of populations, Evolutionary Computation 7(3) (1999) 231-253.
[31] S. Hartmann, A competitive genetic algorithm for resource-constrained project scheduling, Naval Research Logistics 45 (1998) 733-750.
[32] S. Hartmann, A self-adapting genetic algorithm for project scheduling under resource constraints, Naval Research Logistics 49(5) (2002) 433-448.
[33] S. Hartmann, R. Kolisch, Experimental investigation of state-of-the-art heuristics for the resource-constrained project scheduling problem, European Journal of Operational Research 127 (2000) 394-407.
[34] W.S. Herroelen, Project Scheduling - Theory and Practice, Production and Operations Management, 4(4) (2005) 413-432.
[35] J.H. Holland, Adaptation in natural and artificial systems, The University of Michigan Press, Ann Arbor, 1975.
[36] C. van Hoyweghen, B. Naudts, D.E. Goldberg, Spin-flip symmetry and synchronization, Evolutionary Computation 10(4) (2002) 317-344.
[37] W.H. Ip, Y. Li, K.F. Man, K.S. Tang, Multi-product planning and scheduling using a Genetic Algorithm approach, Computers & Industrial Engineering 38 (2000) 283-296.
[38] H. Kargupta, K. Deb, D.E. Goldberg, Ordering genetic algorithms and deception, in: Parallel Problem Solving from Nature II (1992) 47-56.
[39] H. Kargupta, The gene expression messy genetic algorithm, in: Proceedings of the International Conference on Evolutionary Computation (1996) 814-819.
[40] J.E. Kelley Jr., Critical-Path Planning and Scheduling: Mathematical Basis, Operations Research 9(3) (1961) 296-320.
[41] D. Knjazew, OmeGA: A Competent Genetic Algorithm for Solving Permutation and Scheduling Problems, Kluwer, Norwell, 2002.
[42] R. Kolisch, A. Sprecher, A. Drexl, Characterization and Generation of a General Class of Resource-Constrained Project Scheduling Problems, Management Science, 41(10) (1995) 1693-1703.
[43] R. Kolisch, Efficient Priority Rules for the Resource-Constrained Project Scheduling Problem, Journal of Operations Management 14(3) (1996a) 179-192.
[44] R. Kolisch, Serial and Parallel Resource-Constrained Project Scheduling Methods Revisited: Theory and Computation, European Journal of Operational Research 90 (1996b) 320-333.
[45] R. Kolisch, S. Hartmann, Heuristic algorithms for solving the resource constrained project scheduling problem: classification and computational analysis, Handbook on Recent Advances in Project Scheduling, Kluwer, Boston, 1998.
[46] R. Kolisch, R. Padman, An integrated survey of deterministic project scheduling, International Journal of Management Science 29(3) (2001) 249-272.
[47] R. Kolisch, R. Padman, An integrated survey of deterministic project scheduling, OMEGA 29 (2001) 249-272.
[48] I. Kurtulus, E.W. Davis, Multi-Project Scheduling: Categorization of Heuristic Rules Performance, Management Science 28(2) (1982) 161-172.
[49] J.K. Lenstra, A.H.G. Rinnooy Kan, Complexity of Scheduling under Precedence Constraints, Operations Research 26(1) (1978) 22-35.
[50] M.J. Liberatore, B. Pollack-Johnson, Factors Influencing the Usage and Selection of Project Management Software, IEEE Transactions on Engineering Management 50(2) (2003) 164-174.
[51] F.S.C. Lam, B.C. Lin, C. Sriskandarajah, H. Yan, Scheduling to minimize product design time using a genetic algorithm, International Journal of Production Research 37(6) (1999) 1369-1386.
[52] G.E. Liepins, M.D. Vose, Representational issues in genetic optimization, Journal of Experimental and Theoretical Artificial Intelligence 2(2) (1990) 4-30.
[53] S. Leu, C. Yang, A GA-based multicriteria optimal model for construction scheduling, Journal of Construction Engineering and Management 125(6) (1999) 420-427.
[54] A. Lova, P. Tormos, Analysis of Scheduling Schemes and Heuristic Rules Performance in Resource-Constrained Multi-project Scheduling, Annals of Operations Research 102 (2001) 263-286.
[55] S.W. Mahfoud, Niching methods for genetic algorithms, Doctoral dissertation, University of Illinois at Urbana-Champaign, 1995.
[56] D.G. Malcolm, Application of a Technique for Research and Development Program Evaluation, Operations Research 7(5) (1959) 646-669.
[57] C. Maroto, P. Tormos, A. Lova, The Evolution of Software Quality in Project Scheduling, in: Project Scheduling: Recent Models, Algorithms and Applications, Kluwer, Boston, 1999.
[58] C. Meier, A.A. Yassine, T.R. Browning, Design Process Sequencing with Competent Genetic Algorithms, ASME Journal of Mechanical Design (Forthcoming) (2006).
[59] B.L. Miller, Noise, sampling, and efficient genetic algorithms, Doctoral dissertation, University of Illinois at Urbana-Champaign, 1997.
[60] M. Mitchell, An Introduction to Genetic Algorithms, MIT Press, Cambridge, 1996.
[61] H. Mühlenbein, How genetic algorithms really work: Mutation and hillclimbing, in: Parallel Problem Solving from Nature II (1992) 15-26.
[62] T. Murata, H. Ishibuchi, Performance evaluation of genetic algorithms for flow shop scheduling problems, in: Proceedings of the First IEEE Conference on Genetic Algorithms and their Applications (1994) 812-817.
[63] I.M. Oliver, D.J. Smith, J.R.C. Holland, A study of permutation crossover operators on the traveling salesman problem, in: Genetic Algorithms and Their Applications (1987) 227-230.
[64] J.H. Patterson, Alternative Methods of Project Scheduling with Limited Resources, Naval Research Logistics Quarterly 20(4) (1973) 767-784.
[65] J.H. Payne, Management of Multiple Simultaneous Projects: A State-of-the-Art Review, International Journal of Project Management, 13(3) (1995) 163-168.
[66] M. Pelikan, D.E. Goldberg, E. Cantú-Paz, BOA: The Bayesian Optimization Algorithm, in: Proceedings of the Genetic and Evolutionary Computation Conference (1999) 525-532.
[67] P. Pongcharoen, C. Hicks, P. M. Braiden, The development of genetic algorithms for the capacity scheduling of complex products, with multiple levels of product structure, European Journal of Operational Research 152 (2004) 215-225.
[68] P.W. Poon, J.N. Carter, Genetic algorithm crossover operators for ordering applications, Computers & Operations Research 22(1) (1995) 135-147.
[69] K. Sastry, D.E. Goldberg, Modeling tournament selection with replacement using apparent added noise, in: Proceedings of the Genetic and Evolutionary Computation Conference 11 (2001) 129-134.
[70] K. Sastry, D.E. Goldberg, Modeling tournament selection with replacement using apparent added noise, Intelligent Engineering Systems Through Artificial Neural Networks 11 (2001) 129-134.
[71] M. Spinner, Improving Project Management Skills and Techniques, Prentice-Hall, Englewood Cliffs, 1989.
[72] A. Sprecher, Solving the RCPSP efficiently at modest memory requirements, Manuskripte aus den Instituten für Betriebswirtschaftslehre. No. 425, University of Kiel, 1996.
[73] G. Syswerda, Uniform crossover in genetic algorithms, in: Proceedings of the Third International Conference on Genetic Algorithms (1989) 2-9.
[74] D. Thierens, Mixing in genetic algorithms, Doctoral dissertation, Katholieke Universiteit Leuven, 1995.
[75] J.R. Turner, The Handbook of Project-Based Management, McGraw-Hill, United Kingdom, 1993.
[76] P.R. Thomas, S. Salhi, An Investigation into the Relationship of Heuristic Performance with Network-Resource Characteristics, Journal of the Operational Research Society 48 (1997) 34-43.
[77] A.A. Yassine, D. Braha, Four Complex Problems in Concurrent Engineering and the Design Structure Matrix Method, Concurrent Engineering Research & Applications 11(3) (2003) 165-176.
[78] S. Kumanan, J. Jegan Jose, K. Raja, Multi-project scheduling using an heuristic and a genetic algorithm, International Journal of Advanced Manufacturing Technology 31 (2006) 360-366.
[79] V. Valls, F. Ballestín, S. Quintanilla, A Hybrid Genetic Algorithm for the Resource-Constrained Project Scheduling Problem, European Journal of Operational Research (2007), forthcoming.
[80] S. Tsubakitani, R. Deckro, A heuristic for multi-project scheduling with limited resources in the housing industry, European Journal of Operational Research 49 (1990) 80-91.