CHAPTER 5
NON LINEAR REGRESSION PREDICTION
OF DWELLING TIME
Prediction based resource allocation is an efficient tool for dealing
with resource heterogeneity. However, linear time series prediction methods
give less precise parameter estimates when compared with the real data.
A nonlinear regression model can produce good estimates of the unknown
parameters in the model with relatively small data sets. Usually,
numerical optimization algorithms are applied to determine the best-fitting
parameters. In this chapter, a nonlinear regression based prediction method is
used to predict the future dwelling time of the resources for effective scheduling.
5.1 NON LINEAR PREDICTION MODEL
Nonlinear regression based prediction techniques are emerging as a
powerful tool for predicting the future behavior of underlying hidden
dynamical systems, which may elude analysis by other traditional
techniques. In fact, nonlinear time-series analysis has been one of the major
areas of research in time-series analysis for more than two decades. Beyond
the linear domain, there are many nonlinear forms to be explored.
5.1.1 Need for Nonlinear Regression Models
The goal of determining a regression is to obtain an equation from
which one can predict one variable based upon another variable. Linear time
series models can be used for short range predictions. Although many
scientific and engineering processes can be described well using linear
models, or other relatively simple types of models, there are many other
processes that are inherently nonlinear. Linear models do not describe
processes that asymptote very well because for all linear functions the
function value can't increase or decrease at a declining rate as the explanatory
variables go to the extremes. There are many types of nonlinear models, on
the other hand, that describe the asymptotic behavior of a process well. Like
the asymptotic behavior of some processes, other features of physical
processes can often be expressed more easily using nonlinear models than
with simpler model types. The biggest advantage of nonlinear regression over
many other techniques is the broad range of functions that can be fit
(Clements et al, 2004). To obtain an accurate future dwelling time in our
scheduling algorithm, the nonlinear regression model is considered.
Such models are generally more appropriate than linear models for
accurately describing the dynamics of the series and for making multistep-ahead
forecasts. Regression analysis, in turn, is a statistical tool for the
investigation of relationships between variables.
5.1.2 Estimation of Regression Parameters
Estimation of the parameters of a nonlinear regression model is
usually carried out by the method of least squares or the method of maximum
likelihood as in linear regression models. Unlike linear regression, it is
usually not possible to find analytical expressions for the least squares and
maximum likelihood estimators for nonlinear regression models. Instead,
numerical search procedures are used for parameter estimation, which
requires intensive computations. In general, there is no closed-form
expression for the best-fitting parameters, as there is in linear regression. The
data are fitted by a method of successive approximations. Non-linear
regression is an iterative procedure in which the number of iterations depends
on how quickly the parameters converge. The choice of initial starting values
is very important with the non-linear method because a poor choice may
result in slow convergence, convergence to a local minimum, or even
divergence. The optimization of nonlinear regression model parameters is
usually carried out by nature-inspired heuristic algorithms like GA, PSO, SA
and TS or by machine learning algorithms like ANN and CBR. No algorithm
for optimizing general nonlinear functions will always find the
global optimum of a general nonlinear minimization problem in a reasonable
amount of time. Since no single optimization technique is invariably superior
to others, a variety of optimization techniques are used, each of which works
well in particular circumstances. Genetic algorithms are often applied as an
approach to solve global optimization problems.
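The sensitivity to starting values can be illustrated with a minimal sketch. It assumes a toy one-dimensional objective with two minima, standing in for a sum-of-squares error surface; the function, the step size and the starting points are illustrative choices, not taken from this work. Plain gradient descent is used as the method of successive approximations.

```python
def objective(x):
    """Toy 1-D error surface with two minima (illustrative, not the thesis model)."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    """Derivative of the toy objective."""
    return 4 * x**3 - 6 * x + 1


def descend(start, rate=0.01, steps=2000):
    """Successive approximations: repeatedly step against the gradient."""
    x = start
    for _ in range(steps):
        x -= rate * gradient(x)
    return x


left = descend(-2.0)   # this start settles in the deeper (global) minimum
right = descend(2.0)   # this start settles in the shallower (local) minimum

print(round(left, 3), round(objective(left), 3))
print(round(right, 3), round(objective(right), 3))
```

Both runs converge, but only the first reaches the lower error value. A population based search such as GA reduces this dependence on a single starting point by exploring many candidate solutions at once.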
5.2 GENETIC ALGORITHM (GA) BASED OPTIMIZATION
Genetic Algorithm (GA) is an evolutionary technique for large
space search (Fogel 1994, Xhafa et al, 2005). The general procedure of GA
search is given in Figure 5.1.
[Figure 5.1 flowchart: Start, Initialization of population, Evaluation of fitness value, Solution found? (Yes: Stop; No: Reproduction)]
Figure 5.1 Genetic algorithm procedure
5.2.1 Generation of Initial Population
A population is a set of chromosomes, each of which represents a
possible solution, that is, a mapping sequence between tasks and machines.
The evolution usually starts from a population of randomly generated
individuals and is an iterative process; the population in each iteration is
called a generation. The population size depends on the nature of the
problem, but a population typically contains several hundreds or thousands of
possible solutions.
5.2.2 Fitness Evaluation
A fitness function value quantifies the optimality of a solution. The
value is used to rank a particular solution against all the other solutions. A
fitness value is assigned to each solution depending on how close it is actually
to the optimal solution of the problem.
5.2.3 Generation of New Population
The generation of a new population consists of three processes, namely
selection, crossover and mutation.
Selection: In the selection process, two parent chromosomes
are selected from the population according to their fitness (the better
the fitness, the bigger the chance of being selected).
Crossover: Crossover is a genetic operator that combines
two parent chromosomes to produce a new chromosome
(offspring). The idea behind crossover is that the new
chromosome may be better than both of the parents if it takes
the best characteristics from each of them. Crossover
occurs during evolution according to a user-definable
crossover probability.
Mutation: Mutation is the process of replacing certain genes
or sub-sequences with new gene values in the current
population. The genes are replaced according to a
mutation probability. Mutation is an important part of the
genetic search, which helps to prevent the population from
stagnating at any local optimum.
5.2.4 Termination
Finally, the chromosomes from this modified population are
evaluated again. This completes one iteration of the GA. The GA stops when
a predefined number of iterations is reached or all chromosomes converge to
the same mapping.
Because of its simplicity, GA is among the most popular nature-inspired
heuristics used for optimization problems. The main property that makes
fixed-length genetic representations convenient is that their parts are easily
aligned, which facilitates simple crossover operations. Variable length
representations may also be used, but crossover implementation is more
complex in this case. Once the genetic representation and the fitness function
are defined, a GA proceeds to initialize a population of solutions and then to
improve it through repetitive application of the mutation, crossover and
selection operators.
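The three operators and the generation loop described above can be sketched as follows. This is a minimal illustration on a toy problem (counting 1-genes in a fixed-length bit string, often called OneMax), not the task-to-machine mapping discussed in this section. The function names, the pool size and the probabilities are assumptions chosen for the example, and elitism is added so that the best chromosome is never lost.

```python
import random


def fitness(chromosome):
    """Toy fitness: the number of 1-genes in the bit string."""
    return sum(chromosome)


def select(population, rng):
    """Fitness-proportionate pick: fitter chromosomes are chosen more often."""
    weights = [fitness(c) + 1 for c in population]  # +1 avoids all-zero weights
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, probability, rng):
    """Single-point crossover, applied with the given probability."""
    if rng.random() < probability:
        point = rng.randrange(1, len(a))
        return a[:point] + b[point:]
    return a[:]


def mutate(chromosome, probability, rng):
    """Flip each gene independently with the given probability."""
    return [1 - g if rng.random() < probability else g for g in chromosome]


def run_ga(length=20, pool=30, generations=40, pc=0.7, pm=0.02, seed=1):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pool)]
    history = [max(map(fitness, population))]
    for _ in range(generations):          # terminate after a fixed number of generations
        elite = max(population, key=fitness)
        children = [elite[:]]             # elitism: copy the best unchanged
        while len(children) < pool:
            child = crossover(select(population, rng), select(population, rng), pc, rng)
            children.append(mutate(child, pm, rng))
        population = children
        history.append(max(map(fitness, population)))
    return max(population, key=fitness), history


best, history = run_ga()
print(fitness(best), history[0], history[-1])
```

Because the chromosomes have a fixed length, the crossover point aligns the two parents directly, which is the convenience noted above.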
5.3 RELATED WORKS
Recently, Genetic Algorithms (GAs) have been widely and
successfully applied to various optimization problems in the Grid computing
environment. GAs are well suited to models with varying resolutions and
structures since they can search nonlinear solution spaces without requiring
any gradient information or a priori knowledge about model characteristics.
Effective task scheduling in a grid requires modeling the available resources
on grid nodes and the computation requests of tasks, determining the current
load of the system, and predicting the task execution time. Many researchers
have successfully applied global optimization methods, such as the GA, rather
than local gradient-based optimization techniques (Collin & Takeshiyamada, 1998)
to Grid computing. GAs are probabilistic meta-heuristic methods inspired by the
‘survival of the fittest’ principle of the neo-Darwinian theory of evolution
(Fogel 1994, Grefenstette 1986). Wu et al (2007) proposed a real-valued GA
to optimize the parameters of a Support Vector Machine (SVM) for predicting
bankruptcy. Xue & Xue (2010) presented an optimization model for the task
scheduling problem and developed a Hybrid Clonal Selection Genetic
Algorithm (HCSGA) to solve it effectively. Joshua & Vasudevan (2011)
developed the smart Genetic Algorithm, with a fitness function based on
makespan and flow time, to schedule tasks in Grid computing. Maleki & Ali
(2011) proposed a genetic algorithm based task scheduling algorithm to
assign tasks to grid resources with the goal of minimizing the total
makespan of the environment. Wang et al (1997) presented a
genetic-algorithm-based approach to dispatch and schedule subtasks within grid
environments. The simulation tests presented in Wang et al (1997) show that,
for small-sized problems (e.g., a small number of subtasks and a small number
of machines), the genetic-algorithm-based approach can find the optimal
solution.
Levitin & Dai (2008) proposed a genetic algorithm to distribute
execution blocks within grid resources. The aim of the algorithm is to find the
best match between execution blocks and grid resources, that is, the match that
maximizes the reliability and/or expected performance of the task execution in
the grid environment. Salehi et al (2013) proposed a hybrid algorithm
combining a genetic algorithm and a network gravity algorithm to solve
task scheduling in grid computing. Wiessman et al (2004) proposed a
novel GA based algorithm which schedules a divisible data intensive
application. Martino & Marco (2002) presented a GA based scheduling
algorithm in which the goal of super scheduling was to minimize the release
time of jobs.
The feasibility of evolutionary algorithm-based scheduling such as
GA for realistic scientific workloads is demonstrated by systems like Mars
(Bose et al 2004). Mars is a meta-scheduling framework for scheduling tasks
across multiple resources in a grid. Golconda & Ozguner (2004) studied and
compared five heuristic based algorithms and found that GA produced
the best utility for users in comparison to heuristics such as Min-Min. Xhafa
et al (2005) presented Genetic Algorithm (GA) based schedulers for
efficiently allocating jobs to resources in a Grid system in order to minimize
the makespan and flowtime. The related works above indicate that the GA method
is suitable for a wide range of optimization problems. In this research work, a
nonlinear prediction model is developed and its parameters are estimated
using GA optimization.
5.4 NON LINEAR PREDICTION MODEL FOR DARA (NLPM-DARA)
The dwelling time of a sporadic resource varies randomly from one
occurrence to the next. The previous prediction model does not consider this
random variation. To address the random dwelling time
problem, a nonlinear prediction model is developed to enhance the prediction
accuracy.
5.4.1 Developing a non linear regression model
In general, a nonlinear regression model is developed according to the
shape of the curve that the data follow. The resource dwelling time occurrence
resembles an S-type curve, and hence a commonly used logistic regression model
(Batts & Watts, 1988, Gallant, 1975) is modified as follows,
1
01
2
0
exp 1 1( )
1 exp 1
N
i iki
j j N
i iki
tt
t(5.1)
where the left-hand side, a function of $t_j$, is the future dwelling time prediction model function,
$t_j$ = dwelling time of the $j$-th resource,
$t_i$ = dwelling time at the $i$-th instant,
the coefficients subscripted $j$ and $ik$ are the nonlinear model parameters which are to be optimized, and
$N$ = past dwelling time history size (the maximum history size taken is 10).
5.4.2 Optimization of nonlinear parameters
The nonlinear regression model is optimized by selecting the best values of
its parameters (the parameter subscripted $j$ and the parameters subscripted
$0, 1, \ldots, N-1$) so that the model fits the historical data. The
procedures involved in the optimization process are described below.
5.4.2.1 Initialize Population Pool
The genetic algorithm begins with a set of random solutions
(represented by chromosomes) called the population pool. The population pool of
random solutions, or chromosomes (candidate values of the model parameters), is
generated and denoted as
$X_l;\ l = 0, 1, \ldots, N_p - 1$
where $N_p$ is the pool size and each chromosome is of length $N + 1$.
The chromosome length $N + 1$ indicates the number of genes, i.e. the
number of random weights to be optimized: the parameter subscripted $j$ and the
$N$ parameters subscripted $0, 1, \ldots, N-1$.
Gene subscripts within a chromosome: j | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The previous 10 occurrences (history) of dwelling time determine the
chromosome length. Each gene value of every chromosome is an arbitrary
number generated within the interval [0, 10].
5.4.2.2 Fitness Evaluation
The chromosomes are then evaluated using Equation (5.2). The
fitness value of the $l$-th chromosome is denoted as $f_l$.
12
,0
2
( ( ) )Rl N
j III jj
ft d
(5.2)
where $d_{III,j}$ is the actual dwelling time of the $j$-th resource and
$N_R$ = total number of resources.
The difference between the actual dwelling time and the predicted
dwelling time of the $j$-th resource is squared for each of the $N_R$ resources.
The maximum fitness value corresponds to the minimum prediction error. The
fitness value is evaluated for every chromosome in the population
pool, and the objective of fitness evaluation is to
maximize the fitness value.
5.4.2.3 Selection
The selection operator then probabilistically populates the next
generation of chromosomes such that chromosomes with high fitness are
more likely to be selected. The chromosomes are selected using the
rank selection method, in which the rank is given by the fitness value.
The operator selects two individuals at random and compares their fitness.
Only the individual with the higher fitness is propagated to the next
generation. If they have equal fitness, the individual to be inserted is chosen
at random.
5.4.2.4 Crossover
In the crossover process, two solutions (chromosomes) are selected
and exchange gene values with a probability of $C_p$. The crossover
operation exchanges $N \cdot C_p$ genes between two parent chromosomes and
produces $N_p/2$ children chromosomes $X_q^{off};\ q = 0, 1, \ldots, N_p/2 - 1$.
5.4.2.5 Mutation
A uniform random mutation operation is performed with a mutation
probability $M_p$. In the mutation technique, a uniform random integer is
generated and placed in $N \cdot M_p$ random positions of $X_q^{off}$, and a new
chromosome $X_q^{new}$ is produced.
These crossover and mutation operators are applied to all members
of the new population. Often in addition to these operators, the best member
of the original population is copied into the new population unchanged, a
strategy called elitism.
5.4.2.6 Termination
The resultant $X_q^{new}$ and the selection pool chromosomes are placed
in the population pool, and the process is repeated until the termination
criterion is met. In this case, the termination criterion is reaching a maximum
number of repetitions of the process. Once the maximum number of
repetitions is reached, the process is terminated and the chromosome with the
maximum fitness in the population pool (the best values of the parameter
subscripted $j$ and of the parameters subscripted $0, 1, \ldots, N-1$) is
extracted.
The obtained best weights are substituted in Equation (5.1) to
derive the final regression model as follows
1
01
2
0
exp 1 1( )
1 exp 1j
ik
Nbest
i ikibest
j Nbest
ii
tt
t (5.3)
The final regression model is able to determine the future dwelling
time of the resources. The nonlinear prediction model (NLPM) output is given to
the DARA for the resource allocation process.
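The optimization procedure of Section 5.4.2 can be sketched end to end as follows. The sketch rests on several loudly stated assumptions: the model is a plain logistic stand-in (a scale gene times the logistic function of a weighted sum of past dwelling times) rather than the exact form of Equation (5.1); the fitness is taken as the inverse of one plus the sum of squared errors rather than the exact form of Equation (5.2); and the sample data are hypothetical. The gene interval [0, 10], the pairwise selection, the single-point crossover, the mutation and the elitism follow the text, and the default settings mirror Table 5.1.

```python
import math
import random

GENE_LOW, GENE_HIGH = 0.0, 10.0   # gene interval stated in Section 5.4.2.1


def predict(chromosome, history):
    """Assumed stand-in model: scale * logistic(sum(weight_i * t_i))."""
    scale, weights = chromosome[0], chromosome[1:]
    z = sum(w * t for w, t in zip(weights, history))
    return scale / (1.0 + math.exp(-z))


def fitness(chromosome, samples):
    """Assumed fitness: 1 / (1 + sum of squared prediction errors)."""
    sse = sum((predict(chromosome, h) - actual) ** 2 for h, actual in samples)
    return 1.0 / (1.0 + sse)


def pick(population, samples, rng):
    """Select two chromosomes at random and keep the fitter (ties at random)."""
    a, b = rng.sample(population, 2)
    fa, fb = fitness(a, samples), fitness(b, samples)
    if fa == fb:
        return rng.choice([a, b])
    return a if fa > fb else b


def evolve(samples, n_history, pool=20, generations=50, pc=0.5, pm=0.05, seed=0):
    rng = random.Random(seed)

    def new_gene():
        return rng.uniform(GENE_LOW, GENE_HIGH)

    # Each chromosome holds n_history + 1 genes: one scale and n_history weights.
    population = [[new_gene() for _ in range(n_history + 1)] for _ in range(pool)]
    first_best = max(fitness(c, samples) for c in population)
    for _ in range(generations):
        elite = max(population, key=lambda c: fitness(c, samples))
        children = [elite[:]]                       # elitism
        while len(children) < pool:
            p1 = pick(population, samples, rng)
            p2 = pick(population, samples, rng)
            child = p1[:]
            if rng.random() < pc:                   # single-point crossover
                cut = rng.randrange(1, len(p1))
                child = p1[:cut] + p2[cut:]
            # Mutation: replace a gene by a fresh random value with probability pm.
            child = [new_gene() if rng.random() < pm else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=lambda c: fitness(c, samples))
    return best, first_best, fitness(best, samples)


# Hypothetical samples: (past dwelling times scaled to [0, 1], next dwelling time).
samples = [
    ([0.2, 0.4, 0.5], 3.1),
    ([0.6, 0.5, 0.7], 3.6),
    ([0.1, 0.2, 0.1], 2.7),
    ([0.9, 0.8, 0.9], 3.9),
]
best, first_best, final_best = evolve(samples, n_history=3)
print([round(g, 2) for g in best], round(final_best, 4))
```

The extracted best chromosome plays the role of the best weights that are substituted back into the model; with elitism, the best fitness never decreases from one generation to the next.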
5.4.3 Fuzzy based Resource Allocation
The future dwelling time of the resources from NLPM is given as
the input to the fuzzy scheduler as depicted in Figure 3.3. The three input
variables to the DARA technique are i) the priority of the job, ii) the
requirement time demanded by the job and iii) the future dwelling time of the
resource. The fuzzy inference system derives its output from the input vector
based on a set of rules. The output of the inference system is then compared
with the different threshold score values; if it is above a threshold score, the
corresponding job is allotted to the resource immediately and the
details are stored for the next scheduling.
5.5 RESULTS AND DISCUSSION
The NLPM-DARA technique is evaluated using MATLAB
simulation. The performance of nonlinear prediction using GA optimization
is evaluated through MATLAB, and the resources are then allocated.
5.5.1 Non Linear Prediction Model Analysis
The Genetic Algorithm optimization is performed using the
parameters given in Table 5.1.
Table 5.1 GA Parameters for Simulation
Parameters                 Specifications
Population Size            20
Selection                  Rank
Cross Over Probability     0.5
Cross Over Type            Single Point
Mutation Probability       0.05
Max. No. of Generations    50
The nonlinear prediction model is evaluated in terms of %
prediction accuracy. The prediction accuracy of the linear prediction model and
the nonlinear prediction model for sporadic and semi permanent type resources
is shown in Figure 5.2 (a) and (b) respectively. History sizes ‘N’ of
1, 2, 3, …, 10 are used to evaluate the errors and accuracy.
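[Figure 5.2(a): prediction accuracy versus History Size "N" (1 to 10) for Sporadic Resources; curves for LPM-DARA and NLPM-DARA; vertical axis from 76 to 90]
[Figure 5.2(b): prediction accuracy versus History Size "N" (1 to 10) for Semi Permanent Resources; curves for LPM-DARA and NLPM-DARA; vertical axis from 85 to 92]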
Figure 5.2 Comparison of prediction accuracy of LPM-DARA and NLPM-DARA techniques (a) Sporadic resources (b) Semi permanent resources
Figure 5.2 illustrates the prediction accuracy of NLPM-DARA in
comparison with LPM-DARA. It is observed that the accuracy of the nonlinear
prediction model is improved by at least 5% when compared to the
linear prediction model. The maximum accuracy for sporadic resources is
90% and for semi permanent resources it is 92%. This is due to the iterative
GA computation for finding the best fitness value.
5.5.2 Resource Allocation Analysis
Resource dwelling time prediction can improve the performance of
scheduling, and this section describes the influence of predictions on grid
scheduling. A quantitative analysis is made using dwelling time based
fuzzy scheduling with the same five datasets generated in Chapter 3. The
scheduling quality is evaluated according to average job makespan, resource
utilization and failure rate. Makespan is calculated for each job as the time
from submission to completion, and this value is averaged across all jobs.
Resource utilization indicates the ratio of the number of resources allocated
to the total number of resources. In the DARA technique, fuzzy threshold
filtering is performed at two locations: one at the point of evaluating
the fuzzy score for the sporadic resource type, and the other at
the point of evaluating the fuzzy score for the semi-permanent
resource type. The threshold values (Sth-III and Sth-II) are varied from 0.3 to
0.7, keeping the job and resource scenarios of the five datasets as described
in Chapter 3. The performance metrics of the NLPM-DARA prediction model for
all the datasets under different threshold values are tabulated in Tables
5.2 to 5.6.
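The metric definitions above can be written out as a short sketch. The job times and resource counts are hypothetical values chosen for illustration. The failure rate is taken as the complement of the utilization, which is an assumption suggested by Tables 5.2 to 5.6, where the two columns sum to 100 in every row.

```python
def average_makespan(jobs):
    """Mean of (completion - submission) over all jobs, in seconds."""
    return sum(done - submitted for submitted, done in jobs) / len(jobs)


def utilization_percent(allocated, total):
    """Share of resources that were allocated, as a percentage."""
    return 100.0 * allocated / total


# Hypothetical jobs as (submission time, completion time) in seconds.
jobs = [(0.0, 52.0), (5.0, 61.5), (12.0, 70.5)]
makespan = average_makespan(jobs)

# Hypothetical counts: 31 of 40 resources were allocated.
utilization = utilization_percent(31, 40)
failure_rate = 100.0 - utilization   # assumed complement of utilization

print(makespan, utilization, failure_rate)
```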
Table 5.2 Performance metrics for Dataset-I
S.No Sth-III Sth-II Utilization (in %) Failure rate (in %) Makespan (in sec)
1 0.3 0.3 77.41 22.59 70.068
2 0.3 0.4 71.62 28.38 55.418
3 0.3 0.5 82.53 17.47 61.9
4 0.3 0.6 76.86 23.14 57.102
5 0.3 0.7 81.12 18.88 58.078
6 0.4 0.3 80.73 19.27 55.696
7 0.4 0.4 73.89 26.11 59.188
8 0.4 0.5 80.35 19.65 62.802
9 0.4 0.6 79.34 20.66 65.386
10 0.4 0.7 80.65 19.35 57.078
11 0.5 0.3 79.84 20.16 65.556
12 0.5 0.4 79.63 20.37 56.758
13 0.5 0.5 81.5 18.5 58.2
14 0.5 0.6 78.89 21.11 52.5
15 0.5 0.7 78.63 21.37 59.904
16 0.6 0.3 77.56 22.44 59.74
17 0.6 0.4 82.21 17.79 65.67
18 0.6 0.5 77.19 22.81 53.272
19 0.6 0.6 80.1 19.9 61.3
20 0.6 0.7 76.6 23.4 64.67
21 0.7 0.3 76.54 23.46 59.904
22 0.7 0.4 73.56 26.44 56.496
23 0.7 0.5 78.88 21.12 71.902
24 0.7 0.6 76.22 23.78 48.314
25 0.7 0.7 79.13 20.87 65.556
Table 5.3 Performance metrics for Dataset-II
S.No Sth-III Sth-II Utilization (in %) Failure rate (in %) Makespan (in sec)
1 0.3 0.3 70.76 29.24 143.948
2 0.3 0.4 69.56 30.44 111.496
3 0.3 0.5 73.12 26.88 155.74
4 0.3 0.6 70.13 29.87 151.74
5 0.3 0.7 69.11 30.89 102.348
6 0.4 0.3 70.63 29.37 106.336
7 0.4 0.4 68.78 31.22 93.06
8 0.4 0.5 70.41 29.59 142.02
9 0.4 0.6 72.21 27.79 130.664
10 0.4 0.7 70.16 29.84 106.02
11 0.5 0.3 67.94 32.06 119.392
12 0.5 0.4 71.59 28.41 149.376
13 0.5 0.5 73.76 26.24 105.208
14 0.5 0.6 68.07 31.93 119.964
15 0.5 0.7 71.01 28.99 138.596
16 0.6 0.3 70.94 29.06 109.212
17 0.6 0.4 72.88 27.12 101.004
18 0.6 0.5 74.42 25.58 92.132
19 0.6 0.6 67.63 32.37 144.424
20 0.6 0.7 70.51 29.49 150.056
21 0.7 0.3 69.52 30.48 122.98
22 0.7 0.4 71.35 28.65 92.264
23 0.7 0.5 70.24 29.76 121.236
24 0.7 0.6 70.14 29.86 138.94
25 0.7 0.7 70.16 29.84 96.692
Table 5.4 Performance metrics for Dataset-III
S.No Sth-III Sth-II Utilization (in %) Failure rate (in %) Makespan (in sec)
1 0.3 0.3 74.89 25.11 87.948
2 0.3 0.4 79.7 20.3 95.348
3 0.3 0.5 75.23 24.77 94.51
4 0.3 0.6 78.76 21.24 93.184
5 0.3 0.7 70.44 29.56 131.008
6 0.4 0.3 73.25 26.75 159.816
7 0.4 0.4 74.96 25.04 95.448
8 0.4 0.5 72.63 27.37 109.064
9 0.4 0.6 73.38 26.62 148.548
10 0.4 0.7 77.03 22.97 104.628
11 0.5 0.3 71.96 28.04 159.86
12 0.5 0.4 76.43 23.57 95.924
13 0.5 0.5 78.23 21.77 101.22
14 0.5 0.6 75.14 24.86 106.544
15 0.5 0.7 74.89 25.11 128.608
16 0.6 0.3 71.43 28.57 123.34
17 0.6 0.4 74.01 25.99 128.768
18 0.6 0.5 73.23 26.77 147.788
19 0.6 0.6 76.17 23.83 93.624
20 0.6 0.7 72.87 27.13 105.036
21 0.7 0.3 75.53 24.47 103.172
22 0.7 0.4 72.17 27.83 86.416
23 0.7 0.5 77.25 22.75 149.836
24 0.7 0.6 71.19 28.81 99.308
25 0.7 0.7 74.17 25.83 129.752
Table 5.5 Performance metrics for Dataset-IV
S.No Sth-III Sth-II Utilization (in %) Failure rate (in %) Makespan (in sec)
1 0.3 0.3 81.8 18.2 131.411
2 0.3 0.4 75.45 24.55 112.505
3 0.3 0.5 79.03 20.97 92.888
4 0.3 0.6 78.19 21.81 97.481
5 0.3 0.7 68.46 31.54 117.947
6 0.4 0.3 76.86 23.14 123.515
7 0.4 0.4 75.29 24.71 97.622
8 0.4 0.5 80.17 19.83 98.519
9 0.4 0.6 74.59 25.41 85.961
10 0.4 0.7 69.93 30.07 113.117
11 0.5 0.3 73.46 26.54 108.782
12 0.5 0.4 72.6 27.4 130.316
13 0.5 0.5 80.7 19.3 118.2
14 0.5 0.6 70.26 29.74 92.786
15 0.5 0.7 72.38 27.62 80.618
16 0.6 0.3 78.34 21.66 134.39
17 0.6 0.4 68.19 31.81 101.381
18 0.6 0.5 67.18 32.82 131.195
19 0.6 0.6 71.69 28.31 86.243
20 0.6 0.7 65.56 34.44 83.099
21 0.7 0.3 74.24 25.76 83.003
22 0.7 0.4 63.86 36.14 116.453
23 0.7 0.5 79.67 20.33 84.653
24 0.7 0.6 80.19 19.81 107.093
25 0.7 0.7 70.11 29.89 83.999
Table 5.6 Performance metrics for Dataset-V
S.No Sth-III Sth-II Utilization (in %) Failure rate (in %) Makespan (in sec)
1 0.3 0.3 76.07 23.93 165.265
2 0.3 0.4 80.65 19.35 152.76
3 0.3 0.5 85.84 14.16 120.755
4 0.3 0.6 72.67 27.33 172.755
5 0.3 0.7 75.69 24.31 162.665
6 0.4 0.3 71.99 28.01 107.67
7 0.4 0.4 73.13 26.87 102.525
8 0.4 0.5 86.77 13.23 110.165
9 0.4 0.6 74.38 25.62 156.525
10 0.4 0.7 72.21 27.79 120.565
11 0.5 0.3 79.09 20.91 102.655
12 0.5 0.4 74.36 25.64 150.76
13 0.5 0.5 79.31 20.69 120.41
14 0.5 0.6 78.37 21.63 101.275
15 0.5 0.7 85.01 14.99 162.165
16 0.6 0.3 75.09 24.91 151.675
17 0.6 0.4 70.13 29.87 120.565
18 0.6 0.5 82.34 17.66 116.01
19 0.6 0.6 76.96 23.04 150.175
20 0.6 0.7 73.91 26.09 107.755
21 0.7 0.3 83.24 16.76 160.025
22 0.7 0.4 71.63 28.37 133.26
23 0.7 0.5 81.19 18.81 162.395
24 0.7 0.6 79.29 20.71 136.095
25 0.7 0.7 76.13 23.87 173.37
Tables 5.2 to 5.6 show the utilization, failure rate and makespan for
every combination of the Sth-III and Sth-II threshold values in all five
scenario datasets. Table 5.7 shows the maximum resource
utilization observed, and the sporadic and semi permanent thresholds at which
it is achieved, for the five datasets. Similarly, Table 5.8 shows the minimum
makespan observed and the corresponding sporadic and semi permanent thresholds
for the five datasets.
Table 5.7 Thresholds for maximum resource utilization
Dataset Sth-III Sth-II Max. Utilization (%)
Dataset-I 0.3 0.5 82.53
Dataset-II 0.6 0.5 74.42
Dataset-III 0.3 0.4 79.7
Dataset-IV 0.3 0.3 81.8
Dataset-V 0.4 0.5 86.77
Table 5.8 Thresholds for minimum Makespan
Dataset Sth-III Sth-II Min. Makespan (sec)
Dataset-I 0.7 0.6 48.314
Dataset-II 0.6 0.5 92.132
Dataset-III 0.7 0.4 86.416
Dataset-IV 0.5 0.7 80.618
Dataset-V 0.5 0.6 101.275
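From Tables 5.7 and 5.8, it can be seen that no threshold
combination is repeated across datasets, either for maximum utilization or for
minimum makespan. Only Dataset-II attains both its maximum utilization and its
minimum makespan at the same combination (Sth-III = 0.6, Sth-II = 0.5).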
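The entries of Tables 5.7 and 5.8 can be cross-checked by scanning the threshold grid of a dataset. The sketch below does this for Dataset-II, using the utilization and makespan columns of Table 5.3; the variable names are illustrative.

```python
# Rows of Table 5.3 (Dataset-II): (Sth-III, Sth-II, utilization in %, makespan in sec).
ROWS = [
    (0.3, 0.3, 70.76, 143.948), (0.3, 0.4, 69.56, 111.496),
    (0.3, 0.5, 73.12, 155.74), (0.3, 0.6, 70.13, 151.74),
    (0.3, 0.7, 69.11, 102.348), (0.4, 0.3, 70.63, 106.336),
    (0.4, 0.4, 68.78, 93.06), (0.4, 0.5, 70.41, 142.02),
    (0.4, 0.6, 72.21, 130.664), (0.4, 0.7, 70.16, 106.02),
    (0.5, 0.3, 67.94, 119.392), (0.5, 0.4, 71.59, 149.376),
    (0.5, 0.5, 73.76, 105.208), (0.5, 0.6, 68.07, 119.964),
    (0.5, 0.7, 71.01, 138.596), (0.6, 0.3, 70.94, 109.212),
    (0.6, 0.4, 72.88, 101.004), (0.6, 0.5, 74.42, 92.132),
    (0.6, 0.6, 67.63, 144.424), (0.6, 0.7, 70.51, 150.056),
    (0.7, 0.3, 69.52, 122.98), (0.7, 0.4, 71.35, 92.264),
    (0.7, 0.5, 70.24, 121.236), (0.7, 0.6, 70.14, 138.94),
    (0.7, 0.7, 70.16, 96.692),
]

# Threshold pair with the highest utilization and the pair with the lowest makespan.
best_utilization = max(ROWS, key=lambda row: row[2])
best_makespan = min(ROWS, key=lambda row: row[3])

print("max utilization:", best_utilization[:2], best_utilization[2])
print("min makespan:   ", best_makespan[:2], best_makespan[3])
```

For Dataset-II both searches return the pair (0.6, 0.5), in agreement with the Dataset-II rows of Tables 5.7 and 5.8.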
5.5.3 Comparative Analysis for Fixed Threshold
In order to evaluate the NLPM-DARA technique against DARA as well
as LPM-DARA, the Sth-III and Sth-II values are kept fixed at 0.5. The
performance metrics for these three techniques are compared and shown in
Figure 5.3.
Figure 5.3 Comparison of the DARA, LPM-DARA and NLPM-DARA techniques (a) Utilization (b) Makespan and (c) Failure rate
5.5.4 Comparative Analysis of Averaged Threshold Value
For clarity, as in the previous chapter, the performance metrics are averaged
individually over Sth-III for further analysis of the output of the
NLPM-DARA model. The averaged values of each category of
performance metric are presented in Figure 5.4 (a), (b) and (c).
In general, the nonlinear prediction model (NLPM-DARA) shows a
considerable improvement over LPM-DARA.
LPM-DARA reported a utilization rate between 42% and 75% (Figure 4.9(a)) for
all the datasets, whereas in NLPM-DARA the range narrowed to 70-80%
(Figure 5.4(a)). It is noteworthy that increasing the fuzzy
threshold from 0.3 to 0.7 does not produce a marked difference in the %
utilization of NLPM-DARA (maximum 1.4%) for any of the datasets, whereas
LPM-DARA reported a maximum of 6.2%. This reveals that the
allocation of jobs to sporadic resources remains effective at higher fuzzy
threshold values, where the % utilization of LPM-DARA was
reduced considerably. As in the case of the LPM method, this method also
scored better efficiency at lower fuzzy threshold values.
In the case of the makespan of NLPM-DARA, the structural
pattern of the graph was more or less retained from LPM-DARA (Fig. 4.9c).
Overall, the makespan was reduced by 30 seconds compared with the previous
model, which shows the effectiveness of the improved NLPM-DARA method.
Dataset I produced the best makespan, and the difference between Dataset I
and all other datasets is similar to that of LPM-DARA (approximately 40 seconds).
[Figure 5.4(a): Utilization in % versus STh-II (0.3 to 0.7) for Dataset I to Dataset V; vertical axis from 40 to 80]
[Figure 5.4(b): Makespan in sec versus STh-II (0.3 to 0.7) for Dataset I to Dataset V; vertical axis from 60 to 160]
[Figure 5.4(c): Failure rate versus STh-II (0.3 to 0.7) for Dataset I to Dataset V; vertical axis from 0 to 50]
Figure 5.4 Compilation of performance metrics for (a) Resource utilization in %, (b) Makespan in seconds and (c) Failure rate in %.
5.5.5 Comparison with other Techniques
To compare all the proposed DARA techniques with other
conventional techniques, the mean values of resource utilization, makespan
and failure rate are shown in Figure 5.5.
Figure 5.5 Comparison with other algorithms (a) Mean Utilization (b)
Mean Makespan (c) Mean Failure Rate
From Figure 5.5 it is observed that the NLPM-DARA technique outperforms all the other techniques, namely FCFS, PSO, SA, DARA and LPM-DARA. This is due to the better prediction of the dwelling time of the resources.
5.6 SUMMARY
In this chapter, the previously proposed LPM-DARA was enhanced by replacing the time series prediction model with a nonlinear regression model. The nonlinear regression model was initialized blindly and then fine-tuned to the dwelling times in the historical data. The model thus captured the historical dwelling time of the sporadic resources and hence the nature of those resources. This led to better resource allocation performance in terms of utilization, failure rate and makespan. The experimental validation under different fuzzy thresholds showed that the performance metrics were better than those of the conventional resource allocation mechanisms, including LPM-DARA and heuristic search based resource allocation mechanisms. Hence, a comparison in terms of reliability would naturally also support NLPM-DARA, primarily against LPM-DARA.