ANALYSIS OF HEURISTIC SEARCH TECHNIQUES

DESCRIPTION

This report demonstrates the Travelling Salesman Problem and a scheduling problem solved using the Genetic Algorithm and the Simulated Annealing Algorithm.


INTRODUCTION

1.1 Heuristic

In computer science, artificial intelligence, and mathematical optimization, a heuristic is a technique designed for solving a problem more quickly when classic methods are too slow, or for finding an approximate solution when classic methods fail to find any exact solution. This is achieved by trading optimality, completeness, accuracy, or precision for speed. Heuristic refers to experience-based techniques for problem solving, learning, and discovery that find a solution which is not guaranteed to be optimal, but good enough for a given set of goals. Where exhaustive search is impractical, heuristic methods are used to speed up the process of finding a satisfactory solution via mental shortcuts that ease the cognitive load of making a decision. Examples of this method include using a rule of thumb, an educated guess, an intuitive judgment, stereotyping, profiling, or common sense. More precisely, heuristics are strategies using readily accessible, though loosely applicable, information to control problem solving in human beings and machines [1].

A heuristic method can accomplish its task by using search trees. However, instead of generating all possible solution branches, a heuristic selects branches more likely to produce outcomes than other branches. It is selective at each decision point, picking branches that are more likely to produce solutions.

Many virus scanners use heuristic rules for detecting viruses and other forms of malware. Heuristic scanning looks for code and/or behavioral patterns indicative of a class or family of viruses, with different sets of rules for different viruses. If a file or executing process is observed to contain matching code patterns and/or to be performing that set of activities, then the scanner infers that the file is infected. The most advanced part of behavior-based heuristic scanning is that it can work against highly randomized polymorphic viruses, which simpler string-scanning-only approaches cannot reliably detect. Heuristic scanning has the potential to detect many future viruses without requiring the virus to first be detected somewhere, submitted to the virus scanner developer, analyzed, and a detection update provided to the scanner's users.

The problem of finding an admissible heuristic with a low branching factor for common search tasks has been extensively researched in the artificial intelligence community. Several common techniques are used. Solution costs of sub-problems often serve as useful estimates of the overall solution cost; these are always admissible. For example, a heuristic for a 10-puzzle might be the cost of moving tiles 1-5 into their correct places. A common idea is to use a pattern database that stores the exact solution cost of every sub-problem instance. The solution of a relaxed problem also often serves as a useful admissible estimate of the original. For example, the Manhattan distance heuristic for the n-puzzle comes from a relaxed version of the problem in which we assume each tile can be moved to its position independently of the other tiles.

1.2 Scheduling

Scheduling is the process of generating a schedule; a schedule is a physical document that generally tells when things are to happen and shows a plan for the timing of certain activities. Generally, a scheduling problem can be approached in two steps: in the first step the sequence is planned, i.e., it is decided how to choose the next task; in the second step, the start time, and perhaps the completion time, of each task is planned.
In a scheduling process, the type and amount of each resource should be known so that the feasibility of accomplishing the tasks can be determined. The boundary of a scheduling problem can be efficiently determined once resources are specified. In addition, each task is described in terms of such information as its resource requirement, its duration, the earliest time at which it may start, and the time at which it is due to complete. Any technological constraints (precedence restrictions) that exist among the tasks should also be described. Information about resources and tasks defines a scheduling problem, and its solution is a fairly complex matter [2]. Much of the early progress in the field of scheduling was motivated by manufacturing problems, and hence it is usual to employ the vocabulary of manufacturing when describing scheduling problems. Although scheduling work is now of considerable significance in many non-manufacturing areas, the terminology of manufacturing is still used. Thus, resources are usually called machines and tasks are called jobs. Sometimes jobs may consist of several elementary tasks called operations. The environment of the scheduling problem is called the job shop, or simply the shop.

Scheduling problems in industry contain a set of tasks to be carried out with a set of limited resources available to perform these tasks. The general problem is to determine the timing of the tasks while recognizing the capability of the resources, given the tasks and resources together with some information about uncertainties. This problem usually arises within a decision-making hierarchy in which scheduling follows some earlier, more basic decisions. In industries, analogous decisions are usually said to be part of the planning function. Among other things, the planning function might describe the design of a company's products, the technology available for making and testing the required parts, and the volumes to be produced. In short, the planning function determines the resources available for production and the tasks to be scheduled. The present industrial environment is characterized by competitive markets in which customer requirements and expectations are becoming increasingly high in terms of quality, cost and delivery times [3]. This evolution is reinforced by the rapid development of new information and communication technologies, which provide a direct connection between industries and customers. Hence, industry performance is built on two dimensions:

A technological dimension, whose goal is to develop the intrinsic performance of marketed products in order to satisfy requirements of quality and lower cost of ownership for these products. Technological innovation plays an important role here and can be a differentiating element for market development and penetration. In this regard, it is known that rapid product technological growth and the personalization requirements expected by markets often lead companies to forsake mass production and instead focus on small or medium production runs, even on-demand manufacturing. This requires them to have flexible and progressive production systems, able to adapt to market demands and needs quickly and efficiently.

An organizational dimension, intended for performance development in terms of production cycle times, respect of expected delivery dates, inventory and work-in-process management, adaptation and reactivity to variations in commercial orders, etc. This dimension plays an increasingly important role as markets become more volatile and progressive and require shorter response times from companies. Therefore, industries must have powerful methods and tools at their disposal for production organization and control, focused on satisfying customer needs under the best possible conditions. To achieve these goals, an organization relies on the implementation of a number of functions, among which scheduling plays a very important role. Indeed, the scheduling function is intended to organize the use of human and technological resources in company workshops so as to satisfy customer requirements directly, or demands issued from a production plan prepared by the company planning function. Considering market development and requirements, this function must organize the simultaneous completion of several jobs using flexible resources that are available in limited amounts, which becomes a complex problem to solve. In addition, it is this function which is ultimately responsible for product manufacturing. Its efficiency and failures will therefore highly affect the company's relationship with its customers. Within companies, this function has obviously always been present, but today it must face increasingly complex problems because of the large number of jobs that must be executed simultaneously with shorter manufacturing times.

Fig 1: A Schematic of Job Scheduling

Significance of Scheduling

Scheduling is a decision-making practice that is used on a regular basis in many manufacturing and service industries. Its aim is to optimize one or more objectives through the allocation of resources to tasks over given time periods. The resources and tasks in an organization can take many different forms. The resources may be machines in a workshop, crews at a construction site, processing units in a computing environment, runways at an airport, and so on. The tasks may be operations in a production process, take-offs and landings at an airport, executions of computer programs, stages in a construction project, and so on. Each task can have a definite priority level, an earliest likely starting time and a due date. The objectives can also take many different forms: one objective may be the minimization of the completion time of the last job, and another may be the minimization of the number of jobs completed after their respective due dates. Scheduling plays an important role in most manufacturing and service systems as well as in most information-processing environments [4]. Scheduling derives its importance from two different considerations. First, ineffective scheduling results in poor utilization of available resources; a noticeable symptom is the idleness of facilities, human resources and equipment waiting for orders to be processed, and as a result the cost of production increases. Second, poor scheduling normally creates delays in the flow of some orders through the system, which calls for expediting measures that again increase cost.

1.3 Project Goals and Objectives

The purpose of this project is to select the best heuristic algorithm to solve a given scheduling problem. The application of neural networks may be used to make the selection. Most combinatorial optimization problems are NP-hard, which implies that no polynomial-time algorithm is known that solves them exactly. Heuristic algorithms are therefore developed and used to solve combinatorial problems representing real-world applications. Statistical inference techniques such as ANOVA can also be used to conclude that one heuristic is better than another. Deductions based on statistical inference techniques are valid for an aggregation of several instances, but they are ineffective in selecting the best heuristic algorithm for a specific instance of a combinatorial problem. The problem of selecting the best heuristic algorithm, when several of them exist, is thus a prediction problem.

LITERATURE REVIEW

2.1 Metaheuristics

Metaheuristics are general methods that guide the search through the solution space, using subordinate algorithms, usually local search, as some form of heuristic. Starting from an initial solution built by some heuristic, metaheuristics improve it iteratively until a stopping criterion is met. The stopping criterion can be elapsed time, number of iterations, number of evaluations of the objective function, and so on. Metaheuristics are strategies that perform a typically incomplete search in the space of solutions by iteratively creating and evaluating new candidate solutions. The Genetic Algorithm, which belongs to the family of evolutionary algorithms, can quickly scan a vast solution set and efficiently search the model space, so it is more likely (than local optimization techniques) to converge toward a global optimum. There is no need to linearize the problem, and no need to compute partial derivatives. Bad proposals do not affect the end solution negatively, as they are simply discarded. The inductive nature of the GA means that it does not have to know any rules of the problem; it works by its own internal rules. This is very useful for complex or loosely defined problems. Simulated Annealing, in turn, searches for better solutions in the neighborhood of the current one. SA has the drawback that the obtained solutions are not necessarily global optima, but the advantage that there is only one individual, so the computing effort required is much smaller than for GA. It can deal with arbitrary systems and cost functions and yields a statistically optimal solution. It is relatively easy to code, even for complex problems, and generally gives a good solution. For these reasons, the Genetic Algorithm and Simulated Annealing are the commonly used metaheuristics for flow shop scheduling, each having its own advantages and limitations [5]. A simple genetic algorithm can quickly produce a good result for managers for a complex set of sequence-dependent scheduling problems. GA is a viable approach to solving optimization problems.

Fig 2: Classification of common metaheuristics

2.1.1 Genetic Algorithm

The Genetic Algorithm is a search heuristic that mimics the process of natural selection. This heuristic is routinely used to generate useful solutions to optimization and search problems. In a genetic algorithm, a population of candidate solutions to an optimization problem is evolved toward better solutions. Each candidate solution has a set of properties which can be altered [6]. The evolution starts from a population of randomly generated individuals and is an iterative process; the population in each iteration is called a generation. In each generation, the fitness of every individual in the population is evaluated; the fitness is the value of the objective function in the optimization problem being solved. The more fit individuals are selected from the current population, and each individual's properties are modified (recombined and possibly randomly mutated) to form a new generation. The new generation of candidate solutions is then used in the next iteration of the algorithm. The algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population.

A typical genetic algorithm requires:
1. A genetic representation of the solution domain,
2. A fitness function to evaluate the solution domain.

Once the genetic representation and the fitness function are defined, a GA proceeds to initialize a population of solutions and then to improve it through repetitive application of the mutation, crossover, inversion and selection operators.

1. Initialization. Initially, many individual solutions are randomly generated to form an initial population. The population size depends on the nature of the problem. The population is generated randomly, allowing the entire range of possible solutions.

2. Selection. During each successive generation, a proportion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions are more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. A generic selection procedure may be implemented as follows:
1. The fitness function is evaluated for each individual, providing fitness values. These fitness values are then normalized. Normalization means dividing the fitness value of each individual by the sum of all fitness values, so that the sum of all resulting fitness values equals 1.
2. The population is sorted by descending fitness values.
3. Accumulated normalized fitness values are computed (the accumulated fitness of an individual is its own normalized fitness value plus those of all preceding individuals). The accumulated fitness of the last individual should be 1.
4. A random number R between 0 and 1 is chosen.
5. The selected individual is the first one whose accumulated normalized value is greater than R.
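As an illustration, the five-step selection procedure above can be written directly in MATLAB, the language used for the implementations later in this report. This is a minimal sketch; the function name and the assumption that higher fitness values are better are ours, not part of the procedure's description.

function idx = rouletteSelect(fitness)
% Fitness-proportionate (roulette-wheel) selection following the five
% steps above. 'fitness' holds raw fitness values; higher means fitter.
    p = fitness(:) / sum(fitness);    % 1. normalize so values sum to 1
    [p, order] = sort(p, 'descend');  % 2. sort by descending fitness
    c = cumsum(p);                    % 3. accumulated normalized fitness
    r = rand;                         % 4. random number R between 0 and 1
    k = find(c > r, 1, 'first');      % 5. first individual whose
    idx = order(k);                   %    accumulated value exceeds R
end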

3. Genetic operators. The next step is to generate a second-generation population of solutions from those selected, through a combination of genetic operators: crossover and mutation. For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the previously selected pool. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which shares many of the characteristics of its "parents". New parents are selected for each new child, and the process continues until a new population of solutions of appropriate size is generated. These processes ultimately result in a next-generation population of chromosomes that is different from the initial generation. Generally the average fitness of the population will have increased by this procedure, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions. These less fit solutions ensure genetic diversity within the genetic pool of the parents and therefore ensure the genetic diversity of the subsequent generation of children [7]. It is possible to use other operators such as regrouping, colonization-extinction, or migration in genetic algorithms.

Genetic Algorithm for Sequencing Problems

Chromosomal representation. In order to apply a GA to a sequencing problem, there is an obvious practical difficulty. In most traditional GAs, the chromosomal representation is by means of strings of 0s and 1s, and the result of a genetic operator is still a valid chromosome. This is not the case if one uses a permutation to represent the solution, which is the natural representation for a sequencing problem.

Selection mechanism. Other variations on Ackley's algorithm were also found to be helpful. The original description calls for random selection of parents, the beneficial effects of genetic search being gained by the replacement of "poor" chromosomes. But it is also possible to allow at least one parent to be selected according to its fitness, with lower-valued chromosomes (in this context) being the favored ones. If this idea is adopted, it raises another question: how should relative fitness be measured? The usual method (for maximization) is to measure relative fitness as the ratio of the value of a given chromosome to the population mean. In minimization problems, however, this must be modified so that low-valued chromosomes are the good ones. One approach is to define fitness as $(v_{\max} - v)$, but of course we are unlikely to know what $v_{\max}$ is. We could use the largest value found so far as a surrogate for $v_{\max}$, but in practice this method was not very good at distinguishing between different chromosomes. In essence, if the population consists of many chromosomes whose fitness values are relatively close, the resolution between good and bad ones is not of a high order. This approach was therefore abandoned, in favor of a simple ranking mechanism. Sorting the chromosomes is a simple task, and updating the list is also straightforward. The selection of parents was then made in accordance with the probability distribution

$$p([k]) = \frac{2k}{M(M+1)}$$

where $[k]$ is the $k$th chromosome in ascending order of fitness (i.e. descending order of makespan) and $M$ is the population size. This implies that the median chromosome has a chance of $1/M$ of being selected, while the $M$th (the fittest) has a chance of $2/(M+1)$, roughly twice that of the median.

The ranking mechanism was also used in selecting the chromosome which is to be terminated. Ackley chose from those chromosomes whose fitness was below average in the sense of the arithmetic mean. When a ranking of the chromosomes is already available, it seems sensible to choose from chromosomes which are below the median [8].

Mutation. As already indicated, a mutation operator was used to reduce the rate of convergence. In traditional GAs, mutation is applied by flipping each element of the chromosome from 0 to 1 (or vice versa) with a small probability. With a permutation representation, mutation needs to be defined differently. Two types of mutation were tried. The first, an exchange mutation, was a simple exchange of two elements of the permutation, chosen at random. The second, a shift mutation, was a shift of one element (chosen randomly) a random number of places to the right or left. A few experiments were made in which shift seemed to be better than exchange, so this was adopted.

Initial population. The final modification investigated was in the selection of the initial population. Most GAs in other contexts assume that the initial population is chosen completely at random, and this was done in some experiments with the algorithm described here. However, it also seemed worth trying the effect of "seeding" the initial population with a good solution generated by a constructive heuristic. Several such heuristics are known, but the consensus seems to be that the best is the NEH algorithm due to Nawaz et al. [13]. Accordingly, another version of the GA was tested in which one solution was obtained from this heuristic, while the remaining (M - 1) were generated randomly. In a few trials of this procedure, the one with the seeded population appeared to arrive at its final solution rather more quickly, with no observed diminution in solution quality. This modification was therefore also included in the final version of the algorithm. A pseudo-code description of the version finally implemented is given in an appendix to this paper.
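The two permutation mutations just described are easy to state in MATLAB. The sketch below is ours; the function names and the assumption of a row-vector permutation are illustrative, not taken from the report.

function s = exchangeMutation(s)
% Exchange mutation: swap two randomly chosen elements of the
% permutation s (assumed to be a row vector).
    k = randperm(numel(s), 2);
    s([k(1) k(2)]) = s([k(2) k(1)]);
end

function s = shiftMutation(s)
% Shift mutation: remove one randomly chosen element and re-insert it
% at a random position, shifting the elements in between.
    from = randi(numel(s));
    v = s(from);
    s(from) = [];
    to = randi(numel(s) + 1);
    s = [s(1:to-1), v, s(to:end)];
end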

2.1.2 Simulated Annealing Algorithm

Simulated annealing is a generic probabilistic metaheuristic for the global optimization problem of locating a good approximation to the global optimum of a given function in a large search space. For certain problems, simulated annealing may be more efficient than exhaustive enumeration, provided that the goal is merely to find an acceptably good solution in a fixed amount of time, rather than the best possible solution [9]. The name and inspiration come from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects. Both are attributes of the material that depend on its thermodynamic free energy. Heating and cooling the material affects both the temperature and the thermodynamic free energy. While the same amount of cooling brings the same amount of decrease in temperature, it will bring a bigger or smaller decrease in the thermodynamic free energy depending on the rate at which it occurs, with a slower rate producing a bigger decrease. This notion of slow cooling is implemented in the Simulated Annealing algorithm as a slow decrease in the probability of accepting worse solutions as it explores the solution space. Accepting worse solutions is a fundamental property of metaheuristics because it allows a more extensive search for the optimal solution.

1. The basic iteration. At each step, the SA heuristic considers some neighbouring state $s'$ of the current state $s$, and probabilistically decides between moving the system to state $s'$ or staying in state $s$. These probabilities ultimately lead the system to move to states of lower energy. Typically this step is repeated until the system reaches a state that is good enough for the application, or until a given computation budget has been exhausted.

2. The neighbours of a state. The neighbours of a state are new states of the problem that are produced by altering a given state in some well-defined way. For example, in the traveling salesman problem each state is typically defined as a permutation of the cities to be visited, and the neighbours of a state are the set of permutations produced, for example, by reversing the order of any two successive cities. The well-defined way in which the states are altered to find neighbouring states is called a "move", and different moves give different sets of neighbouring states. These moves usually result in minimal alterations of the last state, as in the previous example, in order to help the algorithm keep the better parts of the solution and change only the worse parts. In the traveling salesman problem, the parts of the solution are the city connections. Searching for neighbours of a state is fundamental to optimization because the final solution will come after a tour of successive neighbours. Simple heuristics move by finding best neighbour after best neighbour, and stop when they have reached a solution which has no better neighbours. The problem with this approach is that the neighbours of a state are not guaranteed to contain any of the existing better solutions, which means that failure to find a better solution among them does not guarantee that no better solution exists. This is why the best solution found by such algorithms is called a local optimum, in contrast with the actual best solution, which is called a global optimum. Metaheuristics use the neighbours of a solution as a way to explore the solution space, and although they prefer better neighbours they also accept worse neighbours in order to avoid getting stuck in local optima. As a result, if the algorithm is run for an infinite amount of time, the global optimum will be found.
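For concreteness, the TSP move just mentioned (reversing the order of two successive cities) can be sketched in MATLAB as follows; the function name is illustrative.

function route = swapSuccessive(route)
% Produce a neighbouring tour by reversing the order of two
% successive cities in the permutation 'route'.
    k = randi(numel(route) - 1);       % position of the first city
    route([k k+1]) = route([k+1 k]);   % swap it with its successor
end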

3. Acceptance probabilities. The probability of making the transition from the current state $s$ to a candidate new state $s'$ is specified by an acceptance probability function $P(e, e', T)$, which depends on the energies $e = E(s)$ and $e' = E(s')$ of the two states, and on a global time-varying parameter $T$ called the temperature. States with a smaller energy are better than those with a greater energy. The probability function $P$ must be positive even when $e'$ is greater than $e$. This feature prevents the method from becoming stuck at a local minimum that is worse than the global one [10]. When $T$ tends to zero, the probability $P(e, e', T)$ must tend to zero if $e' > e$, and to a positive value otherwise. For sufficiently small values of $T$, the system will then increasingly favor moves that go "downhill" (i.e., to lower energy values), and avoid those that go "uphill." With $T = 0$ the procedure reduces to the greedy algorithm, which makes only the downhill transitions. In the original description of SA, the probability $P(e, e', T)$ was equal to 1 when $e' < e$, i.e., the procedure always moved downhill when it found a way to do so, irrespective of the temperature. Many descriptions and implementations of SA still take this condition as part of the method's definition; however, it is not essential for the method to work. The function $P$ is usually chosen so that the probability of accepting a move decreases when the difference $e' - e$ increases, that is, small uphill moves are more likely than large ones. However, this requirement is not strictly necessary, provided that the above requirements are met. Given these properties, the temperature $T$ plays a crucial role in controlling the evolution of the state $s$ of the system with regard to its sensitivity to the variations of system energies [11]. To be precise, for a large $T$, the evolution of $s$ is sensitive to coarser energy variations, while it is sensitive to finer energy variations when $T$ is small. (A MATLAB sketch of a common acceptance rule is given after the annealing schedule below.)

4. The annealing schedule. The name and inspiration of the algorithm demand an interesting feature related to the temperature variation to be embedded in the operational characteristics of the algorithm. This necessitates a gradual reduction of the temperature as the simulation proceeds. The algorithm starts with $T$ set to a high value (or infinity), which is then decreased at each step following some annealing schedule, which may be specified by the user but must end with $T = 0$ towards the end of the allotted time budget. In this way, the system is expected to wander initially towards a broad region of the search space containing good solutions, ignoring small features of the energy function; then drift towards low-energy regions that become narrower and narrower; and finally move downhill according to the steepest descent heuristic.
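A common concrete choice for $P$, used in the classical formulation (assumed here as an example, not mandated by the method), is the Metropolis-style rule: $P = 1$ if $e' < e$, and $P = \exp(-(e' - e)/T)$ otherwise. In MATLAB this is a short function:

function accept = acceptMove(e1, e2, T)
% Classical acceptance rule: always accept downhill moves; accept
% uphill moves with probability exp(-(e2 - e1)/T).
    if e2 < e1
        accept = true;
    else
        accept = rand < exp(-(e2 - e1) / T);
    end
end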

2.2 PROBLEMS TAKEN

2.2.1 Multiprocessor Scheduling Problem

The multiprocessor scheduling problem consists of finding an optimal distribution of tasks over a set of processors. The number of processors and the number of tasks are given, and the time taken by a processor to finish a task is also given as input. Every processor runs independently, but each can only run one job at a time. We call an assignment of all jobs to the available processors a "schedule". The objective of the problem is to determine the shortest schedule for the given set of tasks [12].

A mathematical statement of the problem can be made as follows. Let $M = \{m_1, m_2, \dots, m_p\}$ and $J = \{j_1, j_2, \dots, j_n\}$ be two finite sets. On account of the industrial origins of the problem, the $m_i$ are called processors and the $j_k$ are called tasks. Let $X$ denote the set of all sequential assignments of tasks to processors, such that every task is done by every processor exactly once; elements $x \in X$ may be written as $n \times p$ matrices, in which column $i$ lists the tasks that processor $m_i$ will do, in order. For example, the matrix

$$x = \begin{pmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 1 \end{pmatrix}$$

means that processor $m_1$ will do the three tasks $j_1, j_2, j_3$ in the order $j_1, j_2, j_3$, while processor $m_2$ will do the tasks in the order $j_2, j_3, j_1$. Suppose also that there is some cost function $C : X \to [0, +\infty]$. The cost function may be interpreted as a "total processing time", and may have some expression in terms of times $C_{ik}$, the cost/time for processor $m_i$ to do task $j_k$. The multiprocessor scheduling problem is to find an assignment of tasks $x \in X$ such that $C(x)$ is a minimum, that is, there is no $y \in X$ such that $C(y) < C(x)$.

2.2.2 Travelling Salesman Problem

Given a set of cities and the distance between every pair of cities, the problem is to find the shortest possible route that visits every city exactly once and returns to the starting point. The problem was first formulated in 1930 and is one of the most intensively studied problems in optimization [13]. It is used as a benchmark for many optimization methods. Even though the problem is computationally difficult, a large number of heuristics and exact methods are known, so that some instances with tens of thousands of cities can be solved completely, and even problems with millions of cities can be approximated within a small fraction of 1%.

TSP can be formulated as an integer linear program. Label the cities with the numbers $0, \dots, n$ and define:

$$x_{ij} = \begin{cases} 1 & \text{if the tour goes from city } i \text{ to city } j \\ 0 & \text{otherwise.} \end{cases}$$

For $i = 0, \dots, n$, let $u_i$ be an artificial variable, and finally take $c_{ij}$ to be the distance from city $i$ to city $j$. Then TSP can be written as the following integer linear programming problem:

$$\min \sum_{i=0}^{n} \sum_{j=0,\, j \neq i}^{n} c_{ij} x_{ij}$$

$$\text{s.t.}\quad \sum_{i=0,\, i \neq j}^{n} x_{ij} = 1, \quad j = 0, \dots, n;$$

$$\sum_{j=0,\, j \neq i}^{n} x_{ij} = 1, \quad i = 0, \dots, n;$$

$$u_i - u_j + n\, x_{ij} \leq n - 1, \quad 1 \leq i \neq j \leq n;$$

$$x_{ij} \in \{0, 1\}; \quad u_i \in \mathbb{Z}.$$

The first set of equalities requires that each city be arrived at from exactly one other city, and the second set of equalities requires that from each city there is a departure to exactly one other city. The last constraints enforce that there is only a single tour covering all cities, and not two or more disjoint tours that only collectively cover all cities. To prove this, it is shown below (1) that every feasible solution contains only one closed sequence of cities, and (2) that for every single tour covering all cities, there are values for the dummy variables $u_i$ that satisfy the constraints [14].

To prove that every feasible solution contains only one closed sequence of cities, it suffices to show that every subtour in a feasible solution passes through city 0 (noting that the equalities ensure there can only be one such tour). For if we sum all the inequalities corresponding to $x_{ij} = 1$ for any subtour of $k$ steps not passing through city 0, we obtain

$$nk \leq (n - 1)k,$$

which is a contradiction.

It now must be shown that for every single tour covering all cities, there are values for the dummy variables $u_i$ that satisfy the constraints. Without loss of generality, define the tour as originating (and ending) at city 0. Choose $u_i = t$ if city $i$ is visited in step $t$ ($i, t = 1, 2, \dots, n$). Then

$$u_i - u_j \leq n - 1,$$

since $u_i$ can be no greater than $n$ and $u_j$ can be no less than 1; hence the constraints are satisfied whenever $x_{ij} = 0$. For $x_{ij} = 1$, we have

$$u_i - u_j + n\, x_{ij} = (t) - (t + 1) + n = n - 1,$$

satisfying the constraint.

PROJECT DESIGN AND IMPLEMENTATION

3.1 Solving the Multiprocessor Scheduling Problem

3.1.1 Algorithm I: Simulated Annealing

The custom annealing function for the multiprocessor scheduling problem takes a job schedule as input. The annealing function then modifies this schedule and returns a new schedule that has been changed by an amount proportional to the temperature. An objective function is also created: it returns the total time required for a given schedule, which is the maximum of the times that each processor spends on its tasks.

First, we assume that the tasks are labeled according to the topological order in the task graph. All the tasks assigned to the same processor are scheduled sequentially based on their labels. Second, the temperature goes down in every generation during execution; this differs from the standard SA approach, in which the temperature is held constant until the system reaches a steady state. Third, in every generation, a new solution is created by randomly modifying the current solution (moving a task, or swapping two tasks). The new solution is evaluated by the fitness function, and the accept function decides whether the new solution is acceptable, based on the evaluation result. It is accepted only when its fitness value is higher than the current one, or lower than the current one within an acceptance threshold. If it is accepted, the new solution replaces the current one; otherwise, it is discarded [15]. Fourth, the algorithm stops after the temperature reaches a predefined value.
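Under one possible encoding (ours, for illustration: assign(i) is the processor running task i and taskTime(i) is its duration, ignoring the precedence constraints mentioned above), the objective and the random move could be sketched in MATLAB as follows:

function t = makespan(assign, taskTime, nProc)
% Objective: total schedule time, i.e. the maximum over all processors
% of the summed durations of the tasks assigned to each processor.
    t = 0;
    for p = 1:nProc
        t = max(t, sum(taskTime(assign == p)));
    end
end

function assign = randomMove(assign, nProc)
% Neighbour move: reassign one randomly chosen task to a random processor.
    i = randi(numel(assign));
    assign(i) = randi(nProc);
end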

The procedure of simulated annealing is as follows (a MATLAB sketch of the loop is given after the steps):

Step 1: Initialize solution.

Step 2: Estimate initial temperature.

Step 3: Evaluate solution.

Step 4: If the new solution is accepted, update the old one.

Step 5: Adjust temperature.

Step 6: If the temperature reaches a pre-defined value, then stop the search; otherwise, generate a new solution and go to step 3.
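Putting the six steps together, and reusing the makespan, randomMove and acceptMove sketches from above, one hedged rendering of the loop is the following (all parameter values are illustrative, not those used in the experiments):

% Assumed inputs: nTasks, nProc, and a duration vector taskTime(1:nTasks).
T     = 100;                                 % step 2: initial temperature
Tmin  = 1e-3;                                % stopping temperature (step 6)
alpha = 0.95;                                % geometric cooling rate (step 5)
assign = randi(nProc, 1, nTasks);            % step 1: random initial schedule
cost   = makespan(assign, taskTime, nProc);  % step 3: evaluate solution
while T > Tmin
    cand = randomMove(assign, nProc);        % step 6: generate new solution
    c    = makespan(cand, taskTime, nProc);
    if acceptMove(cost, c, T)                % step 4: accept/update
        assign = cand;  cost = c;
    end
    T = alpha * T;                           % step 5: adjust temperature
end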

3.1.2 Algorithm II: Genetic Algorithm

Genetic algorithms try to mimic the natural evolution process and generally start with an initial population of individuals, which can either be generated randomly or based on some other algorithm. Each individual is an encoding of a set of parameters that uniquely identify a potential solution of the problem [16]. In each generation, the population goes through the processes of crossover, mutation, fitness evaluation and selection. During crossover, parts of two individuals of the population are exchanged in order to create two entirely new individuals, which replace the individuals from which they evolved; each individual is selected for crossover with a probability given by the crossover rate. Mutation alters one or more genes in a chromosome with a probability given by the mutation rate. For example, if the individual is an encoding of a schedule, two tasks are picked randomly and their positions are interchanged. A fitness function calculates the fitness of each individual, i.e., it decides how good a particular solution is. In the selection process, each individual of the current population is selected into the new population with a probability proportional to its fitness. The selection process ensures that individuals with higher fitness values have a higher probability of being carried into the next generation, while individuals with lower fitness values are dropped out. The new population created in this manner constitutes the next generation, and the whole process is terminated either after a fixed number of generations or when a stopping criterion is met. The population after a large number of generations is very likely to contain individuals with very high fitness values, which implies that the solutions represented by those individuals are good [17].

The algorithm used to solve the scheduling problem is as follows:

Step 1: Initialize the population. For initializing the population, it is necessary to input the number of processors, the number of jobs and the population size.

Step 2: Evaluate the fitness function with the generated population.

Step 3: Perform the selection process to select the best individuals, based on the fitness evaluated, to participate in the next generation, and eliminate the inferior ones. The job with the minimal finishing time and waiting time is the best individual corresponding to a particular generation.

Step 4: Apply crossover to produce new offspring. Two crossover points are generated uniformly at random in the mated parents, and then the two parents exchange the centre portion between these crossover points to create two new children (see the sketch after this list). Newly produced children are passed to the mutation process.

Step 5: Perform the mutation operation to further create new offspring, which is necessary for adding diversity to the solution set.

Step 6: Test for the stopping condition. The stopping condition may be obtaining the best fitness value with minimum finishing time and minimum waiting time for the given objective function. If the stopping condition is satisfied, go to step 7; else go to step 2.

Step 7: Declare the best individual over the complete set of generations. Stop.
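The two-point crossover of Step 4 is straightforward under the processor-assignment encoding assumed in the earlier sketches (each gene is a processor index, so exchanging segments always yields valid children):

function [c1, c2] = twoPointCrossover(p1, p2)
% Two-point crossover: pick two cut points uniformly at random and
% exchange the centre portion between the parents p1 and p2.
    pts = sort(randperm(numel(p1), 2));
    mid = pts(1):pts(2);
    c1 = p1;  c1(mid) = p2(mid);
    c2 = p2;  c2(mid) = p1(mid);
end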

3.2 Solving the Travelling Salesman Problem

3.2.1 Algorithm I: Genetic Algorithm

Step 1: Plot the border of the country and generate random locations of cities inside the border. Given the locations, we can further calculate the distance matrix.

Step 2: Create and register the three required functions (creation, crossover and mutation) so that GA can work with the custom permutation representation.

Step 3: Permutation Function: Create a cell array, say P, where each element represents an ordered set of cities as a permutation vector [18]. The salesman will travel in the order specified in P{i}.

Step 4: Crossover Function: It takes a cell array, the population and returns a cell array, the children that result from the crossover.

Step 5: Mutation Function: It takes an individual, which is an ordered set of cities, and returns a mutated ordered set.

Step 6: Create a fitness function. The fitness of an individual is the total distance traveled for an ordered set of cities; the fitness function also needs the distance matrix to calculate the total distance (a sketch is given after this list).

Step 7: GA will call the fitness function to determine the optimal route.

A custom plot function is used to plot the locations of the cities and the current best route.
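A minimal version of the Step 6 fitness function might look as follows, where route is one of the permutation vectors P{i} and D is the precomputed distance matrix (variable names are illustrative):

function d = tourLength(route, D)
% Fitness: total length of the closed tour defined by the permutation
% 'route', using the precomputed city-to-city distance matrix D.
    d = D(route(end), route(1));       % closing leg back to the start
    for k = 1:numel(route) - 1
        d = d + D(route(k), route(k+1));
    end
end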

3.2.2 Algorithm II: Simulated Annealing

Step 1: Start with a random tour through the selected cities.

Step 2: Pick a new candidate tour at random from all neighbors of the existing tour. This candidate tour might be better or worse than the existing tour.

Step 3: Create a fitness function and use it to check whether the candidate tour is better than the existing tour. If it is better, accept it as the new tour.

Step 4: If the candidate tour is worse than the existing tour, it may still be accepted, according to some probability [19]. The probability of accepting an inferior tour is a function of how much longer the candidate is compared to the current tour, and of the temperature of the annealing process. A higher temperature makes you more likely to accept an inferior tour.

Step 5: Go back to step 2 and repeat many times, lowering the temperature a bit at each iteration, until you get to a low temperature and arrive at your minimum.

EXPERIMENTATION

4.1 Software Requirements

4.1.1 MATLAB

MATLAB (Matrix Laboratory) is a numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, Java, and Fortran. Although MATLAB is intended primarily for numerical computing, an optional toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing capabilities. An additional package, Simulink, adds graphical multi-domain simulation and Model-Based Design for dynamic and embedded systems [20].

4.2 Implementing Multiprocessor Scheduling

4.2.1 Simulated Annealing

Fig 3: Multiprocessor Scheduling of 40 tasks and 11 processors through SA

Fig 4: Simulated Annealing Graphical Solver

4.2.2 Genetic Algorithm

Fig 5: Multiprocessor Scheduling of 40 tasks and 11 processors through GA

Fig 6: Genetic Algorithm Graphical Solver

4.3 Implementing Travelling Salesman Problem

4.3.1 Genetic Algorithm

Fig 7: Shortest Route Calculated through GA

Fig 8: Travelling Salesman Problem output through GA

RESULTS

Comparing GA and SA by solving the Multiprocessor Scheduling Problem

Fig 9: Chart for job scheduling with different number of processors and jobs using GA

Fig 10: Chart for job scheduling with different number of processors and jobs using SA

Table 2: Comparison of SA and GA on Multiprocessor Scheduling

CONCLUSIONS

By comparing the performance of GA and SA on the above problem, it was found that the execution of SA is superior to that of GA. SA and GA are close relatives, and quite a bit of their difference is superficial. The two methodologies are typically described in ways that look entirely different and use entirely different terminology: with SA, one usually talks about solutions, their costs, and neighbors and moves; with GA, one talks about individuals (or chromosomes), their fitness, and selection, crossover and mutation. This difference in terminology obviously reflects the differences in emphasis, but it also serves to obscure the similarities and the genuine differences between SA and GA. Essentially, SA can be thought of as a GA where the population size is one: the current solution is the only individual in the population, and since there is only one individual, there is no crossover, only mutation. This is indeed the key distinction between SA and GA. While SA creates a new solution by modifying only one solution with a local move, GA creates solutions by combining two distinct solutions. Whether this actually makes the algorithm better or worse is not straightforward, but depends on the problem and the representation. It should be noted that both SA and GA share the fundamental assumption that good solutions are more likely to be found "close" to known good solutions than by randomly selecting from the entire solution space. If this were not the case for a particular problem or representation, they would perform no better than random sampling [21].

Advantages of GA:
1. It always gives a solution, and the solution gets better with time.
2. It supports multi-objective optimization.
3. It is more useful and efficient when the search space is large, complex and poorly known, or when no mathematical analysis is available.
4. GA is well suited to, and has been extensively applied to, complex design optimization problems because it can handle both discrete and continuous variables and nonlinear objective functions without requiring gradient information.

Advantages of SA:
1. It statistically guarantees finding an optimal solution.
2. It is relatively easy to code, even for complex problems.
3. SA can deal with nonlinear models and unordered data with many constraints.
4. Its main advantages over other local search methods are its flexibility and its ability to approach global optimality.
5. It is versatile because it does not depend on any restrictive properties of the model.

Limitations of GA:
1. When the fitness function is not properly defined, GA may converge towards local optima.
2. Operation on dynamic sets is difficult.
3. GA is not an appropriate choice for constraint-based optimization problems.

Limitations of SA:
1. SA is not that useful when the energy landscape is smooth or there are few local minima.
2. SA is a metaheuristic approach, so many choices are needed to turn it into an actual algorithm.
3. There is a trade-off between the quality of the solutions and the time needed to compute them.
4. More customization work is needed for different varieties of constraints, and the parameters of the algorithm have to be fine-tuned.
5. The precision of the numbers used in the implementation has a major effect on the quality of the result.

FUTURE PROSPECTS

This project aims to compare the performance of heuristic search algorithms and to analyse them. The performance of SA and GA was compared using the Multiprocessor Scheduling Problem. According to the analysis, based on variations such as the number of processors and the number of tasks, it was found that SA performs better than GA. The Travelling Salesman Problem was implemented using GA, wherein the shortest route was calculated graphically. By also implementing TSP through SA, we can compare the performance of SA and GA based on various parameters such as the time taken to calculate the route, the variation in time taken as the number of cities is increased, and the distance calculated by each algorithm. The one which gives a good solution in the least amount of time is the better algorithm. SA works better since it can attain the global optimum and does not get stuck in a local optimum.

REFERENCES

[1] Melanie Mitchell, An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA, 1998.
[2] Emile H. Aarts and Jan Korst, Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing. John Wiley and Sons, New York, NY, USA, 1989.
[3] International Journal of Applied Information Systems, http://www.ijsacs.org/vol2issue2/paper23.pdf
[4] On Performance Analysis of Hybrid Intelligent Algorithms, http://www.wseas.org/multimedia/journals/computers/2012/54-908.pdf
[5] Performance Study of Task Scheduling, http://www.dcc.fc.up.pt/~ines/aulas/0708/CG/papers/sched2008.pdf
[6] Hou, E., Ansari, N., and Ren, H., 1994. A genetic algorithm for multiprocessor scheduling. IEEE Transactions on Parallel and Distributed Systems 5(2), 113-120.
[7] Fan Yang, 2010. Solving Traveling Salesman Problem Using Parallel Genetic Algorithm and Simulated Annealing.
[8] Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G., and Shmoys, D. B., 1985. The Traveling Salesman Problem. John Wiley & Sons, Chichester.
[9] Simulated Annealing Methods, http://www.cs.ubbcluj.ro/~csatol/mestint/pdfs/Numerical_Recipes_Simulated_Annealing.pdf
[10] International Journal of Computer Theory and Engineering, http://www.ijcte.org/papers/713-L329.pdf
[11] M. Srinivas and L. M. Patnaik, "Genetic algorithms: A survey," Computer, vol. 27, pp. 17-26, 1994.
[12] M. C. Golumbic, Algorithmic Graph Theory and Perfect Graphs, Second Edition. Elsevier, 2004.
[13] N. Sureja and B. Chawda, "Random Travelling Salesman Problem using SA," International Journal of Emerging Technology and Advanced Engineering, Volume 2, Issue 4, April 2012.
[19] TSPLIB95: Ruprecht-Karls-Universität Heidelberg (2011), http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/
[20] Danneberg, D., Tautenhahn, T., and Werner, F., 1999. A comparison of heuristic algorithms for flow shop scheduling problems with setup times and limited batch size. Mathematical and Computer Modelling 29(9), 101-126.
[21] French, S., 1982. Sequencing and Scheduling: An Introduction to the Mathematics of the Job-Shop. Ellis Horwood Limited.
[22] Goldberg, D. E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, MA.