# A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems

Post on 27-Jun-2015

DESCRIPTION

Paper presented at MISTA2013, Gent. In this paper, we present a method based on Learning Automata to solve Hybrid Flexible Flowline Scheduling Problems (HFFSP) with additional constraints such as sequence-dependent setup times, precedence relations between jobs, and machine eligibility. This category of production scheduling problems is noteworthy because it involves several types of constraints that occur in complex real-life production scheduling problems, such as those in the process industry and batch production. In the proposed technique, Learning Automata play a dispersion game to determine the order in which jobs are processed, so that the makespan is minimized and precedence constraint violations are avoided. Experiments on a set of benchmark problems indicate that this method can yield better results than the best ones known so far.

TRANSCRIPT

<ul><li> A Reinforcement Learning Approach to Solving Hybrid Flexible Flowline Scheduling Problems. Bert Van Vreckem, Dmitriy Borodin, Wim De Bruyn, Ann Nowé </li></ul>
<p> Authors: Bert Van Vreckem, HoGent Business and Information Management, bert.vanvreckem@hogent.be; Dmitriy Borodin, OMPartners, dborodin@ompartners.com; Wim De Bruyn, HoGent Business and Information Management, wim.debruyn@hogent.be; Ann Nowé, Artificial Intelligence Lab, Vrije Universiteit Brussel, ann.nowe@vub.ac.be. HFFSP, MISTA2013: 29 August 2013.
Contents: 1 Hybrid Flexible Flowline Scheduling Problems; 2 A Machine Learning Approach; 3 Learning Permutations with Precedence Constraints; 4 Experiments & results; 5 Conclusion.
Hybrid Flexible Flowline Scheduling Problems. A powerful model for complex real-life production scheduling problems. In $\alpha/\beta/\gamma$ notation (Urlings, 2010): $HFFL_m, ((RM^{(i)})_{i=1}^{(m)}) / M_j, rm, prec, S_{iljk}, A_{iljk}, lag / C_{max}$.
Flowline scheduling problems: jobs are processed in consecutive stages (Stage 1, Stage 2, Stage 3, Stage 4).
Hybrid case: unrelated parallel machines in each stage (M11, M12, M13; M21, M22; M31, M32, M33, M34; M41, M42).
Flexible case: stages may be skipped (example: M11, M12, M13; M21, M22; M41, M42, with stage 3 skipped).
Other constraints: machine eligibility (example: M11, M13, M21, M22, M31, M33, M42).
Other constraints: time lag between stages.
Other constraints: sequence-dependent setup times. [Figure: Gantt charts on a time axis from 1 to 12, comparing the job orders J1, J2 and J2, J1 on machines M1 and M2]
Other constraints: precedence relations between jobs.
Precedence relations between jobs make the problem much harder, to the point that a MILP/CPLEX approach no longer works for larger instances (Urlings, 2010).
A Machine Learning Approach. Scheduling Hybrid Flexible Flowline Scheduling Problems takes two stages: job permutations, learned by Learning Automata; and machine assignment, using Earliest Preparation Next Stage (EPNS) (Urlings, 2010).
Reinforcement learning. At every discrete time step $t$: the agent perceives the environment state $s(t)$; the agent chooses an action $a(t) \in A = \{a_1, \ldots, a_n\}$ according to some policy; the environment places the agent in a new state $s(t+1)$ and gives a reinforcement $r(t)$. Goal: learn a policy that maximizes the long-term cumulative reward $\sum_t r(t)$. [Figure: agent-environment loop with state $s$, reinforcement $r$ and action $a$]
Learning Automata (LA). Reinforcement learning agents that choose an action according to a probability distribution $p(t) = (p_1(t), \ldots, p_n(t))$, with $p_i = \mathrm{Prob}[a(t) = a_i]$ and such that $\sum_{i=1}^{n} p_i = 1$:
$p_i(0) = \frac{1}{n}$ (1)
$p_i(t+1) = p_i(t) + \alpha_{rew}\, r(t)\,(1 - p_i(t)) - \alpha_{pen}\,(1 - r(t))\, p_i(t)$ (2), if $a_i$ is the action taken at instant $t$
$p_j(t+1) = p_j(t) - \alpha_{rew}\, r(t)\, p_j(t) + \alpha_{pen}\,(1 - r(t)) \left[\frac{1}{n-1} - p_j(t)\right]$ (3), if $a_j \neq a_i$
Learning Automaton update. [Figure: bar charts of the probabilities $p_i$ (vertical axis from 0 to 1) over the actions $i = 1, \ldots, 4$: before the update, and after action 3 was chosen, once with $r(t) = 1$ and once with $r(t) = 0$]
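The Learning Automaton update of equations (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `la_update` and its parameter names are hypothetical, and the default learning rates are simply the values quoted on the Experiments slide (rew = 0.1, pen = 0.5). The example mirrors the slide in which action 3 is chosen from four equally likely actions.

```python
def la_update(p, chosen, r, a_rew=0.1, a_pen=0.5):
    """Return the probability vector after one Learning Automaton update.

    p      -- current action probabilities (must sum to 1)
    chosen -- index of the action taken at this time step
    r      -- reinforcement in [0, 1]
    a_rew, a_pen -- reward and penalty rates (hypothetical parameter names)
    """
    n = len(p)
    updated = []
    for j, pj in enumerate(p):
        if j == chosen:
            # Equation (2): update for the action that was taken.
            nxt = pj + a_rew * r * (1.0 - pj) - a_pen * (1.0 - r) * pj
        else:
            # Equation (3): update for every other action.
            nxt = pj - a_rew * r * pj + a_pen * (1.0 - r) * (1.0 / (n - 1) - pj)
        updated.append(nxt)
    return updated


# Equation (1): uniform start over n = 4 actions.
p0 = [0.25] * 4
# Action 3 (index 2) was chosen, once fully rewarded and once fully penalised.
rewarded = la_update(p0, chosen=2, r=1.0)
penalised = la_update(p0, chosen=2, r=0.0)
print(rewarded)
print(penalised)
```

With $r(t) = 1$ the chosen action gains probability mass at the expense of the others; with $r(t) = 0$ it loses mass, which is redistributed over the remaining actions. In both cases the vector still sums to 1. Setting `a_pen=0` gives the Linear Reward-Inaction scheme.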
Probabilistic Basic Simple Strategy (PBSS) (Wauters, 2012). A LA is assigned to every position of a permutation. The LAs play a dispersion game in which each one chooses a unique action, resulting in a permutation. The quality of the solution is evaluated. The probabilities are updated according to the LA update rule, using Linear Reward-Inaction (penalty parameter $\alpha_{pen} = 0$): if the result is better than the best one so far, $r(t) = 1$; if not, $r(t) = 0$. Repeat until convergence.
PBSS performs well on several optimization problems that involve learning permutations, but does not work well when precedence constraints are involved: PBSS only learns from positive experience (i.e. improving on previous solutions) and does not learn to avoid invalid permutations.
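The slides do not spell out how the dispersion game makes every automaton end up with a unique action. The sketch below therefore assumes one simple possibility purely for illustration: the automata pick in position order, each sampling from its own probabilities renormalised over the jobs that are still free. The actual PBSS protocol of (Wauters, 2012) may differ, and the name `sample_permutation` is hypothetical.

```python
import random


def sample_permutation(prob_rows, rng):
    """Draw a permutation with one automaton (probability row) per position.

    Assumption for illustration: positions choose sequentially, and each
    choice is restricted to the jobs that no earlier position has taken.
    """
    free = list(range(len(prob_rows)))  # jobs not yet assigned to a position
    permutation = []
    for row in prob_rows:
        weights = [row[job] for job in free]
        if sum(weights) <= 0.0:
            # No probability mass left on the free jobs: fall back to uniform.
            weights = [1.0] * len(free)
        job = rng.choices(free, weights=weights, k=1)[0]
        permutation.append(job)
        free.remove(job)
    return permutation


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_jobs = 5
# Every automaton starts from the uniform distribution p_i(0) = 1/n.
probs = [[1.0 / n_jobs] * n_jobs for _ in range(n_jobs)]
perm = sample_permutation(probs, rng)
print(perm)
```

Whatever the probabilities are, the result is a valid permutation of the job indices, which is the property the dispersion game is meant to guarantee.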
Extending PBSS for precedence constraints. Updating probabilities: if the job permutation is invalid, perform an update with $r(t) = 0$ and $\alpha_{pen} > 0$ for all agents that are involved in the violation of precedence constraints. If the job permutation is valid, perform an LRI update in all agents, depending on the resulting makespan $ms$ and the best makespan until now $ms_{best}$: improved: $r(t) = 1$; equally good: $r(t) = 1/2$; worse: $r(t) = \frac{ms_{best}}{2\,ms}$; no valid schedule found: $r(t) = 0$.
Experiments. HFFSP benchmark problems from (Ruiz et al., 2008), available at http://soa.iti.es/problem-instances: problem sets with 5, 7, 9, 11, 13 and 15 jobs, 96 instances in each set, plus other constraints that make the problems harder (precedence relations!). $\alpha_{rew} = 0.1$; $\alpha_{pen} = 0.5$ (no tuning). Each run continues until convergence, or for at most 300 seconds.
Results.
| Instance set | 5 | 7 | 9 | 11 | 13 | 15 | overall |
| --- | --- | --- | --- | --- | --- | --- | --- |
| mean RD (%) | 0.0697 | 2.0131 | 1.1568 | 1.6565 | 3.7294 | 7.9189 | 2.7484 |
| best RD (%) | -35.70 | -24.71 | -26.92 | -21.10 | -43.34 | -10.46 | -43.34 |
| # improved | 11 | 12 | 18 | 12 | 9 | 6 | 68 |
| # equal | 62 | 40 | 19 | 18 | 8 | 7 | 154 |
| # worse | 23 | 44 | 59 | 66 | 79 | 82 | 354 |
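The reinforcement scheme listed under "Extending PBSS for precedence constraints" for valid permutations can be written as a small function. This is a sketch under stated assumptions: the name `reward` is hypothetical, a missing schedule is represented by `None`, and treating the very first valid schedule (no best makespan yet) as an improvement is an assumption that the slides do not state. The makespan values in the usage lines are made up for illustration.

```python
def reward(ms, ms_best):
    """Reinforcement r(t) for a valid job permutation.

    ms      -- makespan of the new schedule, or None if no valid schedule was found
    ms_best -- best makespan so far, or None if there is none yet (assumption)
    """
    if ms is None:
        return 0.0                    # no valid schedule found
    if ms_best is None or ms < ms_best:
        return 1.0                    # improved
    if ms == ms_best:
        return 0.5                    # equally good
    return ms_best / (2.0 * ms)       # worse: always below 1/2


# Hypothetical makespans, for illustration only.
print(reward(90, 100))    # improved
print(reward(100, 100))   # equally good
print(reward(125, 100))   # worse
print(reward(None, 100))  # no valid schedule
```

Because $ms > ms_{best}$ in the "worse" case, the ratio $ms_{best} / (2\,ms)$ stays strictly below 1/2, so a worse schedule is always rewarded less than an equally good one, and the reward shrinks as the makespan grows.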
Results and Discussion. Contributions: an extension of PBSS for learning permutations with precedence constraints; a simple model combined with an RL approach can yield good-quality results for challenging HFFSP instances. Discussion & future work: precedence relations do make the problem harder; parameter tuning; convergence; larger instances (50, 100 jobs); exploring possibilities for improvement in machine assignment.
Thank you! Questions? bert.vanvreckem@hogent.be http://www.slideshare.net/bertvanvreckem/ </p>
