National Sun Yat-sen University (國立中山大學), thesis/dissertation, Parallel Coral Reef Algorithm for Solving JSP on Spark

Title	Parallel Coral Reef Algorithm for Solving JSP on Spark (於Spark上實作平行化珊瑚礁演算法解工作排程問題)
Department	Department of Computer Science and Engineering
Year, semester	Fall semester of Academic Year 105	Language	Chinese
Degree	Master	Number of pages	63
Author	Heng-tzu Chang (張恒慈)
Advisor	Ming-Chao Chiang (江明朝)
Convenor	Chung-Nan Lee (李宗南)
Advisory Committee	Wei-Kuang Lai (賴威光); Chun-Wei Tsai (蔡崇煒)
Date of Exam	2016-11-04	Date of Submission	2017-01-16
Keywords	MapReduce, Spark, coral reef optimization algorithm, job shop scheduling problem, metaheuristic algorithms

Abstract
Most metaheuristic algorithms were not designed with distributed computing environments in mind, so solving large-scale, complex optimization problems with them usually incurs a high computation cost. As a result, few successful results have been reported for real-time applications. With cloud computing platforms and environments now maturing, this study builds the coral reef optimization (CRO) algorithm on the MapReduce framework, distributing the computation tasks across the cluster nodes, and implements it on Spark to speed up the metaheuristic. In addition, the global and local search operators of CRO are modified to reduce the amount of data transmitted, so that the computation resources of each node are used more efficiently. The simulation results show that, for large-scale job shop scheduling problems (JSP), the proposed parallel CRO finds better schedules than the original CRO in a shorter time.
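The map-and-reduce structure described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the 3-job instance, the swap-based larva operator, the settlement rule, and every function name below are assumptions made for illustration, and plain Python `map` and `functools.reduce` stand in for the cluster. On Spark, the same shape could be expressed with `SparkContext.parallelize`, `RDD.map`, and `RDD.reduce`.

```python
import random
from functools import reduce

# Hypothetical 3-job x 3-machine instance: each job is a list of
# (machine, duration) operations that must run in the listed order.
JOBS = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3), (0, 1)],
]


def makespan(seq, jobs=JOBS):
    """Decode an operation-based sequence (each job id appears once per
    operation) into a schedule and return its completion time."""
    next_op = [0] * len(jobs)
    job_ready = [0] * len(jobs)
    machine_ready = {}
    for j in seq:
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
    return max(job_ready)


def random_coral(rng, jobs=JOBS):
    """Create one random candidate schedule (a 'coral')."""
    seq = [j for j, ops in enumerate(jobs) for _ in ops]
    rng.shuffle(seq)
    return seq


def evolve_partition(task):
    """Map step: evolve one sub-reef locally and return only its best
    (makespan, sequence) pair, which keeps the transmitted data small."""
    seed, reef, generations = task
    rng = random.Random(seed)
    reef = [list(coral) for coral in reef]
    for _ in range(generations):
        # Assumed larva operator: swap two positions of a random parent.
        larva = list(rng.choice(reef))
        a, b = rng.sample(range(len(larva)), 2)
        larva[a], larva[b] = larva[b], larva[a]
        # Assumed settlement rule: the larva takes a random reef slot
        # only if it beats the coral currently occupying it.
        slot = rng.randrange(len(reef))
        if makespan(larva) < makespan(reef[slot]):
            reef[slot] = larva
    best = min(reef, key=makespan)
    return makespan(best), best


def parallel_cro(num_partitions=4, reef_size=8, generations=200, seed=0):
    """Split the population into sub-reefs, evolve each one (map), and
    keep the overall best result (reduce)."""
    rng = random.Random(seed)
    tasks = [
        (seed + p + 1, [random_coral(rng) for _ in range(reef_size)], generations)
        for p in range(num_partitions)
    ]
    results = map(evolve_partition, tasks)  # would run on worker nodes
    return reduce(min, results)             # smallest makespan wins


best_makespan, best_seq = parallel_cro()
```

Returning only each partition's best coral is one simple way to limit communication between nodes; the operators actually used in the thesis may differ.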

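Table of Contents
Thesis Approval Form
Acknowledgments
Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Organization
Chapter 2 Related Work
  2.1 Cluster Computing
    2.1.1 MapReduce
    2.1.2 Apache Hadoop
    2.1.3 Apache Spark
      2.1.3.1 Spark on YARN
      2.1.3.2 Resilient Distributed Datasets
  2.2 Job Shop Scheduling Problem
  2.3 Genetic Algorithm
  2.4 Tabu Search
  2.5 Coral Reef Optimization Algorithm
      2.5.0.3 Algorithm Procedure
      2.5.0.4 Algorithm Simulation
  2.6 Conclusion
Chapter 3 Parallel Coral Reef Algorithm
  3.1 Design Concept
  3.2 Algorithm Procedure
    3.2.1 Algorithm Mechanisms
      3.2.1.1 Initialization
      3.2.1.2 Sexual Reproduction
      3.2.1.3 Asexual Reproduction
      3.2.1.4 Adaptation
    3.2.2 Key-Value Pairing
    3.2.3 Evolution Module
    3.2.4 Competition Operator
    3.2.5 Predation Operator
  3.3 Algorithm Example
    3.3.1 Initialization Example
    3.3.2 Parallel Coral Reef Algorithm Example
Chapter 4 Experimental Results
  4.1 Experimental Environment
  4.2 Simulation Results
    4.2.1 Analysis of the Parallel Coral Reef Algorithm and the Single-Machine Algorithm
      4.2.1.1 Dataset Scale Analysis
    4.2.2 Comparison of the Parallel Coral Reef Algorithm and the Parallel Genetic Algorithm
    4.2.3 Runtime Efficiency with Different Numbers of Compute Nodes
    4.2.4 Parallelization Load Ratio
  4.3 Summary
Chapter 5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work
Bibliography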


Fulltext
The electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined availability date. Campus: never available. Off-campus: never available.
Printed copies
Public information on printed copies is relatively complete from Academic Year 102 onward. To look up printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. Availability: never available.
