Thesis etd-0811117-165636: Detailed Record
Title
Potential Forecast Algorithm: A Novel Search-Experience-based Clustering Algorithm
Department
Year, semester
Language
Degree
Number of pages
95
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2017-08-28
Date of Submission
2017-09-11
Keywords
potential forecast algorithm, metaheuristic, data mining, clustering, k-means
Statistics
The thesis/dissertation has been viewed 5685 times and downloaded 0 times.
Abstract
Clustering is a classical problem and a valuable research topic because it arises in many fields, such as engineering, computer science, medical science, and economics, and it is widely used as an initial stage of data analysis. Many clustering algorithms are highly sensitive to the initial solution or easily fall into local optima, which makes the quality of the end result unstable. We therefore propose a search-experience-based algorithm called the potential forecast algorithm (PFA). The underlying idea of the proposed algorithm is to analyze past search information to forecast the positions that are likely to yield better solutions, and to use k-means as a local-search mechanism to improve the quality of the end result. To evaluate the performance of PFA, we compare it with other state-of-the-art metaheuristic algorithms, and we further test and analyze the influence of each of its parameters. The simulation results indicate that PFA provides not only better solutions but also a more stable result quality.
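The abstract names k-means as PFA's local-search mechanism: a candidate solution (a set of cluster centroids) is handed to a few Lloyd-style k-means iterations for refinement, and solutions are compared by the usual sum-of-squared-errors objective. The thesis's actual implementation is in Chapter 3 and Appendix A; the sketch below is only an illustration of that refinement step, with hypothetical helper names (`kmeans_refine`, `sse`) not taken from the thesis:

```python
import numpy as np

def kmeans_refine(points, centroids, iterations=10):
    """Run a few Lloyd's k-means updates, refining a candidate solution
    (a set of centroids) as a local-search step."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float).copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for k in range(len(centroids)):
            members = points[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
    return centroids, labels

def sse(points, centroids, labels):
    """Sum of squared errors: the clustering objective being minimized."""
    points = np.asarray(points, dtype=float)
    return float(((points - centroids[labels]) ** 2).sum())
```

In a metaheuristic of this kind, each position proposed by the global search would be passed through `kmeans_refine` before its `sse` is evaluated, so that candidates are always compared after local refinement.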
Table of Contents
Thesis Approval Certificate
Acknowledgments
Abstract (in Chinese)
Abstract (in English)
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Organization of the Thesis
Chapter 2 Related Work
2.1 The Clustering Problem
2.1.1 Data Clustering
2.1.2 Partitional Clustering
2.2 Evolutionary and Clustering Algorithms
2.2.1 k-means Algorithm
2.2.2 Genetic k-Means Algorithm
2.2.3 Particle Swarm Optimization
2.2.4 Firefly Algorithm and k-means
2.2.5 Black Hole Algorithm
2.3 Summary
Chapter 3 Potential Forecast Algorithm
3.1 Design Concept of the Algorithm
3.2 Algorithm Framework and Process
3.3 Solving the Clustering Problem with the Potential Forecast Algorithm
3.4 Time Complexity Analysis of the Potential Forecast Algorithm
3.5 Summary
Chapter 4 Experimental Results
4.1 Experimental Environment and Parameter Settings
4.2 Simulation Experiments and Analysis
4.3 Parameter Analysis
4.3.1 Analysis of Convergence Experiments
4.3.1.1 Analysis of the Storage Set Size
4.3.1.2 Analysis of the Number of Groups
4.3.1.3 Analysis of the Number of k-means Iterations
4.3.2 Analysis of Non-convergence Experiments
4.3.2.1 Analysis of the Storage Set Size
4.3.2.2 Analysis of the Number of Groups
4.3.2.3 Analysis of the Number of k-means Iterations
4.4 Summary
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
Bibliography
Appendix A Complete Pseudocode of the Potential Forecast Algorithm
Appendix B Analysis of the Simulation Results of the Algorithms
B.1 Abalone
B.2 Balance Scale
B.3 Ecoli
B.4 Haberman's Survival
B.5 Iris
B.6 Letter Recognition
B.7 Liver Disorders
B.8 SPECT Heart
B.9 SPECTF Heart
B.10 Wine
B.11 Yeast
B.12 Statlog (Shuttle)
References
[1] R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, Machine learning: An artificial intelligence approach. Springer Science & Business Media, 2013.
[2] P. Berkhin, “A survey of clustering data mining techniques,” in Grouping multidimensional data. Springer, 2006, pp. 25–71.
[3] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann, 2016.
[4] J. S. Ahlquist and C. Breunig, “Model-based clustering and typologies in the social sciences,” Political Analysis, vol. 20, no. 1, pp. 92–112, 2012.
[5] R. Xu and D. Wunsch, “Survey of clustering algorithms,” IEEE Transactions on neural networks, vol. 16, no. 3, pp. 645–678, 2005.
[6] B. Everitt, S. Landau, and M. Leese, Cluster Analysis. Arnold, London, 2001.
[7] J. A. Hartigan, Clustering Algorithms. John Wiley & Sons, Inc., 1975.
[8] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: a review,” ACM computing surveys (CSUR), vol. 31, no. 3, pp. 264–323, 1999.
[9] A. K. Jain, “Data clustering: 50 years beyond K-means,” Pattern recognition letters, vol. 31, no. 8, pp. 651–666, 2010.
[10] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol. 1, no. 14, 1967, pp. 281–297.
[11] H. Steinhaus, “Sur la division des corp materiels en parties,” Bull. Acad. Polon. Sci, vol. 1, no. 804, p. 801, 1956.
[12] L. Fortnow, “The status of the P versus NP problem,” Communications of the ACM, vol. 52, no. 9, pp. 78–86, 2009.
[13] L. G. Roberts, “Beyond Moore’s law: Internet growth trends,” Computer, vol. 33, no. 1, pp. 117–119, 2000.
[14] X.-S. Yang, Nature-inspired metaheuristic algorithms. Luniver press, 2010.
[15] J. H. Holland, Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press, 1992.
[16] D. E. Goldberg and J. H. Holland, “Genetic algorithms and machine learning,” Machine learning, vol. 3, no. 2, pp. 95–99, 1988.
[17] M. Dorigo, V. Maniezzo, and A. Colorni, “Ant system: optimization by a colony of cooperating agents,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 26, no. 1, pp. 29–41, 1996.
[18] M. Dorigo, M. Birattari, and T. Stutzle, “Ant colony optimization,” IEEE computational intelligence magazine, vol. 1, no. 4, pp. 28–39, 2006.
[19] J. Kennedy, “Particle swarm optimization,” in Encyclopedia of machine learning. Springer, 2011, pp. 760–766.
[20] R. Poli, J. Kennedy, and T. Blackwell, “Particle swarm optimization,” Swarm intelligence, vol. 1, no. 1, pp. 33–57, 2007.
[21] R. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory,” in Proceedings of the Sixth International Symposium on Micro Machine and Human Science, 1995, pp. 39–43.
[22] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of IEEE International Conference on Neural Networks, vol. 4, 1995, pp. 1942–1948.
[23] D. Karaboga and B. Basturk, “A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm,” Journal of global optimization, vol. 39, no. 3, pp. 459–471, 2007.
[24] D. Karaboga and B. Basturk, “On the performance of artificial bee colony (ABC) algorithm,” Applied soft computing, vol. 8, no. 1, pp. 687–697, 2008.
[25] D. Karaboga and B. Akay, “A comparative study of artificial bee colony algorithm,” Applied mathematics and computation, vol. 214, no. 1, pp. 108–132, 2009.
[26] X.-S. Yang and S. Deb, “Cuckoo search via Lévy flights,” in Proceedings of World Congress on Nature & Biologically Inspired Computing, 2009, pp. 210–214.
[27] X.-S. Yang and S. Deb, “Engineering optimisation by cuckoo search,” International Journal of Mathematical Modelling and Numerical Optimisation, vol. 1, no. 4, pp. 330–343, 2010.
[28] X.-S. Yang, “Firefly algorithms for multimodal optimization,” in Proceedings of International symposium on stochastic algorithms, 2009, pp. 169–178.
[29] X.-S. Yang, “Firefly algorithm, stochastic test functions and design optimisation,” International Journal of Bio-Inspired Computation, vol. 2, no. 2, pp. 78–84, 2010.
[30] F. W. Glover and G. A. Kochenberger, Handbook of metaheuristics. Springer Science & Business Media, 2006, vol. 57.
[31] S. Das, A. Abraham, and A. Konar, Metaheuristic clustering. Springer, 2009, vol. 178.
[32] A. K. Jain and R. C. Dubes, Algorithms for clustering data. Prentice-Hall, Inc., 1988.
[33] I. Guyon and A. Elisseeff, “An introduction to feature extraction,” Feature extraction, pp. 1–25, 2006.
[34] I. Guyon and A. Elisseeff, “An introduction to variable and feature selection,” J. Mach. Learn. Res., vol. 3, pp. 1157–1182, 2003.
[35] A. Jain and D. Zongker, “Feature selection: evaluation, application, and small sample performance,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 2, pp. 153–158, 1997.
[36] H. Liu and L. Yu, “Toward integrating feature selection algorithms for classification and clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.
[37] C.-W. Tsai, T.-Y. Lin, M.-C. Chiang, C.-S. Yang, and T.-P. Hong, “Continuous space pattern reduction for genetic clustering algorithm,” in Proceedings of the 14th annual conference companion on Genetic and evolutionary computation, 2012, pp. 1475–1476.
[38] M.-C. Chiang, C.-W. Tsai, and C.-S. Yang, “A time-efficient pattern reduction algorithm for k-means clustering,” Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.
[39] C.-W. Tsai, S.-P. Tseng, M.-C. Chiang, and C.-S. Yang, “A framework for accelerating metaheuristics via pattern reduction,” in Proceedings of the 12th annual conference on Genetic and evolutionary computation, 2010, pp. 293–294.
[40] C.-W. Tsai, C.-Y. Lee, M.-C. Chiang, and C.-S. Yang, “A fast VQ codebook generation algorithm via pattern reduction,” Pattern Recognition Letters, vol. 30, no. 7, pp. 653–660, 2009.
[41] S.-P. Tseng, C.-W. Tsai, M.-C. Chiang, and C.-S. Yang, “Fast genetic algorithm based on pattern reduction,” in Proceedings of IEEE International Conference on Systems, Man and Cybernetics, 2008, pp. 214–219.
[42] C.-Y. Lee, C.-W. Tsai, M.-C. Chiang, and C.-S. Yang, “Fast VQ codebook generation via pattern reduction,” in Proceedings of IEEE International Conference on Systems, Man and Cybernetics, 2008, pp. 256–261.
[43] C.-W. Tsai, C.-S. Yang, and M.-C. Chiang, “A time efficient pattern reduction algorithm for k-means based clustering,” in Proceedings of IEEE International Conference on Systems, Man and Cybernetics, 2007, pp. 504–509.
[44] S. J. Nanda and G. Panda, “A survey on nature inspired metaheuristic algorithms for partitional clustering,” Swarm and Evolutionary computation, vol. 16, pp. 1–18, 2014.
[45] C. C. Aggarwal and C. K. Reddy, Data clustering: algorithms and applications. CRC press, 2013.
[46] K. Krishna and M. N. Murty, “Genetic K-means algorithm,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 29, no. 3, pp. 433–439, 1999.
[47] J. H. Holland, “Adaptation in natural and artificial systems. an introductory analysis with application to biology, control, and artificial intelligence,” University of Michigan Press, 1975.
[48] D. W. van der Merwe and A. P. Engelbrecht, “Data clustering using particle swarm optimization,” in Proceedings of Congress on Evolutionary Computation, vol. 1, 2003, pp. 215–220.
[49] T. Hassanzadeh and M. R. Meybodi, “A new hybrid approach for data clustering using firefly algorithm and K-means,” in Proceedings of 16th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP), 2012, pp. 7–11.
[50] A. Hatamlou, “Black hole: A new heuristic optimization approach for data clustering,” Information sciences, vol. 222, pp. 175–184, 2013.
[51] J. Zhang, K. Liu, Y. Tan, and X. He, “Random black hole particle swarm optimization and its application,” in Proceedings of International Conference on Neural Networks and Signal Processing, 2008, pp. 359–365.
[52] C. W. Tsai, “Search economics: A solution space and computing resource aware search method,” in Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, 2015, pp. 2555–2560.
[53] M. Lichman, “UCI machine learning repository,” 2013. [Online]. Available: http://archive.ics.uci.edu/ml
Fulltext
This electronic fulltext is licensed only for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined embargo period
Available:
Campus: not available (permanently restricted)
Off-campus: not available (permanently restricted)

Printed copies
Availability information for printed copies is relatively complete for the 102 academic year (ROC calendar) and later. To inquire about the availability of printed theses from the 101 academic year or earlier, please contact the printed thesis service counter of the library. We apologize for any inconvenience.
Available: not available (permanently restricted)