論文使用權限 Thesis access permission:自定論文開放時間 user-defined
開放時間 Available:
校內 Campus:永不公開 not available
校外 Off-campus:永不公開 not available
論文名稱 Title |
彈性網路演算法解決自動分群問題 An Elastic Net Algorithm for Automatic Clustering |
系所名稱 Department |
畢業學年期 Year, semester |
語文別 Language |
學位類別 Degree |
頁數 Number of pages |
52 |
研究生 Author |
指導教授 Advisor |
召集委員 Convenor |
口試委員 Advisory Committee |
口試日期 Date of Exam |
2014-06-26 |
繳交日期 Date of Submission |
2014-08-20 |
關鍵字 Keywords |
分群、彈性網路分群演算法、分群數目、線性不可分割之資料、自動分群 automatic clustering, number of clusters, clustering, elastic net clustering algorithm, non-linearly separable data |
統計 Statistics |
本論文已被瀏覽 5691 次,被下載 0 次 This thesis has been viewed 5691 times and downloaded 0 times. |
中文摘要 |
分群為分析未知資料的重要工具之一,並且在各個不同的領域中,扮演著相當重要的角色。即使目前已有許多學者致力於研究分群問題,其中仍有幾個困難的議題尚未完全地解決。本文提出一個以彈性網路分群演算法為基礎的新型演算法,解決這之中兩項重要的研究議題,分別為:對線性不可分割之資料分群,以及自動決定資料的分群數目。為了評估本文所提之演算法的效能,本研究使用了數個較常見的資料集,並將此演算法與其他相關演算法進行比較。實驗結果顯示,此演算法不僅能找出合適的分群數目,且對於大多數的測試資料,也能得到較高的準確率。 |
Abstract |
Clustering plays a vital role in many disciplines because it is an important tool for analyzing a set of unknown input patterns. However, some important issues related to clustering, such as automatically determining the number of clusters and partitioning non-linearly separable data, have never been fully solved even though many researchers have worked on them for a long time. This thesis therefore presents a novel method based on the elastic net clustering algorithm to deal with exactly these two issues: partitioning non-linearly separable data and automatically determining the number of clusters. To evaluate the performance of the proposed algorithm, we compare it with several state-of-the-art methods on several well-known datasets. The experimental results show that the proposed algorithm not only finds the appropriate number of clusters but also provides a higher accuracy rate than all the other methods compared in this study for most datasets. |
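To make the idea of automatically determining the number of clusters concrete, the following minimal sketch scores several candidate cluster counts with the Davies-Bouldin index (one of the validity indices listed in the thesis's table of contents) and keeps the count with the lowest score. It is not the elastic net algorithm proposed in the thesis: the plain k-means routine with farthest-first initialization, every function name, and the toy data are assumptions introduced purely for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def assign(points, centers):
    """Put every point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, max_iters=100):
    """Plain k-means (an illustrative stand-in, not the thesis's method)."""
    # Deterministic farthest-first initialization: start from the first
    # point, then repeatedly add the point farthest from all chosen centers.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(max_iters):
        clusters = assign(points, centers)
        # An empty cluster keeps its previous center.
        updated = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    # Return only the non-empty clusters of the final assignment.
    return [c for c in assign(points, centers) if c]


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition; lower values are better."""
    k = len(clusters)
    if k < 2:
        return math.inf
    cents = [centroid(c) for c in clusters]
    # Scatter: mean distance from each cluster's points to its centroid.
    scatter = [sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, cents)]
    worst = [
        max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


def choose_k(points, candidates):
    """Return the candidate cluster count with the lowest index value."""
    return min(candidates, key=lambda k: davies_bouldin(kmeans(points, k)))


# Toy data (hypothetical): three well-separated 3x3 grids of points.
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
offsets = (-0.5, 0.0, 0.5)
points = [(cx + dx, cy + dy) for cx, cy in blobs for dx in offsets for dy in offsets]

best_k = choose_k(points, range(2, 7))
print(best_k)
```

On this toy data the three grids are far apart relative to their size, so the index is lowest when each grid forms its own cluster. Because k-means draws linear boundaries between centers, a stand-in like this one cannot handle the non-linearly separable data that the thesis targets.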
目次 Table of Contents |
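Thesis approval form
Acknowledgements
Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Thesis organization
Chapter 2 Literature review
2.1 The data clustering problem
2.2 Elastic net related algorithms
2.2.1 Elastic net algorithm
2.2.2 Elastic net clustering algorithm
2.3 Evolutionary computation for clustering problems
2.3.1 Differential evolution for the clustering problem
2.3.2 Genetic algorithm for the automatic clustering problem
2.3.3 Differential evolution for the automatic clustering problem
2.4 Clustering indices
2.4.1 Davies-Bouldin index
2.4.2 PBM-index
2.5 Summary
Chapter 3 An elastic net algorithm for the automatic clustering problem
3.1 Concept of the algorithm
3.2 Algorithm procedure
3.2.1 Elastic net algorithm
3.2.2 Split
3.2.3 Merge
3.2.4 Extraction
3.3 Example
Chapter 4 Experimental results
4.1 Execution environment, parameter settings, and datasets
4.2 Simulation results
4.2.1 Evaluating the correct number of clusters
4.2.2 Evaluating the accuracy of data point assignment
4.3 Analysis
4.4 Summary
Chapter 5 Conclusion and future work
5.1 Conclusion
5.2 Future work
Bibliography |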
電子全文 Fulltext |
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law. |
紙本論文 Printed copies |
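Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services (圖資處). We apologize for any inconvenience. 開放時間 Available: 永不公開 not available |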