Title page for etd-0907112-210106
Title
一個新的自動決定群數多目標演化式分群演算法
A Novel Multiobjective EA-based Clustering Algorithm with Automatic Determination of the Number of Clusters
Department
Year, semester
Language
Degree
Number of pages
49
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2012-07-27
Date of Submission
2012-09-07
Keywords
diversity, multiobjective clustering, clustering
Statistics
The thesis/dissertation has been browsed 5701 times and downloaded 0 times.
Chinese Abstract
Determining the number of clusters in advance, without any knowledge of the data, is a difficult research issue for the data clustering problem. An efficient multiobjective evolutionary clustering algorithm is proposed to overcome this problem and to provide better clustering results. Unlike traditional evolutionary algorithms, which use a single crossover operator and a single mutation operator, the proposed algorithm uses more than one crossover and mutation operator; these operators are placed into a crossover candidate pool and a mutation candidate pool, respectively, from which one is drawn at random for each crossover or mutation operation to increase the search diversity of each generation. Finally, several well-known datasets are used to evaluate the performance of the proposed algorithm. The experimental results show that the proposed algorithm can not only automatically determine the number of clusters but can also provide better clustering results than the clustering algorithms compared in this study.
Abstract
Automatically determining the number of clusters without a priori knowledge is a difficult research issue for the data clustering problem. In this study, an effective clustering algorithm based on a multiobjective evolutionary algorithm is proposed to not only overcome this problem but also provide a better clustering result. The proposed algorithm differs from traditional evolutionary algorithms in that, instead of a single crossover operator and a single mutation operator, it uses a pool of crossover operators and a pool of mutation operators from which operators are selected at random to increase the search diversity. To evaluate the performance of the proposed algorithm, several well-known datasets are used. The simulation results show that not only can the proposed algorithm automatically determine the number of clusters, but it can also provide a better clustering result.
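The operator-pool mechanism described in the abstract can be sketched as follows. This is a minimal illustration only: the specific operators, the label-based chromosome encoding, and all function names here are assumptions for the sketch, not the operators actually used in the thesis.

```python
import random

# Illustrative chromosome encoding: a list of cluster labels, one per data point.

def one_point_crossover(p1, p2):
    # Split both parents at the same random point and swap the tails.
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def uniform_crossover(p1, p2):
    # Each gene is taken from either parent with equal probability.
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        if random.random() < 0.5:
            c1.append(a); c2.append(b)
        else:
            c1.append(b); c2.append(a)
    return c1, c2

def reset_mutation(chrom, n_clusters):
    # Reassign one randomly chosen gene to a random cluster label.
    chrom = chrom[:]
    chrom[random.randrange(len(chrom))] = random.randrange(n_clusters)
    return chrom

def swap_mutation(chrom, n_clusters):
    # Exchange the labels of two randomly chosen genes
    # (n_clusters is unused; kept so all pool members share one signature).
    i, j = random.sample(range(len(chrom)), 2)
    chrom = chrom[:]
    chrom[i], chrom[j] = chrom[j], chrom[i]
    return chrom

# The candidate pools: one operator is drawn uniformly at random from each
# pool for every crossover/mutation operation, as the abstract describes.
CROSSOVER_POOL = [one_point_crossover, uniform_crossover]
MUTATION_POOL = [reset_mutation, swap_mutation]

def reproduce(p1, p2, n_clusters):
    crossover = random.choice(CROSSOVER_POOL)
    mutation = random.choice(MUTATION_POOL)
    c1, c2 = crossover(p1, p2)
    return mutation(c1, n_clusters), mutation(c2, n_clusters)
```

Because a different operator pair may be drawn in every generation, offspring are produced in several qualitatively different ways over a run, which is the source of the extra search diversity the abstract refers to.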
Table of Contents
Thesis certification
Acknowledgments
Chinese abstract
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Organization of the thesis
Chapter 2 Literature review
2.1 The clustering problem
2.2 Multiobjective optimization problems
2.2.1 The concept of domination
2.2.2 Pareto optimality
2.3 Multiobjective optimization algorithms
2.3.1 NSGA-II
2.3.2 PESA-II
2.4 Multiobjective clustering algorithms
2.4.1 MOCK
2.4.2 VRJGGA
2.4.3 VAMOSA
2.5 Summary
Chapter 3 A novel multiobjective evolutionary clustering algorithm with automatic determination of the number of clusters
3.1 Multiobjective clustering algorithm with multiple information exchanges
3.1.1 Chromosome representation
3.1.2 Initialization
3.1.3 Objective functions
3.1.4 Mutation operators with multiple information exchanges
3.1.5 Crossover operators with multiple information exchanges
3.1.6 Final solution selection
Chapter 4 Experimental results
4.1 Environment, parameter settings, and datasets
4.2 Experimental results
4.2.1 Clustering quality
4.2.2 Number of correct clusterings
Chapter 5 Conclusions and future work
Bibliography
References
[1] G. Piatetsky-Shapiro and W. J. Frawley, eds., Knowledge Discovery in Databases. AAAI/MIT Press, 1991.
[2] G. B. Coleman and H. C. Andrews, “Image segmentation by clustering,” Proceedings of the IEEE, vol. 67, no. 5, pp. 773–785, 1979.
[3] A. K. Jain, R. P. W. Duin, and J. Mao, “Statistical pattern recognition: A review,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4–37, 2000.
[4] D. A. Lelewer and D. S. Hirschberg, “Data compression,” ACM Computing Surveys, vol. 19, no. 3, pp. 261–296, 1987.
[5] A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: A review,” ACM Computing Surveys, vol. 31, no. 3, pp. 264–323, 1999.
[6] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data. Englewood Cliffs, New Jersey: Prentice Hall, 1988.
[7] J. B. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, 1967.
[8] C. Blum and A. Roli, “Metaheuristics in combinatorial optimization: Overview and conceptual comparison,” ACM Computing Surveys, vol. 35, no. 3, pp. 268–308, 2003.
[9] K. S. Al-Sultan, “A tabu search approach to the clustering problem,” Pattern Recognition, vol. 28, no. 9, pp. 1443–1451, 1995.
[10] S. Z. Selim and K. Alsultan, “A simulated annealing algorithm for the clustering problem,” Pattern Recognition, vol. 24, no. 10, pp. 1003–1008, 1991.
[11] D. W. van der Merwe and A. P. Engelbrecht, “Data clustering using particle swarm optimization,” in IEEE Congress on Evolutionary Computation (1), pp. 215–220, 2003.
[12] P. S. Shelokar, V. K. Jayaraman, and B. D. Kulkarni, “An ant colony approach for clustering,” Analytica Chimica Acta, vol. 509, no. 2, pp. 187–195, 2004.
[13] E. R. Hruschka, R. J. G. B. Campello, A. A. Freitas, and A. C. P. L. F. de Carvalho, “A survey of evolutionary algorithms for clustering,” IEEE Transactions on Systems, Man, and Cybernetics, Part C, vol. 39, no. 2, pp. 133–155, 2009.
[14] M. Steinbach, G. Karypis, and V. Kumar, “A comparison of document clustering techniques,” in Proceedings of KDD Workshop on Text Mining, pp. 1–2, 2000.
[15] J. Handl and J. D. Knowles, “An evolutionary approach to multiobjective clustering,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 1, pp. 56–76, 2007.
[16] P. K. Prasad and C. P. Rangan, “Privacy preserving birch algorithm for clustering over arbitrarily partitioned databases,” in Advanced Data Mining and Applications, pp. 146–157, 2007.
[17] K. Deb, ed., Multi-Objective Optimization using Evolutionary Algorithms. Chichester, UK: John Wiley, 2001.
[18] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, “A fast elitist non-dominated sorting genetic algorithm for multi-objective optimisation: NSGA-II,” in Parallel Problem Solving from Nature, pp. 849–858, 2000.
[19] D. W. Corne, N. R. Jerram, J. D. Knowles, and M. J. Oates, “PESA-II: Region-based selection in evolutionary multiobjective optimization,” in Proceedings of the Genetic and Evolutionary Computation Conference, pp. 283–290, 2001.
[20] S. Bandyopadhyay, S. Saha, U. Maulik, and K. Deb, “A simulated annealing-based multiobjective optimization algorithm: AMOSA,” IEEE Transactions on Evolutionary Computation, vol. 12, no. 3, pp. 269–283, 2008.
[21] C. A. C. Coello and M. S. Lechuga, “MOPSO: A proposal for multiple objective particle swarm optimization,” IEEE Proceedings World Congress on Computational Intelligence, vol. 2, pp. 1051–1056, 2002.
[22] C. García-Martínez, O. Cordón, and F. Herrera, “An empirical analysis of multiple objective ant colony optimization algorithms for the bi-criteria TSP,” in ANTS Workshop, pp. 61–72, 2004.
[23] U. Maulik and S. Bandyopadhyay, “Pareto-based multi-objective differential evolution,” Proceedings of the Congress on Evolutionary Computation, vol. 2, pp. 862–869, 2003.
[24] N. Srinivas and K. Deb, “Multiobjective optimization using nondominated sorting in genetic algorithms,” Evolutionary Computation, vol. 2, no. 3, pp. 221–248, 1994.
[25] K. S. N. Ripon, C. H. Tsang, and S. Kwong, “Multi-objective data clustering using variable-length real jumping genes genetic algorithm and local search method,” in International Joint Conference on Neural Networks, pp. 3609–3616, 2006.
[26] S. Saha and S. Bandyopadhyay, “A symmetry based multiobjective clustering technique for automatic evolution of clusters,” Pattern Recognition, vol. 43, no. 3, pp. 738–751, 2010.
[27] X. L. Xie and G. Beni, “A validity measure for fuzzy clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, pp. 841–847, 1991.
[28] S. Bandyopadhyay and S. Saha, “A point symmetry-based clustering technique for automatic evolution of clusters,” IEEE Transactions on Knowledge and Data Engineering, vol. 20, pp. 1–17, 2008.
[29] S. Saha and S. Bandyopadhyay, “Application of a new symmetry-based cluster validity index for satellite image segmentation,” IEEE Geoscience and Remote Sensing Letters, vol. 5, pp. 166–170, 2008.
[30] G. Syswerda, “Uniform crossover in genetic algorithms,” in Proceedings of the Third International Conference on Genetic Algorithms, pp. 2–9, 1989.
[31] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, “Validity index for crisp and fuzzy clusters,” Pattern Recognition, vol. 37, no. 3, pp. 487–501, 2004.
[32] D. L. Davies and D. W. Bouldin, “A cluster separation measure,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, pp. 224–227, 1979.
[33] UCI Machine Learning Repository, 2012. Available at http://archive.ics.uci.edu/ml/.
Fulltext
This electronic fulltext is authorized for personal, non-profit retrieval, reading, and printing by users for academic research purposes only. Please observe the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: availability period defined by the author
Available:
On campus: never available
Off campus: never available

Printed copies
Information on the public availability of printed copies is relatively complete for academic year 102 (2013) and later. To check the availability of printed copies from academic year 101 (2012) or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: already public
