Thesis/Dissertation etd-0203109-233634: Detailed Record
Title: 以模糊集合理論改善支持向量機之增量學習演算法
(Enhancement of Incremental Learning Algorithm for Support Vector Machines Using Fuzzy Set Theory)
Department:
Year, semester:
Language:
Degree:
Number of pages: 76
Author:
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2008-07-29
Date of Submission: 2009-02-03
Keywords: Fuzzy Set Theory, Classification, SVM, Incremental Learning
Statistics: This thesis/dissertation has been viewed 5680 times and downloaded 0 times.
Abstract (Chinese)
In recent years there has been considerable research on Support Vector Machines (SVMs). SVMs are widely applied in many domains and achieve good classification (prediction) rates, but when a dataset contains a large amount of data they require long computation time and large amounts of memory. To reduce the computational complexity, researchers have proposed incremental learning algorithms to handle the data awaiting training. Some studies have pointed out that certain non-support-vector examples near the hyperplane can contribute to the learning process. This study therefore improves on N. A. Syed's method and proposes three new incremental learning algorithms: Mixed Incremental Learning (MIL), Half-Mixed Incremental Learning (HMIL), and Partition Incremental Learning (PIL). Fuzzy set theory is then incorporated to assist in classifying the test data, in the expectation of better classification accuracy. The experiments examine how three test methods and the number of learning iterations affect the classification results, and measure accuracy on five different datasets. The results show that, compared with the simulation results of other incremental or active learning algorithms, MIL consistently provides good classification accuracy, while HMIL and PIL achieve further performance improvements specifically on datasets for which other studies have already reported high accuracy.
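As a generic illustration of the fuzzy-set idea mentioned above (graded rather than crisp class membership, following Zadeh's fuzzy sets), the sketch below computes fuzzy class memberships of a test point from its distances to class centroids. This is only an assumed, fuzzy c-means style construction for illustration; the thesis's actual membership functions are not reproduced here, and the names `fuzzy_memberships`, `centroids`, and `m` are illustrative.

```python
import numpy as np

def fuzzy_memberships(x, centroids, m=2.0):
    """Graded (fuzzy) membership of point x in each class.

    Memberships are derived from inverse distances to per-class centroids,
    in the style of fuzzy c-means; m > 1 is the fuzzifier. This is a
    generic sketch, not the thesis's specific method.
    """
    d = np.array([np.linalg.norm(x - c) for c in centroids])
    d = np.maximum(d, 1e-12)              # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))         # closer centroid -> larger weight
    return inv / inv.sum()                # memberships sum to 1
```

A test point near one centroid then receives a membership close to 1 for that class and close to 0 for the others, which can be used to weight or break ties in the final classification decision.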
Abstract
Over the past few years, a considerable number of studies have applied Support Vector Machines (SVMs) in many domains to improve classification or prediction. However, SVMs require long computation time and large amounts of memory when the datasets are large. Incremental learning techniques are one solution developed to reduce the computational complexity of this scalability problem, but few studies have considered that some examples close to the decision hyperplane, other than the support vectors (SVs), might contribute to the learning process. Consequently, we propose three novel algorithms, named Mixed Incremental Learning (MIL), Half-Mixed Incremental Learning (HMIL), and Partition Incremental Learning (PIL), which improve Syed's incremental learning method using fuzzy set theory; we expect them to achieve better accuracy than existing methods. In the experiments, the proposed algorithms are evaluated on five standard machine learning benchmark datasets to demonstrate their effectiveness. Experimental results show that MIL achieves superior classification accuracy compared with other incremental or active learning algorithms. In particular, on datasets for which other research reports already achieve high accuracy, HMIL and PIL improve the performance even further.
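The baseline the abstract builds on is Syed's incremental SVM learning: train on one batch, keep only the support vectors, merge them with the next batch, and retrain. The sketch below illustrates that baseline idea only, not the thesis's MIL/HMIL/PIL variants; it assumes a simple Pegasos-style linear SVM and approximates "support vectors" as the examples satisfying the margin condition y·f(x) ≤ 1. All function names and parameters here are illustrative.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Tiny linear SVM via Pegasos-style sub-gradient descent (bias folded in)."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append constant bias feature
    w = np.zeros(Xb.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(Xb)):
            t += 1
            eta = 1.0 / (lam * t)               # decaying step size
            w *= (1.0 - eta * lam)              # regularization shrink
            if y[i] * (Xb[i] @ w) < 1:          # hinge-loss sub-gradient step
                w += eta * y[i] * Xb[i]
    return w

def support_mask(X, y, w, tol=1.0):
    """Examples on or inside the margin (y * f(x) <= tol) approximate the SVs."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return y * (Xb @ w) <= tol

def incremental_train(batches, lam=0.01):
    """Syed-style incremental learning: after each batch, retain only the
    (approximate) support vectors and carry them into the next batch."""
    d = batches[0][0].shape[1]
    keep_X, keep_y = np.empty((0, d)), np.empty(0)
    w = None
    for Xc, yc in batches:
        X = np.vstack([keep_X, Xc])
        y = np.concatenate([keep_y, yc])
        w = train_linear_svm(X, y, lam)
        m = support_mask(X, y, w)
        keep_X, keep_y = X[m], y[m]             # only SVs survive the batch
    return w

def predict(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.sign(Xb @ w)
```

Because only the retained support vectors (rather than all past data) enter each retraining, the working set stays small, which is the memory and time saving the incremental approach targets.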
目次 Table of Contents
Chapter 1 Introduction 1
Chapter 2 Literature Reviews 3
2.1 Support Vector Machines 3
2.2 Fuzzy Set Theory 6
2.3 Related Work 15
Chapter 3 The Proposed Methods 18
3.1 Incremental Training 20
3.1.1 Mixed Incremental Training 20
3.1.2 Half-Mixed Incremental Training 23
3.1.3 Partitional Incremental Training 25
3.2 Extension Based on Fuzzy Set Theory 27
3.3 Test Methods 30
Chapter 4 Simulation 33
4.1 Datasets 34
4.2 Simulation 1 - Comparison of Proposed Test Methods with 10-Fold Cross Validation 36
4.3 Simulation 2 - Analysis on the Number of Iterations 40
4.4 Simulation 3 - Evaluation of Candidate Examples 51
4.5 Simulation 4 - Performance Measure of Learning Algorithms 58
Chapter 5 Conclusion 65
References 67
References
[1] C. Campbell, N. Cristianini, A. Smola, Query learning with large margin classifiers, Proceedings of the 17th International Conference on Machine Learning, Stanford University, CA, June 2000, pp. 111-118.
[2] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[3] C. Cheng, F. Y. Shih, An improved incremental training algorithm for support vector machines using active query, Pattern Recognition 40(3) (2007) 964-971.
[4] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (1995) 273-297.
[5] H. Drucker, D. Wu, V. N. Vapnik, Support vector machines for spam categorization, IEEE Trans. Neural Networks 10(5) (1999) 1048-1054.
[6] R.-E. Fan, P.-H. Chen, C.-J. Lin, Working set selection using second order information for training support vector machines, Journal of Machine Learning Research 6 (2005) 1889-1918.
[7] J.-S. R. Jang, C.-T. Sun, E. Mizutani, Neuro-Fuzzy and Soft Computing, Prentice-Hall, Upper Saddle River, NJ, USA, 1997.
[8] T. Joachims, Text categorization with support vector machines: learning with many relevant features, Proceedings of the European Conference on Machine Learning (ECML), Springer, 1998.
[9] P. Mitra, C. A. Murthy, S. K. Pal, A probabilistic active support vector learning algorithm, IEEE Trans. Pattern Anal. Mach. Intell. 26(3) (2004) 413-418.
[10] D. J. Newman, S. Hettich, C. L. Blake, C. J. Merz, UCI Repository of Machine Learning Databases [http://www.ics.uci.edu/~mlearn/MLRepository.html], University of California, Department of Information and Computer Science, Irvine, CA, 1998.
[11] H. T. Nguyen, N. R. Prasad, C. L. Walker, E. A. Walker, A First Course in Fuzzy and Neural Control, Chapman & Hall/CRC, Boca Raton, FL, 2003.
[12] H. T. Nguyen, A. Smeulders, Active learning using pre-clustering, Proceedings of the 21st International Conference on Machine Learning, Alberta, Canada, July 2004.
[13] E. Osuna, R. Freund, F. Girosi, An improved training algorithm for support vector machines, Proceedings of IEEE NNSP'97, Amelia Island, FL, 1997.
[14] G. Schohn, D. Cohn, Less is more: active learning with support vector machines, Proceedings of the 17th International Conference on Machine Learning, Stanford University, CA, June 2000, pp. 839-846.
[15] N. A. Syed, H. Liu, K. K. Sung, Incremental learning with support vector machines, Proceedings of the International Joint Conference on Artificial Intelligence, Stockholm, Sweden, July 1999.
[16] S. Tong, D. Koller, Support vector machine active learning with applications to text classification, Journal of Machine Learning Research 2 (2001) 45-66.
[17] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[18] L. A. Zadeh, Fuzzy sets, Information and Control 8 (1965) 338-353.
Fulltext
The electronic fulltext is licensed only for individual, non-profit retrieval, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it, so as to avoid violating the law.
Thesis access permission: not available on campus or off campus
Available:
Campus: not available (permanently restricted)
Off-campus: not available (permanently restricted)

Printed copies
Public-access information for printed theses is relatively complete only from academic year 102 onward. To check the public-access status of a printed thesis from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: released
