博碩士論文 etd-0728105-154002 詳細資訊
Title page for etd-0728105-154002
論文名稱
Title
建構依使用者喜好之演進式學習電子郵件分類器
Constructing an E-mail Classifier Based on User's Preferences with Adaptive Learning
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
50
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2005-07-22
繳交日期
Date of Submission
2005-07-28
關鍵字
Keywords
電子郵件、使用者喜好、演進式學習、分類器
classifier, E-mail, users’ preferences, adaptive learning
統計
Statistics
本論文已被瀏覽 5828 次,被下載 7
This thesis/dissertation has been browsed 5828 times and downloaded 7 times.
中文摘要
電子郵件是現代人最常使用的通訊工具,然而許多業者為了強迫促銷,便大量的傳送電子郵件以達到廣告宣傳的效果,造成了使用者的困擾。近來,許多學者提出利用資料探勘、人工智慧等技術來幫忙過濾電子郵件,以還給使用者一個乾淨清新的收信環境。
但是目前的方法尚有不足之處。例如分類器的建置、演進式學習能力以及客製化郵件處理等議題仍待探究。本研究的目的即在於建構一個具有演進式學習功能的分類器,能從錯誤中逐漸學習改正;並提出一個客製化郵件處理流程,針對每一個使用者之喜好而客製化處理其郵件,並且以所建構的分類器偵測使用者喜好改變,加以學習適應,以達到客製化電子郵件處理的目的。
本研究提出兩個實驗來驗證所建構的分類器效能,其實驗結果有相當高的準確率(Accuracy)和準確度(precision),最後以一個實際個案來說明如何應用所提的客製化郵件處理流程,並驗證所建構的分類器能偵測使用者喜好改變,並加以學習適應。這些實驗結果驗證了本研究所提方法的適用性。
Abstract
Electronic mail has become one of the most popular communication channels in the modern world. Because it is convenient and inexpensive, however, many businesses use it to promote their products by sending e-mails to as many people as they can reach, which is a nuisance for recipients who have no interest in them. As a result, much research has been devoted to filtering irrelevant e-mails with data mining techniques, in order to reduce the burden on users of processing the e-mails they receive.
Nevertheless, current approaches have drawbacks. Which classifiers are appropriate to construct, how to endow them with adaptive learning ability, and how to customize the e-mail management process for each user are still open issues. The objective of this research is therefore to construct an e-mail classifier that can learn to correct itself from its erroneous outcomes. Furthermore, we propose a customized e-mail management process that handles each user’s e-mails according to that user’s preferences. The classifier detects changes in a user’s preferences and adapts to them when handling e-mails.
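The idea of a classifier that corrects itself from its errors can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name the learning algorithm, so the sketch assumes a simple incremental naive Bayes model, and the class name `AdaptiveMailClassifier`, its methods, and the labels `"wanted"` and `"unwanted"` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class AdaptiveMailClassifier:
    """Toy incremental naive Bayes classifier (illustrative assumption only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of messages
        self.vocabulary = set()

    def learn(self, text, label):
        """Fold one labeled message into the running counts."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.label_counts[label] += 1
        self.vocabulary.update(words)

    def predict(self, text):
        """Return the label with the highest log-posterior score."""
        words = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)
            # Laplace smoothing keeps unseen words from zeroing the score.
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def correct(self, text, true_label):
        """Learn from user feedback only when the prediction was wrong."""
        if self.predict(text) != true_label:
            self.learn(text, true_label)
            return True
        return False


clf = AdaptiveMailClassifier()
clf.learn("cheap offer buy now", "unwanted")
clf.learn("project meeting agenda", "wanted")

# The user's preference changes: newsletters with offers are now wanted.
message = "weekly newsletter offer"
before = clf.predict(message)
clf.correct(message, "wanted")
after = clf.predict(message)
```

In this sketch the user's correction is the only signal of a preference change; a misclassified message is folded back into the model so that later, similar messages follow the new preference.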
Two experiments are conducted to verify the performance of the constructed classifier. The results show that the proposed classifier achieves high accuracy and high precision, with strong adaptive learning ability. We also illustrate a real application of the customized e-mail management process, which shows that our approach can detect changes in users’ preferences and learn to follow them. These results support the feasibility of our approach to constructing e-mail classifiers.
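For readers unfamiliar with the two reported measures, the following sketch shows how accuracy and precision are conventionally computed for a two-class mail filter. The function name, the label strings, and the sample data are hypothetical and are not taken from the thesis's experiments.

```python
def accuracy_and_precision(true_labels, predicted_labels, positive="unwanted"):
    """Return (accuracy, precision) with `positive` as the class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    # Accuracy: share of all messages whose predicted label is correct.
    correct = sum(1 for t, p in pairs if t == p)
    # Precision: share of messages flagged as positive that truly are positive.
    true_pos = sum(1 for t, p in pairs if p == positive and t == positive)
    pred_pos = sum(1 for _, p in pairs if p == positive)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = true_pos / pred_pos if pred_pos else 0.0
    return accuracy, precision


# Made-up labels for five messages, for illustration only.
truth = ["unwanted", "unwanted", "wanted", "wanted", "wanted"]
guess = ["unwanted", "wanted", "wanted", "unwanted", "wanted"]
acc, prec = accuracy_and_precision(truth, guess)
```

Precision matters for a mail filter because a low value means that wanted messages are being flagged as unwanted, which is usually the more costly error for the user.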
目次 Table of Contents
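Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Objectives 2
1.3 Thesis Organization 3
Chapter 2 Literature Review 4
2.1 Text Mining 4
2.2 Learning-Based Classification Methods 5
2.2.1 Naïve Bayes Classifier 6
2.2.2 K-Nearest Neighbor (KNN) 6
2.2.3 Decision Tree Classification 7
2.2.4 Support Vector Machine (SVM) 8
2.3 Related Work on E-mail Filtering 10
2.3.1 E-mail Filtering Methods 10
2.3.2 E-mail Filtering Systems 11
2.3.3 Intelligent Mail Agents 12
Chapter 3 E-mail Filtering Method 15
3.1 E-mail Classifier 15
3.1.1 Classifier Construction 15
3.1.2 Classifier Selection 17
3.2 Adaptive Learning of the Classifier 19
3.3 Customized E-mail Management Process 22
3.3.1 Server-Side Agent 22
3.3.2 Client-Side Agent 23
3.3.3 Agent Interaction for Adaptive Learning 25
3.3.4 Training the Initial Classifier 25
Chapter 4 Experiments and Results 27
4.1 Experimental Design 27
4.2 Experiment 1 29
4.3 Experiment 2 32
4.4 A Case of Customized E-mail Management 36
Chapter 5 Conclusion 40
References 42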
參考文獻 References
1. Androutsopoulos, I., Koutsias, J., Chandrinos, K. V., Paliouras, G., and Spyropoulos, C. D. (2000). An evaluation of Naive Bayesian anti-spam filtering. In Proceedings of the workshop on Machine Learning in the New Information Age, pp. 9-17.
2. Bernard, R. (1996). The Corporate Intranet: Create and Manage an Internal Web for your Organization, Wiley.
3. Boser, B.E., Guyon, I.M. and Vapnik, V. (1992). A training algorithm for optimal margin classifiers. In Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, ACM.
4. Carreras, X., and Marquez, L. (2001). Boosting trees for anti-spam e-mail filtering. In Proceedings of the 3rd Conference on Recent Advances on NLP, RANLP '01.
5. Chang-Jiun Tsai, Shian-Shyong Tseng, and Her-Tsaan Cheng. (1999). An Intelligent E-mail Management System. SMC’99 Conference, Tokyo, Japan, Oct., ICCE99.
6. Chin, D.N. (1991). Intelligent Interfaces as Agents. In J.W. Sullivan, and S.W. Tyler(Eds.), Intelligent User Interfaces (pp. 177-206). New York: ACM Press.
7. Cortes, C. and Vapnik, V. (1995). Support Vector Networks. In Machine Learning, 20:273-297.
8. Cranor, L. F. and LaMacchia, B. A. (1998). Spam! Communications of ACM, 41(8): 74-83.
9. Drucker, H., Wu, D. H., and Vapnik, V. N. (1999). Support vector machines for spam categorization. IEEE Transactions on Neural Networks, 10.
10. Gee, K. R. (2003). Using Latent Semantic Indexing to Filter Spam. In Proc. ACM Symposium on Applied Computing 2003, Data Mining Track, pp. 460-464, Mar.
11. Gunn, S. R. (1998). Support Vector Machines for Classification and Regression. Technical Report. Dept. of Electronics and Computer Science, University of Southampton.
12. Joachims, T. (2002). Learning to classify text using support vector machines. London: Kluwer Academic publishers.
13. Maes, P. (1995). Artificial Life Meets Entertainment: Life Like Autonomous Agents. Communications of the ACM, 38(11), 108-114
14. Maes, P., and Kozierok, R. (1993). Learning interface agents, In Proceedings of the Eleventh National Conference on Artificial Intelligence, MN.
15. Malone, T.W., Grant, K.R., Turbak, F.A., Brobst, S.A., and Cohen, M.D. (1987). Intelligent Information-Sharing Systems, Comm. ACM, vol.30, no.5, pp. 390-402.
16. T. M. Mitchell. (1997). Machine Learning, McGraw-Hill.
17. Mladenic, D. (1996). Personal Web Watcher: Implementation and Design, Technical Report IJS DP-7472.
18. Nwana, H.S., and Ndumu, D. T. (1997). An Introduction to Agent Technology. In H. S. Nwana and N. Azarmi (Eds.), Software Agents and Soft Computing. Berlin, Germany: Springer-Verlag.
19. Nwana, H.S. and Wooldridge, M. (1997). Software Agent Technology. In H. S. Nwana and N. Azarmi (Eds.), Software Agents and Soft Computing. Berlin, Germany: Springer-Verlag.
20. Palme, J. (1995), Electronic Mail, Norwood, MA: Artech House.
21. Payne, T. R., Edwards, P. and Green, C. L. (1997). Experience with Rule Induction and k-Nearest Neighbor Methods for Interface Agents that Learn. IEEE Transactions on Knowledge and Data Engineering 9(2): pp. 329-335.
22. Platt, J. (1998). Fast Training of SVMs using Sequential Minimal Optimization. In B. Schölkopf, C. Burges, and A. Smola (Eds.), Advances in Kernel Methods - Support Vector Learning. MIT Press.
23. Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.
24. Rüping, S. (2001). Incremental learning with support vector machines. In Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM'01).
25. Sahami, M., Dumais, S., Heckerman, D., and Horvitz, E. (1998). A Bayesian approach to filtering junk e-mail. In Proceedings of Workshop on Learning for Text Categorization.
26. N. Syed, H. Liu, and K. Sung. (1999). Incremental learning with support vector machines. In Proceedings of the Workshop on Support Vector Machines at the International Joint Conference on Artificial Intelligence (IJCAI-99), Stockholm, Sweden.
27. V. Vapnik. (1995). The Nature of Statistical Learning Theory. Springer Verlag, New York.
28. Vapnik, V. (1998). Statistical Learning Theory. Springer, N.Y.
29. Yang, Y. (1997). An Evaluation of Statistical Approaches to Text Categorization. Technical Report, Carnegie Mellon University, Pittsburgh, PA. CMU-CS-97-102.
電子全文 Fulltext
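This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
論文使用權限 Thesis access permission: available on campus after one year; never available off campus (campus withheld)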
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus:永不公開 not available


紙本論文 Printed copies
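Public-access information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.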
開放時間 available 已公開 available
