Title page for etd-0726105-140312
Title: A Spam Filter Based on Rough Sets Theory (以約略集合理論為基礎之垃圾郵件過濾研究)
Department: Department of Information Management
Year, semester: Spring semester (second semester) of Academic Year 93
Language: Chinese
Degree: Master
Number of pages: 64
Author: Mo-yi Tzeng (曾漠益)
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2005-07-25
Date of Submission: 2005-07-26
Keywords: Rough Sets Theory, Spam, Data Mining and Artificial Intelligence
Abstract

With the spread of the Internet and the widespread use of e-mail, the volume of spam keeps growing and inconveniences e-mail users. If mail servers are combined with data mining and artificial intelligence techniques so that they learn spam rules and filter spam automatically, e-mail users can enjoy a clean mail environment.

This research proposes an architecture called union defense to stop the spread of spam. The architecture requires a rule-based data mining and artificial intelligence algorithm, and rough sets theory meets that requirement. Rough sets theory was proposed by the Polish logician Pawlak. It is a rule-based data mining and artificial intelligence algorithm, well suited to discovering knowledge hidden in imprecise and incomplete data.

This research builds a spam filter based on rough sets theory. The filter finds the characteristic rules of spam and uses those rules to filter out spam. The system can be layered on top of the conventional e-mail protocols and works with most existing mail servers. It also supports double-byte character sets such as Chinese, Japanese, and Korean, which overcomes the limitation that most spam filters handle only English mail. Future work could develop a mechanism for exchanging spam rules between mail servers to realize union defense.
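To illustrate how rough sets theory handles inexact data, the following is a minimal Python sketch, not the system built in the thesis. It assumes a toy decision table in which each mail is described by two keyword attributes; the attribute names (has_free, has_meeting), the mail identifiers, and the labels are all hypothetical. The sketch groups mails that are indistinguishable on those attributes, computes the lower approximation (mails that are certainly spam) and the upper approximation (mails that are possibly spam), and derives certain rules from the lower approximation.

```python
from collections import defaultdict

# Hypothetical toy decision table (not data from the thesis): each mail has
# two made-up keyword attributes and a "spam"/"ham" label.
MAILS = {
    "m1": ({"has_free": 1, "has_meeting": 0}, "spam"),
    "m2": ({"has_free": 1, "has_meeting": 0}, "spam"),
    "m3": ({"has_free": 1, "has_meeting": 1}, "spam"),
    "m4": ({"has_free": 1, "has_meeting": 1}, "ham"),
    "m5": ({"has_free": 0, "has_meeting": 1}, "ham"),
}
ATTRS = ["has_free", "has_meeting"]


def indiscernibility_classes(table, attributes):
    """Group mails whose values agree on every chosen attribute."""
    classes = defaultdict(set)
    for mail_id, (features, _label) in table.items():
        key = tuple(features[a] for a in attributes)
        classes[key].add(mail_id)
    return dict(classes)


def approximations(table, attributes, target_label):
    """Return (lower, upper) approximations of the set of target-labelled mails."""
    target = {m for m, (_f, label) in table.items() if label == target_label}
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes).values():
        if block <= target:   # every mail in the block has the target label
            lower |= block
        if block & target:    # at least one mail in the block has the label
            upper |= block
    return lower, upper


def certain_rules(table, attributes, target_label):
    """Attribute patterns that always imply the target label in this table."""
    target = {m for m, (_f, label) in table.items() if label == target_label}
    classes = indiscernibility_classes(table, attributes)
    return [
        dict(zip(attributes, key))
        for key, block in sorted(classes.items())
        if block <= target
    ]


lower, upper = approximations(MAILS, ATTRS, "spam")
print("certainly spam:", sorted(lower))
print("possibly spam:", sorted(upper))
print("certain rules:", certain_rules(MAILS, ATTRS, "spam"))
```

In this toy table, m3 and m4 share the same attribute values but carry different labels, so they fall in the boundary between the two approximations. Only patterns from the lower approximation yield rules that hold without exception.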
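Table of Contents

Chapter 1 Introduction
  Section 1 Research Background
  Section 2 Spam
  Section 3 Research Motivation
  Section 4 Research Steps
Chapter 2 Related Work
  Section 1 Spam-Related Research
  Section 2 Data Mining and Artificial Intelligence
  Section 3 Rough Sets Theory
Chapter 3 System Design
  Section 1 Mail Content Extraction Module
  Section 2 Mail Management Module
  Section 3 Rule Computation Module
  Section 4 Rule Management Module
  Section 5 Mail Filtering Module
Chapter 4 System Implementation and Verification
  Section 1 Database Setup
  Section 2 Mail Content Extraction Module Implementation
  Section 3 Mail Management Module Implementation
  Section 4 Rule Computation Module Implementation
  Section 5 Rule Management Module Implementation
  Section 6 Mail Filtering Module Implementation
  Section 7 System Verification
Chapter 5 Conclusion
  Section 1 Conclusion
  Section 2 Future Development
References
Appendix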
Fulltext
Thesis access permission: not available on campus or off campus
Available: Campus: never available; Off-campus: never available

Printed copies
Public information about printed copies is relatively complete from Academic Year 102 onward. To look up printed theses from Academic Year 101 or earlier, please contact the printed thesis service counter of the Office of Library and Information Services.
Available: available