National Sun Yat-sen University (國立中山大學), thesis/dissertation: Phishing Detection Based on URL and TF-IDF

Title	Phishing Detection Based on URL and TF-IDF (以URL資訊和TF-IDF為主的網路釣魚偵測)
Department	Department of Computer Science and Engineering
Year, semester	Spring semester of Academic Year 97	Language	Chinese
Degree	Master	Number of pages	62
Author	I-chun Chu (朱怡俊)
Advisor	D. J. Guan (官大智)
Convenor	Chia-Mei Chen (陳嘉玫)
Advisory Committee	Chun-I Fan (范俊逸)
Date of Exam	2009-07-13	Date of Submission	2009-09-08
Keywords	URL, Domain, Phishing
Statistics	This thesis has been viewed 5771 times and downloaded 1960 times.

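Abstract (translated from the Chinese)
E-mail has become a very common means of communication: schools use it to send public announcements to students, managers use it to give work instructions to employees, and friends use it to share new things they find on the Internet. Because e-mail protocols are imperfect, spammers and phishers can easily register large numbers of free accounts and send spam, scam e-mail, or phishing e-mail in bulk, so that recipients' mailboxes fill up every day with unsolicited spam, scam e-mail, or phishing e-mail. Some spam is purely advertising, while some carries malicious attachments. Phishing impersonates online banks, Internet banking services, and similar organizations and distributes forged e-mail in bulk, so that recipients believe it is genuine, click the URL link in the message body, land on a phishing web page, and enter their personal account names and passwords, which costs them money or reputation. This study relies mainly on analyzing the URL link information in phishing e-mail, supplemented by keywords in the mail subject and body, to detect whether the mail server has received phishing e-mail and to protect users from the phishing threat. Six features are defined from the URL in a phishing e-mail, covering properties of the URL itself and of its domain name; three more features are obtained by querying the domain name with Google search and Whois to aid the decision. Keywords: Phishing, URL, domain name, phishing e-mail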
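The keyword weighting named in the thesis title can be illustrated with a minimal TF-IDF sketch. This record does not give the thesis's exact formula, tokenizer, or corpus, so everything below is an assumption for illustration: term frequency is taken as the term count divided by the message length, inverse document frequency as log(N / df), and the three sample messages are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = count of the term in the document / number of tokens in the document
    idf = log(N / number of documents that contain the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Count, for every term, how many documents contain it at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample messages (not taken from the thesis corpus).
SAMPLE_MAILS = [
    "Please verify your bank account password now",
    "Meeting notes for your project review",
    "Your account statement is ready",
]
WEIGHTS = tf_idf(SAMPLE_MAILS)
```

With this variant, a word that occurs in every message (here "your") gets weight zero, while a word confined to one message (here "password") gets the highest weight for its frequency, which is the property that makes TF-IDF useful for picking out characteristic keywords.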
Abstract
E-mail is now a widespread means of communication. For example, schools send announcements to students by e-mail, managers send work instructions to staff, and friends share interesting things they find on the Internet. Because of weaknesses in the SMTP protocol, spammers and phishers can easily deliver phishing e-mail or spam to large numbers of unknown recipients by forging the sender ID. As a result, recipients' mailboxes fill with unsolicited advertising e-mail or fake e-commerce e-mail. Some spam is only advertising, but some carries harmful attachments such as Trojans or viruses. Phishing uses the names of online banks or Internet auction sites to send large volumes of e-mail that recipients believe to be genuine. A recipient who clicks the hyperlink is taken to the phishing website and, after entering a personal account name and password, may lose money or reputation. A 2008 whitepaper from the information security vendor SonicWALL points out that even when an e-mail shows obvious signs of being phishing, some percentage of recipients still click the hyperlink and enter their personal ID and password. This suggests that recipients' attention is easily drawn to the hyperlink. This research analyzes the information in the URL links in e-mail to detect whether the mail server has received phishing e-mail, protecting users from the phishing threat.
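To make the idea of URL-based detection concrete, the following minimal sketch extracts a few lexical features from a URL. This record says the thesis defines six features from the URL and its domain name but does not list them, so the features below are common heuristics chosen for illustration and are not the thesis's feature set; the function name `url_features` and the sample URLs are hypothetical as well.

```python
import ipaddress
from urllib.parse import urlsplit


def url_features(url):
    """Compute a few illustrative lexical features of a URL.

    These are assumed example features, not those defined in the thesis.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""

    # A raw IP address in place of a domain name is a common warning sign.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "host_is_ip": host_is_ip,
        "dots_in_host": host.count("."),
        # "user@host" syntax can hide the real host behind a familiar name.
        "has_at_sign": "@" in parts.netloc,
        "url_length": len(url),
        "uses_https": parts.scheme == "https",
        "hyphen_in_host": "-" in host,
    }


# Hypothetical example URLs.
IP_URL = url_features("http://192.168.0.1/login")
AT_URL = url_features("http://www.bank.example.com@evil-site.example/login")
```

In the second example the text before the `@` looks like a bank's host name, but the host actually contacted is `evil-site.example`. A detector would combine features like these into a score, and the abstract adds that the thesis also uses Google search and Whois lookups on the domain name.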

Table of Contents
1 Introduction	1
1.1 Preface	1
1.2 Research Background	3
1.3 Motivation	6
1.4 Thesis Organization	8
2 Literature Review	10
2.1 E-mail Systems	10
2.2 E-mail Architecture	11
2.3 Current Phishing Detection Techniques	13
2.4 Differences Between Phishing and Spam E-mail	20
3 Design Method	22
3.1 Implementation Method	22
3.2 Weight: TF-IDF	26
3.3 Scoring Method	28
4 System Design	30
4.1 Design Flow	30
4.2 System Architecture	32
4.3 Program Modules	34
4.4 Performance Evaluation Criteria	36
4.5 Experimental Procedure	37
4.6 Experimental Results and Data	39
4.7 Performance Comparison	40
5 Conclusion and Future Work	43
References	45
Appendix	49
Appendix	50

References
[1] 入口網站轉址漏洞 [Redirection vulnerability in portal sites]. http://tw.myblog.yahoo.com/roamer-tw/article?mid=4501&next=4496&l=f&?d=15.
[2] 反網路釣魚瀏覽器外掛程式–Monkeyspaw [Anti-phishing browser plug-in: Monkeyspaw]. http://www.ithome.com.tw/itadm/article.php?c=42741.
[3] 另類Spam: 網路釣魚騙術 [Another kind of spam: phishing scams]. http://www.zdnet.com.tw/enterprise/technology/0,2000085680,20087713,00.htm.
[4] 走上霸王硬上弓之路的網路釣魚 [Phishing turns to strong-arm tactics]. http://tw.myblog.yahoo.com/roamer-tw/article?mid=947&prev=1524&next=946&l=f&?d=15.
[5] Opera 9.5版將新增惡意程式防護機制 [Opera 9.5 to add a malware protection mechanism]. http://www.itis.tw/node/1855.
[6] 2004 毒賣新聞-木馬大盜網路釣魚... 你上勾了嗎?(下) [2004 virus news: Trojan thieves and phishing... have you taken the bait? (Part 2)]. http://tw.trendmicro.com/tw/threats/vinfo/weeknews/article/20071001094654.html.
[7] 假Google、雅虎的轉址、釣魚郵件資安事件頻傳 [Frequent security incidents involving fake Google and Yahoo redirects and phishing e-mail]. http://www.itis.tw/node/1659.
[8] 微軟結盟協力打擊釣魚詐騙技術 [Microsoft forms an alliance to fight phishing fraud techniques]. http://www.zdnet.com.tw/news/software/0,2000085678,20102602,00.htm.
[9] 經濟不景氣想節稅? 當心假節稅教學真詐騙個資 [Want to save on taxes in a recession? Beware fake tax-saving guides that steal personal data]. http://domynews.blog.ithome.com.tw/post/1252/24626.
[10] 資安之眼: 去年網路釣魚網站增6成6 金融幌子居多 [Security Eye: phishing sites up 66% last year, mostly under financial guises]. http://www.itis.tw/node/2721.
[11] 漫談轉址服務 [A discussion of URL redirection services]. http://blog.chweng.idv.tw/archives/272.
[12] 網路行銷 VS. 電子商務 [Internet marketing vs. e-commerce]. http://www.publish.com.tw/new/hotissue/20040805.htm.
[13] 網路釣魚等網上詐騙盜竊活動防範知識 [Guarding against phishing and other online fraud and theft]. http://forum.icst.org.tw/phpbb/viewtopic.php?f=29&t=6409.
[14] 網路詐騙新手法假藉協助減稅誘騙消費者竊取個人機密資料 [New online scam uses tax-reduction help as a pretext to steal consumers' confidential personal data]. http://rogerspeaking.com/2009/04/2046.
[15] 賽門鐵克網路安全威脅報告第十三期 [Symantec Internet Security Threat Report, Volume XIII]. http://eval.symantec.com/mktginfo/enterprise/white_papers/b-whitepaper_internet_security_threat_report_xiii_04-2008.en-us.pdf.
[16] apcert 2009. http://www.apcert.org/index.html.
[17] Committed to Wiping Out Internet Scams and Fraud. In APWG Phishing Activity Trends Report Q2/2008.
[18] E-mail. http://en.wikipedia.org/wiki/E-mail.
[19] Index of publiccorpus. http://spamassassin.apache.org/publiccorpus/.
[20] Lottery scam. http://en.wikipedia.org/wiki/Lottery_scam.
[21] PageRank. http://zh.wikipedia.org/w/index.php?title=Pagerank&variant=zh-tw.
[22] Phishing Corpus. http://monkey.org/~jose/phishing/20051114.mbox.
[23] TF-IDF. http://zh.wikipedia.org/w/index.php?title=TF-IDF&variant=zh-tw/.
[24] Type I and type II errors. http://en.wikipedia.org/wiki/False_positive#Type_I_error/.
[25] S. Abu-Nimeh, D. Nappa, X. Wang, and S. Nair. A comparison of machine learning techniques for phishing detection. In Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit, pages 60–69. ACM New York, NY, USA, 2007.
[26] M. Chandrasekaran, K. Narayanan, and S. Upadhyaya. Phishing E-mail Detection Based on Structural Properties. Information Assurance: Intrusion Detection and Prevention, page 2.
[27] R. Dodge and A. Ferguson. Using Phishing for User Email Security Awareness. In Rannenberg, K. (Ed.), Yngström, L. (Ed.), Lindskog, S. (Ed.), Proceedings of the IFIP TC-11 21st International Information Security Conference (SEC 2006), pages 454–458. Springer, 2006.
[28] C.E. Drake, J.J. Oliver, and E.J. Koontz. Anatomy of a phishing email. In First Conference on Email and Anti-Spam (CEAS), Mountain View, CA, USA, pages 2–3, 2004.
[29] I. Fette, N. Sadeh, and A. Tomasic. Learning to detect phishing emails. In Proceedings of the 16th international conference on World Wide Web, pages 649–656. ACM New York, NY, USA, 2007.
[30] D. Florencio and C. Herley. Analysis and improvement of anti-phishing schemes. In Security And Privacy in Dynamic Environments: Proceedings of the IFIP TC-11 21st International Information Security Conference (SEC 2006), 22-24 May 2006, Karlstad, Sweden, page 148. Springer, 2006.
[31] D. Florencio and C. Herley. Password rescue: a new approach to phishing prevention. In Proceedings of USENIX Workshop on Hot Topics in Security, 2006.
[32] S. Garera, N. Provos, M. Chew, and A.D. Rubin. A framework for detection and measurement of phishing attacks. In Proceedings of the 2007 ACM workshop on Recurring malcode, pages 1–8. ACM New York, NY, USA, 2007.
[33] Mail Delivery Agent. http://en.wikipedia.org/wiki/Mail_delivery_agent.
[34] Mail Transfer Agent. http://en.wikipedia.org/wiki/Mail_transfer_agent.
[35] Mail User Agent. http://en.wikipedia.org/wiki/Mail_User_Agent.
[36] Simple Mail Transfer Protocol. http://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol.
[37] SonicWALL (2008). Bayesian Spam Classification Applied to Phishing E-Mail. http://www.sonicwall.com/downloads/WP-ENG-025_Phishing-Bayesian-Classification.pdf.
[38] Wiki. Phishing. http://en.wikipedia.org/wiki/Phishing.
[39] Wiki. Social engineering (security). http://en.wikipedia.org/wiki/Social_engineering_(computer_security).

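Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law. Thesis access permission: unrestricted (fully open on and off campus). Available: Campus: available; Off-campus: available. etd-0908109-012739.pdf
Printed copies
Availability information for printed theses is relatively complete from Academic Year 102 onward. To ask about the availability of printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: available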

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us: 2452 Thesis Format Review Team │ Development and operations: Knowledge Innovation Division, Office of Library and Information Services