National Sun Yat-sen University, thesis/dissertation, Malware Classification Based on File and Registry Activities

Title	Malware Classification Based on File and Registry Activities (基於檔案與機碼行為的惡意軟體分類)
Department	Department of Information Management
Year, semester	Spring semester of Academic Year 100	Language	Chinese
Degree	Master	Number of pages	44
Author	Ling-Ming Zeng (曾琳銘)
Advisor	Chia-Mei Chen (陳嘉玫)
Convenor	Da-zhi Guan (官大智)
Advisory Committee	Sheng-Tzong Cheng (鄭憲宗)
Date of Exam	2012-07-26	Date of Submission	2012-09-12
Keywords	Malware, Malware Classification, SVM
Statistics	This thesis has been browsed 5878 times and downloaded 275 times.

Abstract
Cyber criminals use malware to infect users' machines and steal personal information for profit. Antivirus software is the most commonly used tool for identifying malware, but virus signature updates often lag behind the rate at which new malware appears, so an effective and fast malware identification tool is still lacking. Software analysis platforms are an alternative: their analysis reports let users examine a program's behavior in detail. Most platforms, however, provide only the report and leave it to the user to judge whether the software is malicious, which is of little help to users without expert experience. A few platforms incorporate antivirus engines to label samples, but they remain weak at identifying new types of malware. Drawing on the literature and on analysis reports, this study summarizes how malware and normal software differ in their file and registry activities and defines malware classification features from those differences. We adopt the Support Vector Machine (SVM) as the learning algorithm and build and test three classifiers that distinguish malware from normal software. Experimental evaluation shows a malware identification rate as high as 97.6%. Finally, a comparison with ThreatExpert shows that our classifiers perform no worse than this commonly used software analysis platform.
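To make the idea of classifying software from file and registry activity more concrete, here is a minimal pure-Python sketch. It is not the SVM solver used in the thesis. It is a simplified linear classifier that updates its weights whenever a training sample violates the hinge-loss margin condition y(w·x + b) < 1, and it omits the regularization term that gives a real SVM its maximum-margin property. The two features, every sample value, and the names SAMPLES, LABELS, train, and predict are hypothetical and chosen only for illustration.

```python
# Hypothetical feature vectors (NOT thesis data):
#   [files dropped into system directories, autorun registry keys created]
SAMPLES = [[3, 2], [0, 0], [4, 3], [1, 0], [5, 1], [0, 1]]
LABELS = [1, -1, 1, -1, 1, -1]  # +1 = malware, -1 = normal software


def decision(weights, bias, features):
    """Signed score w.x + b of the linear classifier."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train(samples, labels, learning_rate=0.1, max_epochs=1000):
    """Update on every sample whose margin y * (w.x + b) is below 1.

    Simplification: no regularization term, so this is a margin-based
    stand-in for SVM training, not a maximum-margin solver.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        updated = False
        for features, label in zip(samples, labels):
            if label * decision(weights, bias, features) < 1:
                weights = [w + learning_rate * label * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * label
                updated = True
        if not updated:  # every training sample now has margin >= 1
            break
    return weights, bias


def predict(weights, bias, features):
    """Return +1 (malware) or -1 (normal software)."""
    return 1 if decision(weights, bias, features) >= 0 else -1


WEIGHTS, BIAS = train(SAMPLES, LABELS)
print([predict(WEIGHTS, BIAS, s) for s in SAMPLES])
```

In practice one would replace the training loop with an established SVM implementation and replace the invented counts with features extracted from actual analysis reports.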

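Table of Contents
Chapter 1 Introduction
1.1. Research Background
1.2. Research Motivation
1.3. Research Objectives
Chapter 2 Literature Review
2.1. Open-Source and Closed-Source Software
2.1.1. Open-Source Software
2.1.2. Closed-Source Software
2.2. Malware Analysis Methods
2.2.1. Static Analysis
2.2.2. Dynamic Analysis
2.3. Support Vector Machines
2.3.1. Linearly Separable Datasets
2.3.2. Non-Linearly Separable Datasets
Chapter 3 Classification System
Chapter 4 Evaluation of Classification Performance
4.1. Sample Information
4.2. Experiment 1 (Five Features)
4.3. Experiment 2 (Twenty Features)
4.4. Experiment 3 (Similarity as the Value of Feature F5)
4.5. Experiment 4 (Comparison with ThreatExpert)
4.6. Analysis and Discussion of Results
Chapter 5 Conclusions and Future Directions
References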

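Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: user-defined availability date. Available: Campus: available; Off-campus: available. etd-0912112-145949.pdf
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. To look up public information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: available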

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us: 2452 Thesis Format Review Team │ Development and operations: Knowledge Innovation Division, LIS