國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,植基於API記錄資料探勘之惡意程式偵測系統,Malware Detection System Based on API Log Data Mining

論文名稱 Title	植基於API記錄資料探勘之惡意程式偵測系統 Malware Detection System Based on API Log Data Mining
系所名稱 Department	資訊工程學系 Department of Computer Science and Engineering
畢業學年期 Year, semester	101 學年度第 2 學期 The spring semester of Academic Year 101	語文別 Language	英文 English
學位類別 Degree	碩士 Master	頁數 Number of pages	51
研究生 Author	周君翰 Chun-han Chou
指導教授 Advisor	范俊逸, 蕭漢威 Chun-I Fan; Han-wei Hsiao
召集委員 Convenor	楊竹星 Chu-Sing Yang
口試委員 Advisory Committee	陳嘉玫 Chia-Mai Chen
口試日期 Date of Exam	2013-07-18	繳交日期 Date of Submission	2013-08-14
關鍵字 Keywords	System Call、資料探勘、分類法、惡意程式、API Classification, System Call, API, Data Mining, Malware
統計 Statistics	本論文已被瀏覽 5692 次，被下載 144 次 The thesis/dissertation has been browsed 5692 times, has been downloaded 144 times.

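中文摘要 Chinese Abstract (English translation)
As information technology advances, daily life has become increasingly tied to the Internet: people use it to exchange messages, learn, find entertainment, and conduct business. With the rise of mobile devices and cloud computing, more and more devices can connect to the Internet, and these devices have become targets for malicious actors. In recent years, besides online fraud, attacks such as phishing and malicious web pages have become more common; attackers plant malware on victims' devices to steal data and take control of the devices, causing users heavy losses. The amount of malware has risen sharply in recent years. Malware behavior has shifted from simple destruction to penetration attacks and stealth techniques, and its distribution methods have diversified, so traditional antivirus software is increasingly inadequate. This study uses hooking to trace the stealth techniques commonly used by malware, then uses data mining to compare the behavior of malware with that of benign programs in order to distinguish and detect malware. Experimental results show that the detection rate reaches 95% using only 80 attributes, which indicates that the method achieves a high detection rate at very low computational cost.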
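The hooking idea mentioned in the abstract can be illustrated with a small sketch. This is only an analogy in Python: it wraps a library function so that every call is logged before being forwarded. It is not the thesis's TraceHook code, and the helper names `install_hook` and `remove_hook` are hypothetical.

```python
import functools
import os


def install_hook(module, name, log):
    """Replace module.<name> with a wrapper that records each call, then forwards it."""
    original = getattr(module, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        log.append(name)  # record the call before forwarding it
        return original(*args, **kwargs)

    setattr(module, name, wrapper)
    return original  # the caller keeps this so the hook can be removed


def remove_hook(module, name, original):
    """Restore the original function."""
    setattr(module, name, original)


api_log = []
saved = install_hook(os, "getcwd", api_log)
try:
    os.getcwd()
    os.getcwd()
finally:
    remove_hook(os, "getcwd", saved)

os.getcwd()  # made after the hook was removed, so it is not recorded
print(api_log)  # ['getcwd', 'getcwd']
```

A log collected this way is just an ordered list of call names, which is the kind of record a later feature-extraction step could consume.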
Abstract
As information technology advances, the Internet has become involved in every area of daily life. As mobile devices and cloud computing come to play important parts in our lives, they have become more susceptible to attack. In recent years, phishing and malicious websites have become increasingly serious problems in network security. Attackers use many approaches to implant malware on target hosts in order to steal valuable data and cause substantial damage. Malware has grown rapidly in volume, and its purpose has shifted from destruction to penetration. Malware signatures have also become harder to detect: in addition to hiding static signatures, malware tries to conceal its dynamic signatures from anti-virus inspection. In this research, we use hooking techniques to trace the dynamic signatures that malware tries to hide. We then use data mining techniques to compare the behaviour of malware and benign programs in order to identify malware. The experimental results show that the detection rate reaches 95% with only 80 attributes, which indicates that the method achieves a high detection rate with low complexity.
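To make the "few attributes, high detection rate" claim concrete, the following minimal sketch ranks API names by information gain, keeps the top k as attributes, and computes a detection rate. Everything here is an assumption for illustration: the abstract does not state which selection criterion or classifier the thesis uses, the API logs are invented toy data, and detection rate is taken to mean the fraction of malware samples that are flagged.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(samples, labels, api):
    """Entropy reduction from splitting samples on whether their log contains `api`."""
    present = [y for x, y in zip(samples, labels) if api in x]
    absent = [y for x, y in zip(samples, labels) if api not in x]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - remainder


def select_attributes(samples, labels, k):
    """Return the k API names with the highest information gain."""
    vocab = sorted({api for x in samples for api in x})
    ranked = sorted(
        vocab, key=lambda a: information_gain(samples, labels, a), reverse=True
    )
    return ranked[:k]


def vectorize(sample, attributes):
    """Binary feature vector: 1 if the API appears in the log, else 0."""
    return [1 if api in sample else 0 for api in attributes]


def detection_rate(predicted, actual):
    """Fraction of actual malware samples that were predicted as malware."""
    flags = [p == "malware" for p, a in zip(predicted, actual) if a == "malware"]
    return sum(flags) / len(flags)


# Toy logs (invented): each set holds the API names seen in one program's trace.
logs = [
    {"CreateFileW", "ReadFile"},
    {"CreateFileW", "ReadFile", "RegOpenKeyExW"},
    {"CreateFileW", "WriteProcessMemory", "CreateRemoteThread"},
    {"WriteProcessMemory", "CreateRemoteThread", "RegOpenKeyExW"},
]
labels = ["benign", "benign", "malware", "malware"]

attributes = select_attributes(logs, labels, k=3)
vectors = [vectorize(log, attributes) for log in logs]
print(attributes)
print(vectors)
```

Reducing each log to a short vector over the selected attributes is what keeps the downstream classifier cheap; any standard classifier could then be trained on `vectors` and `labels`.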

目次 Table of Contents
Thesis Approval Form
Acknowledgments
Abstract (Chinese)
Abstract
List of Figures
List of Tables
List of Listings
Chapter 1 Introduction
  1.1 Motivation
  1.2 Contributions of the Thesis
  1.3 Organization of the Thesis
Chapter 2 Related Works
  2.1 Malware
  2.2 Static Analysis
  2.3 Signature-Based Detection
  2.4 Dynamic Analysis
  2.5 Hooking
  2.6 Host Based Detection
Chapter 3 The Proposed Method
  3.1 System Overview
  3.2 API Record Collection
  3.3 Feature Extraction
  3.4 Classification Module
Chapter 4 Experiment and Evaluation
  4.1 Sample Program Collection
  4.2 API Monitoring
  4.3 Evaluation
  4.4 Experimental Results
  4.5 Attribute Reduction
  4.6 Comparison
Chapter 5 Conclusion and Future Works
Bibliography
Appendix A TraceHook Source Codes


電子全文 Fulltext
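This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law. Thesis access permission: user-defined release date. Available: Campus: available; Off-campus: available. etd-0714113-160717.pdf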
紙本論文 Printed copies
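Public availability information for printed copies is relatively complete from academic year 102 onward. To inquire about printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Availability: available.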



Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2452 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS