Thesis details for etd-0621114-120229
Title
Multi-Factor Android Malware Detection System (多因子Android惡意程式偵測系統)
Department
Year, semester
Language
Degree
Number of pages
63
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2014-06-12
Date of Submission
2014-07-21
Keywords
Malicious App Detection, Data Mining, Classification, System Call, Permission
Statistics
This thesis has been viewed 5640 times and downloaded 414 times.
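Chinese Abstract (translated)
Since Apple's iPhone and Google's Android smartphones reached the market in 2007 and 2008, the market share of smartphones has risen steadily, with Android devices showing the most pronounced growth. One of the main reasons smartphones have won users over is the wealth of apps available in the official app stores (App Store and Google Play). Because Android is an open system, and because some enthusiastic users repackage paid apps and offer them for others to download, ordinary users can easily install apps downloaded from third-party markets on their own smartphones. However, apps in third-party markets do not have to pass official vetting, so malicious apps are more likely to appear there.
This study first records the system calls an app uses while running and while idle, along with the permissions it requests. It then applies data mining techniques to compare the records of official-market apps with those of known malicious apps, and uses machine learning techniques to build a detection model that can later be used to detect unknown malicious apps. Finally, it applies attribute selection algorithms to reduce dimensionality and thereby shorten detection time. The experimental results show that the proposed method classifies apps correctly more than 96% of the time and that, according to the model's detection results, roughly 20% of the apps in third-party markets exhibit malicious behavior.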
Abstract
Since Apple and Google introduced the iPhone and Android smartphones in 2007 and 2008, the market share of smartphones has risen steadily, and Android devices have seen the most significant growth. One of the main reasons smartphones became so successful is the wealth of apps available in the official application stores (App Store and Google Play).

In this research, we first recorded the system calls an app makes while executing and while idle, as well as the permissions it requests. We then used data mining techniques to find the differences between the records of malicious apps and benign apps, and machine learning techniques to build a model for detecting unknown malicious apps. Finally, with the help of attribute selection methods, we reduced the detection time by using fewer attributes. The experimental results show that the proposed scheme achieves an accuracy of more than 96%, and that approximately 20% of the apps in the third-party market behave maliciously.
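The pipeline outlined in the abstract (feature recording, attribute selection, model building, detection) can be illustrated with a minimal sketch. This is not the thesis's code, and the listings in Appendix A are not reproduced here. The binary feature encoding, the information-gain ranking, the Bernoulli naive Bayes classifier, the feature names, and every sample value below are assumptions chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy samples: 1 means the app requested the permission or
# invoked the system call at least once. All values are made up.
ROWS = [
    {"perm:SEND_SMS": 1, "perm:INTERNET": 1, "sys:sendto": 1, "sys:read": 1},
    {"perm:SEND_SMS": 1, "perm:INTERNET": 1, "sys:sendto": 1, "sys:read": 0},
    {"perm:SEND_SMS": 1, "perm:INTERNET": 0, "sys:sendto": 0, "sys:read": 1},
    {"perm:SEND_SMS": 0, "perm:INTERNET": 1, "sys:sendto": 0, "sys:read": 1},
    {"perm:SEND_SMS": 0, "perm:INTERNET": 1, "sys:sendto": 1, "sys:read": 1},
    {"perm:SEND_SMS": 0, "perm:INTERNET": 0, "sys:sendto": 0, "sys:read": 0},
]
LABELS = ["malicious"] * 3 + ["benign"] * 3


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy after splitting on one binary attribute."""
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(rows, labels) if x[attr] == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def select_attributes(rows, labels, k):
    """Attribute selection: keep the k attributes with the highest gain."""
    attrs = sorted(rows[0])
    return sorted(attrs, key=lambda a: information_gain(rows, labels, a),
                  reverse=True)[:k]


def train_naive_bayes(rows, labels, attrs):
    """Fit a Bernoulli naive Bayes model on the selected attributes."""
    model = {}
    for cls in set(labels):
        members = [x for x, y in zip(rows, labels) if y == cls]
        prior = len(members) / len(rows)
        # Laplace-smoothed estimate of P(attr = 1 | cls)
        likelihood = {a: (sum(x[a] for x in members) + 1) / (len(members) + 2)
                      for a in attrs}
        model[cls] = (prior, likelihood)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(cls):
        prior, likelihood = model[cls]
        score = math.log(prior)
        for attr, p in likelihood.items():
            score += math.log(p if row[attr] else 1 - p)
        return score
    return max(model, key=log_posterior)


selected = select_attributes(ROWS, LABELS, k=2)
model = train_naive_bayes(ROWS, LABELS, selected)
unknown_app = {"perm:SEND_SMS": 1, "perm:INTERNET": 1,
               "sys:sendto": 1, "sys:read": 0}
print(selected, predict(model, unknown_app))
```

Ranking attributes before training mirrors the abstract's point that using fewer attributes lowers detection cost: the classifier here only ever looks at the two selected attributes, however many were recorded.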
Table of Contents
Thesis Approval Form
Acknowledgments
Chinese Abstract
Abstract
List of Figures
List of Tables
List of Listings
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Organization
Chapter 2 Background
2.1 MalApp
2.2 Static Analysis
2.3 Dynamic Analysis
2.4 Related Works
Chapter 3 The Proposed Method
3.1 System Overview
3.2 System Call Recording
3.3 Feature Extracting
3.4 Model Building
Chapter 4 Experiment and Evaluation
4.1 Sample Apps Collecting
4.2 Data Collecting
4.2.1 Dynamic Feature Extraction
4.2.2 Static Features Extraction
4.3 Evaluation
4.4 Experimental Results
4.5 Attribute Reduction
4.6 Case Study
4.7 Third Party Market Apps Diagnosis
4.8 Comparison
Chapter 5 Conclusion and Future Works
Bibliography
Appendix A Source Codes

List of Figures
3.1 System Architecture
4.1 Key Attributes
4.2 Scatter Plots of Samples

List of Tables
4.1 Datasets for Classification
4.2 RedFlags
4.3 Confusion Matrix
4.4 Result of Experiment 1
4.5 Result of Experiment 2
4.6 Top 5 Weighted and Selected Attributes for Experiment 1
4.7 Top 5 Weighted and Selected Attributes for Experiment 2
4.8 Important Attributes
4.9 Wrongly Classified Samples
4.10 Attributes of Wrongly Classified Samples
4.11 Estimations of imobile Samples
4.12 Comparisons

List of Listings
A.1 Set Up Emulator and System Call Record
A.2 Transform Data to Acceptable Format
A.3 Retrieve Static and Dynamic Features
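Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as not to break the law.
Thesis access permission: user-defined availability date
Available:
Campus: available
Off-campus: available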


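Printed copies
Availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available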
