博碩士論文 etd-0629117-100050 詳細資訊
Title page for etd-0629117-100050
論文名稱
Title
企業客戶語音業務流失預測之研究
The Research on the Prediction of Enterprise Customer Churn on Voice Services
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
51
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2017-07-24
繳交日期
Date of Submission
2017-08-01
關鍵字
Keywords
Xgboost、顧客流失預測、資料探勘
Churn Prediction, Xgboost, Marketing, Customer Churn, Data Mining
統計
Statistics
本論文已被瀏覽 5963 次,被下載 53
This thesis/dissertation has been viewed 5963 times and downloaded 53 times.
中文摘要
近年來由於IP網路技術成熟,應用於IP網路上的服務日益普及,加上4G出現提供行動高速上網,人與人之間的連繫已從電話轉變成網路。透過高速網路加免費社群通訊軟體,即可做到語音通訊、傳送文字或圖片,甚至多人共同聊天等動作,因此語音服務被網路服務取代已經是完全不可逆轉的現象;語音服務轉換到網路服務是一種減價轉換的現象,這使得電信業者可獲取之營收減少,然而投注在寬頻網路及行動電話基地台建設的成本卻是不變,電信業者獲利降低;另外從NCC公布之我國電信業者營運實績統計數據顯示,國內電信業者的用戶總數不再成長,其中市內電話及行動電話用戶數呈現衰減,台灣電信市場顯然已經飽和,難有突破性成長,客戶保留及客戶流失管理再度成為電信業者所需關注的重要議題,期望至少能保留住手上這批願意使用語音服務的客戶,不讓客戶流失至競爭業者,維持手中掌握的語音服務營收。

  本研究主題為企業客戶語音流失預測,為找出可能流失客戶,運用Xgboost演算法建立預測模型,並考量企業客戶特性,納入企業客戶專有變數,以提升預測準確率;另外依據預測結果判斷哪些可能流失客戶要進行挽留,並計算其產生之營收損失,透過設定最適合閾值,讓營收損失最小化。實驗結果為預測模型之AUC達到80.2%,召回率達到74.2%,最適合閾值0.72,最小化營收損失為3,221單位。
Abstract
Owing to the maturity of IP network technology and the arrival of 4G high-speed mobile networks, Internet-based services have become increasingly popular, and people are gradually shifting their communication from the telephone to the Internet.

  As a result, the migration from voice services to Internet services has reduced the revenue of telecom operators. However, the construction costs of broadband networks and mobile base stations remain the same, thereby drastically reducing operators' profits. In addition, statistics published by the NCC show that the numbers of fixed-line and mobile phone subscribers are decreasing, which indicates that Taiwan's telecommunications market is saturated. Customer retention and churn management have therefore become important issues for telecom operators.

  In this work, we study how to identify enterprise customers who are likely to churn. We select a range of variables, including variables specific to enterprise customers, and use the Xgboost algorithm to build the prediction model. According to the prediction results, the model identifies about 74% of churned customers. We further minimize revenue loss by choosing the most suitable decision threshold. The experimental results show that the model achieves an AUC of 80.2% and that the most suitable threshold is 0.72, giving a minimum revenue loss of 3,221 units.
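The threshold step described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the cost model (a fixed `retention_cost` per contacted customer and a `save_rate` for contacted churners), all function names, and the toy data are assumptions made only for illustration. The scores stand in for the output of any churn classifier, such as an Xgboost model.

```python
from typing import Sequence, Tuple


def recall_at(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Fraction of actual churners (label 1) whose score reaches the threshold."""
    caught = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    churners = sum(labels)
    return caught / churners if churners else 0.0


def revenue_loss(
    scores: Sequence[float],
    labels: Sequence[int],
    revenue: Sequence[float],
    threshold: float,
    retention_cost: float,
    save_rate: float,
) -> float:
    """Total loss under a HYPOTHETICAL cost model (not taken from the thesis).

    - Every customer flagged for retention costs `retention_cost`.
    - A flagged churner is retained with probability `save_rate`, so the
      expected lost revenue is (1 - save_rate) * revenue.
    - A churner who is not flagged loses the full revenue.
    """
    loss = 0.0
    for s, y, r in zip(scores, labels, revenue):
        if s >= threshold:
            loss += retention_cost
            if y == 1:
                loss += (1.0 - save_rate) * r
        elif y == 1:
            loss += r
    return loss


def best_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    revenue: Sequence[float],
    retention_cost: float,
    save_rate: float,
    grid: Sequence[float],
) -> Tuple[float, float]:
    """Return (minimum loss, threshold) over the candidate thresholds in `grid`."""
    return min(
        (revenue_loss(scores, labels, revenue, t, retention_cost, save_rate), t)
        for t in grid
    )


if __name__ == "__main__":
    # Toy validation data, invented for this example.
    scores = [0.9, 0.8, 0.6, 0.3, 0.2]          # predicted churn probabilities
    labels = [1, 1, 0, 1, 0]                    # 1 = customer actually churned
    revenue = [100.0, 50.0, 80.0, 40.0, 30.0]   # revenue at stake per customer
    grid = [0.1, 0.5, 0.7, 0.95]
    loss, t = best_threshold(scores, labels, revenue, 10.0, 0.5, grid)
    print("threshold:", t, "loss:", loss, "recall:", recall_at(scores, labels, t))
```

Sweeping a fixed grid keeps the sketch transparent. With real validation data the grid would normally be finer, and the loss at each candidate threshold can be inspected alongside recall to see the trade-off between retention cost and missed churners.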
目次 Table of Contents
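Thesis Approval Form+i
Acknowledgments+ii
Chinese Abstract+iii
Abstract+iv
Table of Contents+v
List of Tables+vii
List of Figures+viii
Chapter 1 Introduction+1
1.1 Research Background+1
1.2 Research Motivation and Objectives+2
1.3 Thesis Organization+3
Chapter 2 Literature Review+4
2.1 Definition of Enterprise Customers+4
2.2 Definition of Customer Churn+7
2.3 Data Mining Techniques+8
2.3.1 Logistic Regression+8
2.3.2 Boosting+9
2.4 Related Work on Churn Prediction in the Telecom Industry+10
Chapter 3 Research Method for Customer Churn Prediction+13
3.1 Data Sources and Definitions+13
3.2 Data Preprocessing+14
3.2.1 Data Extraction+14
3.2.2 Variable Selection+16
3.3 Logistic Regression Analysis+19
3.3.1 P-Value+19
3.3.2 Importance of Features+25
3.3.3 Pseudo R-squared (Pseudo R2)+27
3.4 Xgboost Analysis (eXtreme Gradient Boosting)+27
Chapter 4 Experimental Results+30
4.1 Model Construction and Evaluation Methods+30
4.2 Prediction Results+31
4.3 Revenue Loss Minimization+35
Chapter 5 Conclusions and Suggestions for Future Research+39
5.1 Conclusions and Contributions+39
5.2 Suggestions for Future Research+40
References+41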
參考文獻 References
[1] 我國電信業者營運實績統計數據,國家通訊傳播委員會,2016
[2] 2016中小企業白皮書,經濟部中小企業處,2016
[3] Jill Avery and Amy Gallo, The Value of Keeping the Right Customer, Harvard Business Review, 2014
[4] Paul D. Allison, Measures of Fit for Logistic Regression, Statistical Horizons LLC and the University of Pennsylvania, 2014
[5] Rob Mattison, The Telco Churn Management Handbook, Ch3-4, 2005
[6] Gareth James, Daniela Witten, Trevor Hastie and Rob Tibshirani, Data for An Introduction to Statistical Learning with Applications in R, R package version 1.0, 2015
[7] Tianqi Chen and Tong He, Xgboost: extreme gradient boosting, R package version 0.4-2, 2015
[8] Rich Caruana and Alexandru Niculescu-Mizil, An Empirical Comparison of Supervised Learning Algorithms, International Conference on Machine Learning, 2006
[9] Mohammed Hassouna, Ali Tarhini, Tariq Elyas and Mohammad Saeed AbouTrab, Customer Churn in Mobile Markets: A Comparison of Techniques, International Business Research, Vol. 8, No. 6, 2015
[10] A. Keramati, R. Jafari-Marandi, M. Aliannejadi, I. Ahmadian, M. Mozaffari, and U. Abbasi, Improved churn prediction in telecommunication industry using data mining techniques, Applied Soft Computing, Vol. 24, 994–1012, 2014
[11] Tom Au, Shaomin Li and Guangqin Ma, Applying and evaluating models to predict customer attrition using data mining techniques, Journal of Comparative International Management, Vol. 6, No. 1, 10-22, 2003
[12] Chih-Ping Wei and I-Tang Chiu, Turning telecommunications call details to churn prediction: a data mining approach, Expert Systems with Applications, Vol. 23, No. 2, 103–112, 2002.
[13] Shin-Yuan Hung, David C. Yen and Hsiu-Yu Wang, Applying data mining to telecom churn management, Expert Systems with Applications, Vol. 31, No. 3, 515-524, 2006
[14] Ron Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, IJCAI, Vol. 14, No. 2, 1995
[15] Yang Liu, Nitesh V. Chawla, Mary P. Harper, Elizabeth Shriberg and Andreas Stolcke, A study in machine learning from imbalanced data for sentence boundary detection in speech, Computer Speech & Language, Vol. 20, No. 4, 468-494, 2006
[16] Tom Fawcett, An introduction to ROC analysis, Pattern recognition letters, Vol. 27, No. 8, 861-874, 2006
[17] David M W Powers, Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation, Technical Report SIE-07-001, 2007
[18] MA Mahmood and EJ Szewczak, Measuring information technology investment payoff: contemporary approaches, IGI Global, 1999.
[19] Tianqi Chen and Carlos Guestrin, Xgboost: A scalable tree boosting system, Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, 2016.
[20] J. Burez and D. Van den Poel, Handling class imbalance in customer churn prediction, Expert Systems with Applications, Vol. 36, 4626-4636, 2009
電子全文 Fulltext
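This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.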
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
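Public information on printed copies is relatively complete from academic year 102 (ROC calendar) onward. For public information on printed copies from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.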
開放時間 available 已公開 available
