Title page for etd-0723115-144204
Title
從財金新聞預測公司財報之營收走勢
Predicting Company Revenue Trend Using Financial News
Department
Year, semester
Language
Degree
Number of pages
76
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2015-07-15
Date of Submission
2015-08-23
Keywords
sentiment analysis, revenue prediction, support vector machine, financial news analysis, text mining, financial sentiment dictionary
Statistics
The thesis/dissertation has been browsed 6007 times and downloaded 549 times.
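Abstract (translated from Chinese)
Text mining has a wide range of applications, and this study focuses on the financial domain. Earlier research in finance relied on econometric models or financial indicators, whereas many recent studies draw on financial news data.
These studies fall into two main categories: predicting stock market trends or stock prices, and detecting corporate fraud or bankruptcy. Among the items in financial statements, investors pay the most attention to revenue, and revenue information also reveals a company's cash flow and market share, so revenue is a very important indicator.
However, past research has rarely predicted company revenue. In this study, we automatically revise and expand an existing financial sentiment dictionary and manually build other related dictionaries for sentiment analysis. We also propose a framework that expresses a company's overall situation, and we use a support vector machine to predict the revenue trend. Experimental results show that using one quarter's news articles to predict that quarter's revenue trend achieves an accuracy of nearly 80%.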
Abstract
Text mining has many diverse applications, and our study focuses on the financial domain. Previous studies predicting financial indicators are mostly based on econometric models. In recent years, with the advance of text mining techniques, more and more studies have used financial news as the data source for analysis. These studies focus on two areas: forecasting the trend of the stock market or stock prices, and detecting company fraud or bankruptcy. We observe that the financial indicator many investors care most about is company revenue. Revenue information reflects a company's cash flow and market share, and revenue is therefore a very important indicator.
However, to the best of our knowledge, very little research has focused on predicting company revenue. In our study, we identify a few features that potentially impact company revenue. We further propose an approach to determine feature values, which involves the automatic extension of an existing financial sentiment dictionary and the aggregation of sentiment values. Experimental results show that we are able to predict the revenue trend from the news articles of the last quarter with an accuracy of up to 80%.
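The aggregation of sentiment values mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's method: the toy English lexicon, the whitespace tokenizer, the function names, and the three-value feature row are all assumptions made for illustration (the thesis works with Chinese financial news and its own dictionaries). The sketch only shows how per-article lexicon hits might be rolled up into one feature row per company and quarter, which could then be passed to a classifier such as a support vector machine.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> polarity (+1 positive, -1 negative).
# A real financial sentiment dictionary would be far larger.
LEXICON = {"growth": 1, "profit": 1, "record": 1,
           "loss": -1, "decline": -1, "lawsuit": -1}


def article_sentiment(text, lexicon=LEXICON):
    """Return (positive_count, negative_count) for one article.

    Assumes whitespace tokenization, which is only adequate for
    English-like text; Chinese text would need a word segmenter.
    """
    pos = neg = 0
    for token in text.lower().split():
        score = lexicon.get(token.strip(".,;:!?"), 0)
        if score > 0:
            pos += 1
        elif score < 0:
            neg += 1
    return pos, neg


def quarterly_features(articles, lexicon=LEXICON):
    """Aggregate article sentiment into one row per (company, quarter).

    `articles` is an iterable of (company, quarter, text) tuples.
    Each row is [positive_count, negative_count, net_ratio], where
    net_ratio = (pos - neg) / (pos + neg), or 0.0 when no lexicon
    word occurs in that company's articles for the quarter.
    """
    totals = defaultdict(lambda: [0, 0])
    for company, quarter, text in articles:
        pos, neg = article_sentiment(text, lexicon)
        totals[(company, quarter)][0] += pos
        totals[(company, quarter)][1] += neg

    features = {}
    for key, (pos, neg) in totals.items():
        hits = pos + neg
        features[key] = [pos, neg, (pos - neg) / hits if hits else 0.0]
    return features
```

Each resulting row, paired with a label for whether revenue rose or fell in that quarter, would form one training example for the prediction model.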
Table of Contents
Thesis Approval Form i
Abstract (Chinese) ii
Abstract iii
CHAPTER 1 – Introduction 1
1.1 Background and Motivation 1
1.2 Research Purpose 2
1.3 Thesis Organization 3
CHAPTER 2 – Literature Review 4
2.1 Financial Aspect 4
2.1.1 The Importance of Revenue 4
2.1.2 News Impact Financial Markets 6
2.2 Technical Aspect 7
2.2.1 Text Mining Approach 7
2.2.2 Text Mining Using Financial News 8
CHAPTER 3 – The Research Process 10
3.1 Research Skeleton 10
3.2 News/Event Schema Construction 11
3.3 Data Crawling 13
3.4 Lexicon Construction 14
3.5 Sentiment Analysis 19
3.6 Prediction Model Construction 20
CHAPTER 4 – Sentiment Lexicon Construction and Sentiment Analysis 22
4.1 Initial Lexicon Construction 22
4.2 Revising Sentiment Dictionary 23
4.3 Expanding Sentiment Dictionary 31
4.4 The method of determining entity, aspect, and sentiment 33
CHAPTER 5 – Evaluation 39
5.1 Dataset Construction 39
5.1.1 News Database Construction 40
5.1.2 Event Database Construction 41
5.1.3 Experimental Data Construction 42
5.2 Experiment Design 43
5.3 Experiment Result 44
5.4 Discussions 49
CHAPTER 6 – Conclusion 53
6.1 Implication 53
6.2 Limitation 54
6.3 Future Work 54
References 55
Appendix A – Word List with Accumulated Word Frequency in Sinica Corpus 3.0 58
Appendix B – Sentiment Dictionary 60
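Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: user-defined availability date
Available:
Campus: available
Off-campus: available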


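Printed copies
Public availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available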
