博碩士論文 etd-0727111-144058 詳細資訊
Title page for etd-0727111-144058
論文名稱
Title
一種根植於消費者資訊搜尋流程模式之旅遊文章分類的方法
On Travel Article Classification Based on Consumer Information Search Process Model
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
69
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2011-07-08
繳交日期
Date of Submission
2011-07-27
關鍵字
Keywords
本體論、命名實體辨識、文件分類、旅遊文章分類、資訊搜尋模型
travel article classification, text categorization, ontology, named entity recognition, information search model
統計
Statistics
本論文已被瀏覽 5978 次,被下載 2033 次
The thesis/dissertation has been browsed 5978 times, has been downloaded 2033 times.
中文摘要
資訊科技的蓬勃發展導致資訊量的激增,資訊超載的問題也越來越被受到重視,使用者需要一些代理媒介來過濾資訊以期符合使用者的資訊需求。在本篇論文裡,我們研究如何將旅遊領域的文章分類,以便找出符合使用者資訊需求的文章;我們提出了一個在旅遊領域裡資訊需求導向的資訊搜尋模型,包含了:初始、景點、住宿和路徑規劃這四個資訊搜尋目標,這些目標可以藉由13個特徵去描述。當中某些特徵可以藉由WordNet和Named Entity Recognition這些技術來加強或擴充。為了測試這13個特徵和提出的相關方法在文章分類的效果,我們從網路上最大的旅遊網站─TripAdvisor.com─收集了15,797篇文章,並從其中隨機的選出了600篇的文章來當作我們的訓練資料,實驗結果顯示我們的方法普遍可以和單純使用詞彙當作分類特徵的TFIDF方法匹敵甚至是超越它。
Abstract
As the volume of information explodes, information overload has become a pressing problem, and people need agents that help them filter information to meet their personal needs. In this work, we study article classification in the tourism domain in order to identify articles that meet users' information needs. We propose an information-need-oriented search model for tourism that consists of four goals: Initiation, Attraction, Accommodation, and Route planning. These goals can be characterized by 13 features, some of which can be enhanced or extended using WordNet and Named Entity Recognition (NER) techniques. To test the effectiveness of the 13 features and the associated methods for classification, we collected 15,797 articles from TripAdvisor.com, the world's largest travel site, and randomly selected 600 of them as training data, labeled by two labelers. The experimental results show that our approach generally performs comparably to or better than classification with purely lexical TF-IDF features, while using fewer features.
目次 Table of Contents
摘要 i
Abstract ii
TABLE OF CONTENTS iii
LIST OF FIGURES v
LIST OF TABLES vi
CHAPTER 1 - Introduction 1
1.1. Background 1
1.2. Motivation 2
1.3. Thesis Organization 3
CHAPTER 2 - Literature Review 5
2.1. Text Categorization 5
2.2. Travel Recommender Systems 7
2.3. Consumer Information Search Behaviors 9
2.3.1. Tourist Information Search Behavior 12
2.4. Social Media 15
2.4.1. Online Travel Reviews 16
CHAPTER 3 - The Model 19
3.1. Initiation 23
3.2. Attraction 24
3.3. Accommodation 25
3.4. Route planning 26
CHAPTER 4 - Method 28
4.1. Lexicon-based method 29
4.2. WordNet-enhanced method 32
4.3. NER-enhanced method 33
4.4. WordNet-NER-enhanced method 35
CHAPTER 5 - Evaluation 39
5.1. Datasets 39
5.2. Hypotheses Verification 41
5.3. Experiment Design for Classification 43
5.4. Performance Results 46
CHAPTER 6 - Conclusion 51
6.1. Limitation and future work 51
References 53
Appendix A 59
Appendix B 60
電子全文 Fulltext
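This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.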
論文使用權限 Thesis access permission: withheld for one year, then released both on and off campus
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
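Public-availability information for printed copies is relatively complete from ROC academic year 102 onward. To look up this information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.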
開放時間 Available: 已公開 available
