博碩士論文 etd-0812103-164119 詳細資訊
Title page for etd-0812103-164119
論文名稱
Title
群集式協同過濾推薦方法之研究
Cluster-based Collaborative Filtering Recommendation Approach
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
66
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2003-07-22
繳交日期
Date of Submission
2003-08-12
關鍵字
Keywords
分群、推薦系統、群集式協同過濾推薦方法、協同過濾推薦方法、偏好預測
Recommendation Systems, Clustering, Cluster-based Collaborative Filtering Recommendation, Preference Prediction, Collaborative Filtering Recommendation
統計
Statistics
本論文已被瀏覽 5723 次,被下載 3279
This thesis has been viewed 5,723 times and downloaded 3,279 times.
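中文摘要 Chinese Abstract (English translation)
Recommendation is already a common social behavior in people's daily lives. With the rapid development of technology, information overload poses a great challenge to individuals in using and searching for information, which further stimulates the demand for recommended information. Many recommendation techniques have therefore been proposed and recommendation systems have emerged in response, broadening the scope of recommendation and enriching its forms. Meanwhile, the strong emphasis on personalization and customer orientation in the recent development of e-commerce has gradually made recommendation systems a necessary online service. Among the many techniques proposed, the collaborative filtering recommendation approach is the most successful and the most commonly adopted. However, the traditional collaborative filtering approach ignores the content proximities of recommended items: when searching for users whose tastes resemble the target user's, it treats the user's preference for every item as equally important and does not distinguish how close the items are in content. We therefore propose a cluster-based collaborative filtering recommendation approach that reflects the content relatedness among items in the process of finding like-minded users. According to the empirical evaluation, improving traditional collaborative filtering through clustering does increase the accuracy of preference prediction without sacrificing prediction coverage. In addition, because of the data sparsity problem, when there are too few available neighbors, the cluster average method achieves better prediction accuracy than the proposed approach. We therefore take the number of available neighbors into account and finally propose an enhanced cluster-based collaborative filtering recommendation approach. The empirical results show that the enhanced approach yields better recommendation effectiveness.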
Abstract
Recommendation is not a new phenomenon of the digital era but a long-standing social behavior in everyday life. Recommendation systems facilitate this natural social behavior and alleviate the information overload that individuals face. Among the recommendation techniques proposed in the literature, collaborative filtering is the most successful and most widely adopted to date. However, the traditional collaborative filtering approach ignores proximities between items: all user ratings on items are deemed equally important and given equal weight in the neighborhood formation process. In this study, we propose a cluster-based collaborative filtering recommendation approach that takes the content similarities of items into account in the collaborative filtering process. Our empirical evaluation, which uses the traditional collaborative filtering approach as the performance benchmark, shows that the cluster-based approach improves prediction accuracy without sacrificing prediction coverage. Because of the sparsity problem, when a prediction is based on only a few neighbors, the cluster average method can achieve better prediction accuracy than the proposed approach. We therefore further propose an enhanced cluster-based collaborative filtering approach that combines our approach with the cluster average method. The empirical results suggest that the enhanced approach achieves prediction accuracy comparable to or even better than that of the cluster average method.
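The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's actual formulation, which this record does not reproduce. The sketch assumes that user similarity is a Pearson correlation restricted to co-rated items in the target item's cluster, that the cluster average method is the target user's mean rating over items in that cluster, and that the enhanced approach falls back to the cluster average when fewer than `min_neighbors` neighbors are available. All function names, the `min_neighbors` threshold, and the sample data are hypothetical.

```python
from math import sqrt


def pearson(ratings_a, ratings_b, items):
    """Pearson correlation of two users over the co-rated items in `items`."""
    common = [i for i in items if i in ratings_a and i in ratings_b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((ratings_b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def cluster_average(user_ratings, cluster_items):
    """Assumed cluster average: the user's mean rating on items in the cluster."""
    rated = [user_ratings[i] for i in cluster_items if i in user_ratings]
    return sum(rated) / len(rated) if rated else None


def predict(ratings, item_cluster, user, item, min_neighbors=2):
    """Predict `user`'s rating of `item`; fall back when neighbors are scarce."""
    # Items that share a cluster with the target item (assumed given by a
    # prior content-based item-clustering step).
    cluster_items = [i for i, c in item_cluster.items() if c == item_cluster[item]]
    target = ratings[user]
    target_mean = sum(target.values()) / len(target)

    neighbors = []  # (similarity, neighbor's mean-centered rating of the item)
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(target, other_ratings, cluster_items)
        if sim > 0:  # keep only positively correlated users
            other_mean = sum(other_ratings.values()) / len(other_ratings)
            neighbors.append((sim, other_ratings[item] - other_mean))

    # Assumed "enhanced" rule: too few neighbors, so use the cluster average.
    if len(neighbors) < min_neighbors:
        return cluster_average(target, cluster_items)

    total_sim = sum(sim for sim, _ in neighbors)
    return target_mean + sum(sim * dev for sim, dev in neighbors) / total_sim


# Hypothetical toy data: three users, four items, two item clusters.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 3, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
}
item_cluster = {"a": 0, "b": 0, "d": 0, "c": 1}

print(predict(ratings, item_cluster, "u1", "d", min_neighbors=1))
print(predict(ratings, item_cluster, "u1", "d", min_neighbors=2))
```

With `min_neighbors=1` the single positively correlated neighbor drives the prediction, while `min_neighbors=2` triggers the fallback. This mirrors the abstract's observation that the cluster average can be the safer choice when data are sparse.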
目次 Table of Contents
CHAPTER 1 INTRODUCTION 1
1.1 RESEARCH BACKGROUND 1
1.2 RESEARCH MOTIVATION AND OBJECTIVE 4
1.3 ORGANIZATION OF THE THESIS 5
CHAPTER 2 LITERATURE REVIEW 7
2.1 COLLABORATIVE FILTERING RECOMMENDATION APPROACH 7
2.1.1 Framework of Collaborative Filtering Recommendation Approach 8
2.1.2 Neighborhood Formation 9
2.1.3 Recommendation Generation 12
2.1.4 Strengths and Limitations of Collaborative Filtering Approach 13
2.2 DATA AND DOCUMENT CLUSTERING 14
2.2.1 Data Clustering 14
2.2.2 Document Clustering 17
CHAPTER 3 CLUSTER-BASED COLLABORATIVE FILTERING RECOMMENDATION 20
3.1 UNDERLYING RECOMMENDATION FRAMEWORK AND OVERALL PROCESS 20
3.2 ITEM-CLUSTERING 23
3.3 NEIGHBORHOOD FORMATION 25
3.3.1 Computation of User Similarity 25
3.3.2 Neighborhood Selection 27
3.4 RECOMMENDATION GENERATION 28
CHAPTER 4 EMPIRICAL EVALUATION 29
4.1 EVALUATION DATASETS 29
4.1.1 Movie Dataset 29
4.1.2 Literature Dataset 32
4.2 EVALUATION PROCEDURE AND METRICS 34
4.3 EVALUATION RESULTS 35
4.3.1 Cluster Formation of the Cluster-based Collaborative Filtering Approach 36
4.3.2 Effect of Neighbor Size 38
4.3.3 Comparative Evaluation with Collaborative Filtering Approach 43
4.3.4 Effect of Significance Weighting 45
4.3.5 Effect of Sparsity Level 50
4.3.6 Comparative Evaluation with Cluster Average Approach 53
CHAPTER 5 CONCLUSIONS AND FUTURE RESEARCH DIRECTIONS 58
REFERENCES 61
參考文獻 References
[A73] Anderberg, M. R., Cluster Analysis for Applications, Academic Press, Inc., 1973.

[B92] Brill, E., “A Simple Rule-Based Part of Speech Tagger,” Proceedings of the Third Conference on Applied Natural Language Processing, ACL, Trento, Italy, 1992, pp.152-156.

[B94] Brill, E., “Some Advances in Rule-Based Part of Speech Tagging,” Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, WA, 1994, pp.722-727.

[BHC98] Basu, C., Hirsh, H. and Cohen, W., “Recommendation as Classification: Using Social and Content-Based Information in Recommendation,” Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), July 1998, pp.714-720.

[BHK98] Breese, J. S., Heckerman, D. and Kadie, C., “Empirical Analysis of Predictive Algorithms for Collaborative Filtering,” Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98), 1998, pp.43-52.

[BS97] Balabanovic, M. and Shoham, Y., “Fab: Content-based, Collaborative Recommendation,” Communications of the ACM, Vol. 40, No. 3, March 1997, pp.66-72.

[BS97a] Berson, A. and Smith, S. J., Data Warehousing, Data Mining & OLAP, McGraw-Hill, Inc., 1997.

[BGH99] Boley, D., Gini, M., Gross, R., Han, E., Hastings, K., Karypis, G., Kumar, V., Mobasher, B. and Moore, L., “Partitioning-based Clustering for Web Document Categorization,” Decision Support Systems, Vol. 27, No. 3, December 1999, pp.329-341.

[CGM99] Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D. and Sartin, M., “Combining Content-based and Collaborative Filters in an Online Newspaper,” Proceedings of the ACM SIGIR’99 Workshop on Recommender Systems: Algorithms and Evaluation, University of California, Berkeley, August 1999.

[DPH98] Dumais, S., Platt, J., Heckerman, D. and Sahami, M., “Inductive Learning Algorithms and Representations for Text Categorization,” Proceedings of the 1998 ACM 7th International Conference on Information and Knowledge Management (CIKM’98), 1998, pp.148-155.

[GNO92] Goldberg, D., Nichols, D., Oki, B. M. and Terry, D., “Using Collaborative Filtering to Weave an Information Tapestry,” Communications of the ACM, Vol. 35, No. 12, December 1992, pp.61-70.

[GSK99] Good, N., Schafer, J., Konstan, J., Borchers, A., Sarwar, B., Herlocker, J. and Riedl, J., “Combining Collaborative Filtering with Personal Agents for Better Recommendations,” Proceedings of the 1999 Conference of the American Association of Artificial Intelligence (AAAI-99), 1999, pp.439-446.

[HCO02] Huang, Z., Chung, W., Ong, T. and Chen, H., “A Graph-Based Recommender System for Digital Library,” Proceedings of the second ACM/IEEE-CS joint conference on Digital libraries, Portland, OR, 2002.

[HK01] Herlocker, J. L. and Konstan, J. A., “Content-Independent Task-Focused Recommendation,” IEEE Internet Computing, 2001, pp.40-47.

[HKB99] Herlocker, J., Konstan, J., Borchers, A. and Riedl, J., “An Algorithmic Framework for Performing Collaborative Filtering,” Proceedings of 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1999, pp.230-237.

[K89] Kohonen, T., Self-Organization and Associative Memory, Springer, 1989.

[K95] Kohonen, T., Self-Organizing Maps, Springer, 1995.

[KB96] Krulwich, B. and Burkey, C., “Learning User Information Interests through Extraction of Semantically Significant Phrases,” Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, Stanford, CA, 1996, pp.110-113.

[KLS01] Kim, J. W., Lee, B. H., Shaw, M. J., Chang, H. L. and Nelson, M., “Application of Decision-Tree Induction Techniques to Personalized Advertisements on Internet Storefronts,” International Journal of Electronic Commerce, Vol. 5, No. 3, 2001, pp.45-62.

[KMM97] Konstan, J. A., Miller, B. N., Maltz, D., Herlocker, J. L., Gordon, L. R. and Riedl, J., “GroupLens: Applying Collaborative Filtering to Usenet News,” Communications of the ACM, Vol. 40, No. 3, 1997, pp.77-87.

[KR90] Kaufman, L. and Rousseeuw, P. J., Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley & Sons, Inc., New York, NY, 1990.

[L95] Lang, K., “NewsWeeder: Learning to Filter Netnews,” Proceedings of the 12th International Conference on Machine Learning, San Francisco, CA, 1995, pp.331-339.

[L01] Lynch, C., “Personalization and Recommender Systems in the Larger Context: New Directions and Research Questions,” Proceedings of the 2nd DELOS Network of Excellence Workshop on Personalization and Recommender Systems in Digital Libraries, Dublin, Ireland, 2001.

[LA99] Larsen, B. and Aone, C., “Fast and Effective Text Mining Using Linear-time Document Clustering,” Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, August 1999, pp.16-22.

[LH98] Lam, W. and Ho, C. Y., “Using a Generalized Instance Set for Automatic Text Categorization,” Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia, August 1998, pp.81-89.

[LR94] Lewis, D. and Ringuette, M., “A Comparison of Two Algorithms for Text Categorization,” Proceedings of Symposium on Document Analysis and Information Retrieval, 1994.

[MR00] Mooney, R. J. and Roy, L., “Content-based Book Recommending Using Learning for Text Categorization,” Proceedings of the Fifth ACM Conference on Digital Libraries, 2000, pp.195-204.

[NGL97] Ng, H. T., Goh, W. B. and Low, K. L., “Feature Selection, Perceptron Learning, and A Usability Case Study for Text Categorization,” Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’97), Philadelphia, PA, July 1997, pp.67-73.

[NH94] Ng, R. and Han, J., “Efficient and Effective Clustering Methods for Spatial Data Mining,” Proceedings of International Conference on Very Large Data Bases, Santiago, Chile, September 1994, pp.144-155.

[OH00] O'Connor, M. and Herlocker, J., “Clustering Items for Collaborative Filtering,” Technical Report, University of Minnesota, Department of Computer Science, Minneapolis, MN, 2000.

[PMB96] Pazzani, M., Muramatsu, J. and Billsus, D., “Syskill & Webert: Identifying Interesting Web Sites,” Proceedings of the 13th National Conference on Artificial Intelligence, Portland, OR, 1996, pp.54-61.

[RC99] Roussinov, D. and Chen, H., “Document Clustering for Electronic Meetings: An Experimental Comparison of Two Techniques,” Decision Support Systems, Vol. 27, No. 1-2, 1999, pp.67-79.

[RIS94] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P. and Riedl, J., “GroupLens: An Open Architecture for Collaborative Filtering of Netnews,” Proceedings of the Conference on Computer Supported Cooperative Work (CSCW), Chapel Hill, NC, 1994, pp.175-186.

[RP97] Rucker, J. and Polanco, M. J., “Siteseer: Personalized Navigation for the Web,” Communications of the ACM, Vol. 40, No. 3, 1997, pp.73-76.

[RV97] Resnick, P. and Varian, H. R., “Recommender Systems,” Communications of the ACM, Vol. 40, No. 3, March 1997, pp.56-58.

[SKB98] Sarwar, B. M., Konstan, J. A., Borchers, A., Herlocker, J., Miller, B. and Riedl, J., “Using Filtering Agents to Improve Prediction Quality in the GroupLens Research Collaborative Filtering System,” Computer Supported Cooperative Work, 1998.

[SKK00] Sarwar, B. M., Karypis, G., Konstan, J. A. and Riedl, J., “Analysis of Recommendation Algorithms for E-commerce,” Proceedings of the 2nd ACM Conference on Electronic Commerce, Minneapolis, MN, 2000, pp.158-167.

[SKK01] Sarwar, B. M., Karypis, G., Konstan, J. A. and Riedl, J., “Item-Based Collaborative Filtering Recommendation Algorithms,” WWW10, May 1-5, 2001, Hong Kong.

[SKR99] Schafer, J. B., Konstan, J. and Riedl, J., “Recommender Systems in E-commerce,” ACM Conference on Electronic Commerce, 1999.

[SKR01] Schafer, J. B., Konstan, J. and Riedl, J., “E-Commerce Recommendation Applications,” Data Mining and Knowledge Discovery, Vol. 5, No. 1, 2001, pp.115-153.

[SM95] Shardanand, U. and Maes, P., “Social Information Filtering: Algorithms for Automating Word of Mouth,” Proceedings of ACM CHI’95 Conference on Human Factors in Computing Systems, 1995, pp.210-217.

[THA97] Terveen, L., Hill, W., Amento, B., McDonald, D. and Creter, J., “PHOAKS: A System for Sharing Recommendations,” Communications of the ACM, Vol. 40, No. 3, 1997, pp.59-62.

[TSJ99] Tatemura, J., Santini, S. and Jain, R., “Social and Content-based Information Filtering for a Web Graphics Recommender System,” Proceedings of the 10th International Conference on Image Analysis and Processing, Venice, Italy, September 1999.

[V86] Voorhees, E. M., “Implementing Agglomerative Hierarchical Clustering Algorithms for Use in Document Retrieval,” Information Processing and Management, Vol. 22, 1986, pp.465-476.

[V93] Voutilainen, A., “NPtool: A Detector of English Noun Phrases,” Proceedings of Workshop on Very Large Corpora, Ohio, June 1993, pp.48-58.

[WPS03] Wei, C., Piramuthu, S. and Shaw, M. J., “Knowledge Discovery and Data Mining,” Chapter 41 in Handbook of Knowledge Management, Vol. 2, C. W. Holsapple (Ed.), Springer-Verlag, Berlin, Germany, 2003, pp.157-189.

[WSE02] Wei, C., Shaw, M. J. and Easley, R. F., “A Survey of Recommendation Systems in Electronic Commerce,” E-Service: New Directions in Theory and Practice, R. T. Rust and P. K. Kannan (Eds), M. E. Sharpe Publisher, 2002.
電子全文 Fulltext
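This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.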
論文使用權限 Thesis access permission: available on campus immediately; available off campus after one year
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
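Public availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.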
開放時間 available 已公開 available
