National Sun Yat-sen University, thesis/dissertation, A Classification Model with Data Analysis for Improving Collaborative Filtering: An Experimental Study on Epinions.com

Title	A Classification Model with Data Analysis for Improving Collaborative Filtering: An Experimental Study on Epinions.com
Department	Department of Information Management
Year, semester	Spring semester of Academic Year 103	Language	Chinese
Degree	Master	Number of pages	64
Author	Sheng-Jhe Ke (柯勝哲)
Advisor	Wei-Po Lee (李偉柏)
Convenor	Bing-Chiang Jeng (鄭炳強)
Advisory Committee	Yuh-Jiuan Tsay (蔡玉娟)
Date of Exam	2015-06-26	Date of Submission	2015-08-26
Keywords	recommender system, user similarity, collaborative filtering, trust metric, decision tree, classification, trust propagation
Statistics	The thesis/dissertation has been browsed 5828 times and downloaded 113 times.

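Chinese Abstract
Collaborative filtering (CF) is one approach to implementing a recommender system; it predicts how much a user will like a product. With the rise of social networks, users can build social relationships through the friendship mechanisms that websites provide; Epinions.com, for example, lets users establish trust relationships. In recent years, a number of studies have replaced the traditional similarity computation with user-specified trust relationships, in an attempt to address two inherent problems of CF, cold start and data sparsity, and thereby improve CF recommendation performance. Later work argues that similarity and trust are to some degree inconsistent, and tends to merge the two weights by linear combination. Such approaches find the best weighting parameter through repeated experiments, yet the improvement in mean absolute error (MAE) is limited. The problem may be that the same weighting parameter is applied regardless of the target user or item. This study proposes a classification model that combines similarity-based and trust-based collaborative filtering. Taking a data-analysis perspective, it attempts to reflect the properties of the data: it builds a classification model for the dataset, uses a simple binary split to find classification rules, and predicts for each record which prediction method is best, so as to obtain better recommendation performance. The process of building the model also makes it possible to observe the relationship between similarity and trust. The goal of the research is to predict users' ratings of products more accurately. According to the experimental results, the proposed method effectively reduces the mean absolute error; a further advantage is that the classification rules for optimizing collaborative filtering on a given dataset can be obtained without elaborate experiments.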
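The fixed-weight linear combination that the abstract contrasts itself with, and the mean absolute error used to score it, can be sketched roughly as follows. This is a minimal illustration rather than the thesis's implementation: the function names, the parameter `alpha`, and the tuple layout of `neighbors` are assumptions made for the example.

```python
def combined_weight(sim, trust, alpha):
    """Blend a similarity weight and a trust weight with one global alpha (assumed name)."""
    return alpha * sim + (1.0 - alpha) * trust


def predict_rating(neighbors, alpha):
    """Weighted average of neighbor ratings.

    neighbors: list of (similarity, trust, rating) tuples (hypothetical layout).
    Returns None when no neighbor carries any weight.
    """
    numerator = 0.0
    denominator = 0.0
    for sim, trust, rating in neighbors:
        weight = combined_weight(sim, trust, alpha)
        numerator += weight * rating
        denominator += abs(weight)
    return numerator / denominator if denominator else None


def mean_absolute_error(predicted, actual):
    """MAE over the records for which a prediction exists."""
    pairs = [(p, a) for p, a in zip(predicted, actual) if p is not None]
    if not pairs:
        raise ValueError("no predictions to evaluate")
    return sum(abs(p - a) for p, a in pairs) / len(pairs)
```

Note that `alpha` is the same for every user and item here, which is exactly the limitation the abstract points to: tuning it by repeated experiments yields one global compromise rather than a per-record choice.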
Abstract
Collaborative filtering is one of the most popular methods for implementing recommender systems. It predicts users’ ratings in order to generate personalized product recommendations based on the similarity of user preferences. However, this type of method has difficulty handling cold-start users and data sparsity. To overcome these difficulties, researchers have suggested including more context information when building recommender systems. Several studies have indicated that the relationships among friends, and friends of friends, within a social network are crucial when looking for trustworthy and reliable information. Therefore, in addition to user preference, a growing body of research focuses on the concept of trust and attempts to alleviate these problems by taking trust relationships into account in recommendation. In this thesis, we present a data analytics approach that combines user preference and social trust to make better collaborative recommendations. Here, user preference means the co-rating-based similarity between users, and social trust means the trust relationships derived from the users (both those specified directly by the users and those inferred indirectly by calculating trust transitivity). The proposed approach treats collaborative recommendation as a classification task and improves recommendation performance in two phases. In the first phase, a data analysis procedure explores the target dataset in terms of user similarity and trust relationships. Based on the results of this analysis, a neighborhood method is used to evaluate the effects of similarity-based and trust-based neighbors on the available user-item rating records. In the second phase, a classification model is trained and tested. Features extracted in the data analysis procedure are used to build a model that can recognize whether the similarity-based neighbors or the trust-based neighbors are more suitable (i.e., produce a more precise rating prediction) for a particular user-item record. A series of experiments is conducted to evaluate performance. The results show that the proposed approach performs well under more objective experimental conditions. They also show that the approach can enhance recommendation performance adaptively for different datasets without an iterative parameter-tuning procedure.
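The two-phase idea in the abstract (label each user-item record with whichever neighborhood predicted it more precisely, then learn a rule that picks the method for new records) can be sketched as below. This is an illustrative sketch only: the thesis uses a decision tree, whereas this example substitutes a single-split decision stump written in plain Python, and the function names and the two features (number of co-rating neighbors, number of trusted neighbors) are hypothetical.

```python
def label_record(actual, sim_pred, trust_pred):
    """Phase 1: label a record with the method whose prediction was closer."""
    if abs(sim_pred - actual) <= abs(trust_pred - actual):
        return "similarity"
    return "trust"


def fit_stump(rows, labels):
    """Phase 2: find the single split with the fewest training errors.

    rows: list of feature tuples; labels: list of class labels.
    Returns (feature index, threshold, label if <= threshold, label if > threshold).
    """
    classes = sorted(set(labels))
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            for low in classes:
                for high in classes:
                    errors = sum(
                        (low if row[feature] <= threshold else high) != label
                        for row, label in zip(rows, labels)
                    )
                    if best is None or errors < best[0]:
                        best = (errors, feature, threshold, low, high)
    return best[1:]


def choose_method(rule, row):
    """Apply the learned rule to a new record's features."""
    feature, threshold, low, high = rule
    return low if row[feature] <= threshold else high


# Toy records (made-up numbers): (actual, similarity prediction, trust prediction, features)
# where features = (co-rating neighbor count, trusted neighbor count), both hypothetical.
records = [
    (4.0, 3.9, 3.0, (12, 1)),
    (5.0, 4.8, 4.0, (9, 2)),
    (2.0, 3.5, 2.2, (1, 6)),
    (3.0, 4.5, 3.1, (2, 8)),
]
rows = [features for _, _, _, features in records]
labels = [label_record(a, s, t) for a, s, t, _ in records]
rule = fit_stump(rows, labels)
```

On this toy data the learned rule sends records with few co-rating neighbors to the trust-based predictor and the rest to the similarity-based one; a real decision tree would stack several such splits over more features.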

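Table of Contents
Thesis Approval Form
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Process
Chapter 2 Literature Review
2.1 Similarity-Based Collaborative Filtering Algorithms
2.1.1 k-Nearest Neighbors
2.2 Trust-Based Collaborative Filtering Algorithms
2.2.1 The Concept of Trust in Social Networks
2.2.2 Trust Propagation
2.2.3 Trust-Based Collaborative Filtering Algorithms
2.2.4 Collaborative Filtering Algorithms Combining Trust and Similarity
2.3 Classification Algorithms
2.3.1 Decision Trees
Chapter 3 Research Method
3.1 System Architecture
3.2 Introduction to the Dataset
3.3 Rules for Selecting a Collaborative Filtering Method
3.4 Similarity-Based Collaborative Filtering
3.5 Trust-Based Collaborative Filtering
3.5.1 Computing Explicit Trust
3.5.2 Trust Propagation: Computing Implicit Trust
3.5.3 Trust-Based Collaborative Filtering
3.6 A Model for Predicting the Collaborative Filtering Method
3.6.1 Splitting the Dataset
3.6.2 Defining the Class Labels
3.6.3 Description of the Attributes
3.6.4 Attribute Selection and Model Construction
Chapter 4 Experiments and Results
4.1 Evaluation Method and Dataset Processing
4.1.1 Experimental Evaluation Method
4.1.2 Data Preprocessing
4.2 Implementation and Analysis of the Collaborative Filtering Methods
4.2.1 Analysis of Similarity-Based Collaborative Filtering Results
4.2.2 Analysis of Trust-Based Collaborative Filtering Results
4.2.3 Cold-Start Users
4.2.4 Summary of the Two Collaborative Filtering Implementations
4.3 Recommendation Combining Similarity and Trust
4.3.1 When Both the Similarity and Trust Methods Can Produce a Rating Prediction
4.3.2 Building Rules for Choosing the Collaborative Filtering Method by Observation
4.3.3 Results of Combining the Two Methods by Linear Combination
4.3.4 Recommendation Results of the Model
4.3.4.1 Both Similarity-Based and Trust-Based Collaborative Filtering Can Produce a Rating Prediction
4.3.4.2 Only One Collaborative Filtering Method Can Produce a Rating Prediction
4.3.4.3 Improvement of the Hybrid Approach on the Test Set
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References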


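Fulltext
This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law. Thesis access permission: user-defined availability. Available: Campus: available; Off-campus: available. etd-0726115-133439.pdf
Printed copies
Public availability information for printed theses is relatively complete from Academic Year 102 onward. To inquire about printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Availability: available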

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us: 2452 Thesis Format Review Team, Mail │ Development and operations: Knowledge Innovation Division, LIS