Thesis/dissertation etd-0902102-193316: detailed information
Title page for etd-0902102-193316
Title
文獻數位圖書館推薦技術之研究
Article Recommendation in Literature Digital Libraries
Department
Year, semester
Language
Degree
Number of pages
50
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2002-07-30
Date of Submission
2002-09-02
Keywords
literature digital library, data mining, recommendation
Statistics
This thesis/dissertation has been browsed 5856 times and downloaded 59 times.
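Chinese Abstract (English translation)
Literature digital libraries digitize important literature for permanent preservation and let researchers look up work in their fields. With the help of the Internet, these libraries place the digitized literature online so that researchers anywhere in the world can search it through a browser. However, a search for related literature often returns a very long list of articles, of which only a small part truly interests the researcher. To provide a more efficient search service, many literature digital libraries also offer a recommender system, which predicts the articles a user may be interested in from past usage statistics and the user's current query record. This study applies recommendation techniques currently used for web pages to article recommendation in literature digital libraries. For recommendation in a literature digital library, we propose a framework consisting of three sequential steps: web log processing, data mining, and article recommendation. For web log processing, we propose three ways of defining how transactions are extracted from the web log; for data mining, we apply MSapriori; for recommendation, we apply hypergraph-based and association-based recommendation techniques. Finally, we use the web log of the National Sun Yat-sen University thesis and dissertation system to evaluate the precision, recall, and running time of the recommendation methods.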
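As a rough illustration of the precision and recall measures named in this abstract, the sketch below scores one recommendation list against the articles a user actually went on to view. The function name `precision_recall` and the article identifiers are hypothetical, and the record does not state how the thesis defines the ground truth, so this shows only the standard set-based definitions.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one recommendation list.

    recommended: article ids proposed by the recommender (hypothetical ids).
    relevant:    article ids the user actually found interesting.
    """
    recommended = set(recommended)
    relevant = set(relevant)
    hits = recommended & relevant  # recommended articles that were relevant
    # Guard against empty sets so the ratios are always defined.
    precision = len(hits) / len(recommended) if recommended else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Four articles recommended, three actually relevant, two in common.
p, r = precision_recall(["a1", "a2", "a3", "a4"], ["a2", "a4", "a7"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision penalizes long recommendation lists that contain few useful articles, while recall penalizes short lists that miss useful ones, which is why the two are usually reported together.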
Abstract
Literature digital libraries are perhaps among the most important resources for research, because the literature they preserve is vital to researchers and practitioners who need to know what has been done previously in a particular area. The emergence of the World Wide Web (WWW) further boosts the reach of literature digital libraries: people interested in a particular topic can easily find related articles by searching a literature digital library that provides a WWW interface. However, a given search condition often yields a large number of articles, of which only a small subset actually interests the user. To provide more effective and efficient information search, many literature digital libraries are equipped with a recommendation subsystem that recommends articles to a user based on his or her past or current interests. In this thesis, we adapt existing approaches for web page recommendation to article recommendation in literature digital libraries. We investigate the issues involved and develop a recommendation framework that makes use of the web log of a literature digital library. The framework consists of three sequential steps: data preparation of the web log, association discovery, and article recommendation. We propose three alternatives for identifying transactions from a web log, adapt the MSApriori algorithm for discovering large itemsets, and discuss two approaches to making recommendations, namely hypergraph-based and association-based recommendation. These alternatives and approaches were evaluated using the web log of an operational electronic thesis system at NSYSU. We found that query-chosen and session-chosen are the better methods for transaction identification, and that the hypergraph-based approach yields better article recommendation quality and has a stable running time.
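The association discovery step can be pictured with a small sketch of mining under multiple minimum supports, the idea behind MSApriori as published by Liu, Hsu, and Ma. The abstract says the thesis adapts MSApriori but gives no details, so everything below is an assumption for illustration: the parameter names `beta` and `least_support`, the toy transactions, and above all the brute-force enumeration, which replaces the level-wise candidate generation of the real algorithm. Each item i gets a minimum item support MIS(i) = max(beta * f(i), least_support), where f(i) is the item's observed support, and an itemset counts as large when its support reaches the lowest MIS among its items.

```python
from itertools import combinations


def item_supports(transactions):
    """Fraction of transactions containing each item."""
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    return {item: c / len(transactions) for item, c in counts.items()}


def minimum_item_supports(supports, beta, least_support):
    """MIS(i) = max(beta * f(i), least_support); names are illustrative."""
    return {i: max(beta * f, least_support) for i, f in supports.items()}


def large_itemsets(transactions, beta, least_support, max_size=3):
    """Brute-force search for itemsets meeting the lowest MIS of their items.

    This is NOT the level-wise MSApriori procedure; it only applies the
    same acceptance test to every candidate up to max_size items.
    """
    tsets = [frozenset(t) for t in transactions]
    mis = minimum_item_supports(item_supports(tsets), beta, least_support)
    result = {}
    for size in range(1, max_size + 1):
        for cand in combinations(sorted(mis), size):
            cset = frozenset(cand)
            support = sum(1 for t in tsets if cset <= t) / len(tsets)
            if support >= min(mis[i] for i in cand):
                result[cand] = support
    return result


# Hypothetical transactions: sets of article ids viewed together.
transactions = [
    ["a1", "a2", "a3"],
    ["a1", "a2"],
    ["a1", "a4"],
    ["a2", "a3"],
    ["a1", "a2", "a3"],
]
found = large_itemsets(transactions, beta=0.5, least_support=0.3)
for itemset, support in sorted(found.items()):
    print(itemset, support)
```

Tying the threshold to each item's own frequency lets rarely viewed articles take part in itemsets without lowering the bar for popular ones. Because thresholds differ per item, the usual downward-closure pruning of Apriori no longer applies directly, which is the reason this sketch enumerates candidates instead of pruning them.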
Table of Contents
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 General recommender systems
2.2 Web page recommender systems
2.3 Data preparation for web log
2.4 Multilevel hypergraph partition
2.4.1 Coarsening phase
2.4.2 Initial partitioning phase
2.4.3 Uncoarsening and refinement phase
2.5 Multiple item support Apriori
Chapter 3 The Approaches
3.1 Preparing article browsing log
3.2 Mining article browsing log
3.3 Making recommendation
3.3.1 Hypergraph-based approach
3.3.2 Association rule based approach
Chapter 4 Evaluation
4.1 Specifying minimum support
4.2 Performance metrics
4.3 Preliminary experiments for hypergraph-based recommendation
4.4 Comparing three transaction identification methods
4.5 Comparing hypergraph- and association-based recommendation approaches
Chapter 5 Conclusion

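Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: available on campus, never available off campus (restricted)
Available:
Campus: available
Off-campus: not available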
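Printed copies
Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services.
Available: available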