Thesis/Dissertation etd-0828109-151321: Detailed Record



Name: 李連旺 (Lian-Wang Lee)    E-mail: not disclosed
Department: Graduate Institute of Electrical Engineering
Degree: Master    Graduation term: ROC academic year 97 (2008-2009), 2nd semester
Title (Chinese): 一個查詢相關的資訊擷取評等方法
Title (English): A Query Dependent Ranking Approach for Information Retrieval
Files
  • etd-0828109-151321.pdf
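  • This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research.
    Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the text without authorization, to avoid violating the law.
    Thesis access permission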

    Electronic thesis: not available on or off campus

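    Language/pages: Chinese/64
    Statistics: this thesis has been viewed 5064 times and downloaded 0 times
    Abstract (translated from Chinese): Building ranking models is an important topic in information retrieval. In recent years, many methods based on the idea of learning to rank have been proposed for this topic, and most of them try to use a single function that applies to all queries and assigns a score to each document. In this thesis, we propose a new query-dependent ranking framework in which a separate ranking model is built for each training query and its corresponding documents. When a user issues a new test query, suitable ranking models are selected according to the similarity between the test query and the training queries, the selected models are combined, and the retrieved documents are scored by the combined ranking model. The mechanism also provides the weights used to combine the models. Experimental results show that the query-dependent ranking method performs well and outperforms other methods.
    Abstract (English): Ranking model construction is an important topic in information retrieval. Recently, many approaches based on the idea of “learning to rank” have been proposed for this task, and most of them attempt to score all documents of different queries by resorting to a single function. In this thesis, we propose a novel framework of query-dependent ranking. A simple similarity measure is used to calculate similarities between queries. An individual ranking model is constructed for each training query with its corresponding documents. When a new query is asked, the documents retrieved for it are ranked according to scores determined by a ranking model that is combined from the models of similar training queries. A mechanism for determining the combining weights is also provided. Experimental results show that this query-dependent ranking approach is more effective than other approaches.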
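The combination step described in the abstract can be illustrated with a minimal sketch. The record does not give the thesis's actual similarity measure, model form, or weighting rule, so everything below is an assumption made for illustration: each per-query model is taken to be a linear weight vector, query similarity is cosine similarity over hypothetical query feature vectors, the `k` most similar training queries are selected, and the combining weights are the normalized similarities. The names `cosine`, `combine_models`, and `rank_documents` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combine_models(query_vec, train_queries, k=2):
    """Combine the linear models of the k training queries most similar to query_vec.

    train_queries is a list of (query_vector, model_weights) pairs.
    Assumption: combining weights are the similarities, clamped at 0 and normalized.
    """
    scored = sorted(
        ((cosine(query_vec, qv), w) for qv, w in train_queries),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    sims = [max(s, 0.0) for s, _ in scored]
    total = sum(sims)
    # Fall back to uniform weights if no selected query has positive similarity.
    coeffs = [s / total for s in sims] if total > 0 else [1.0 / len(scored)] * len(scored)
    dim = len(scored[0][1])
    return [sum(c * w[j] for c, (_, w) in zip(coeffs, scored)) for j in range(dim)]


def rank_documents(doc_features, weights):
    """Return document indices sorted by descending score under a linear model."""
    scores = [sum(w * x for w, x in zip(weights, feats)) for feats in doc_features]
    return sorted(range(len(doc_features)), key=lambda i: scores[i], reverse=True)


# Toy data: three training queries, each with its own (hypothetical) linear model.
train = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0]), ([-1.0, 0.0], [5.0, 5.0])]
combined = combine_models([1.0, 1.0], train, k=2)
order = rank_documents([[0.2, 0.9], [0.8, 0.1], [0.9, 0.9]], combined)
```

In this toy run the new query is equally similar to the first two training queries, so their models are averaged before the documents are scored. A learned model such as Ranking SVM (listed in the thesis contents) could supply the per-query weight vectors in place of the hand-written ones.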
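    Keywords (Chinese)
  • 模型結合 (model combination)
  • 查詢相似度 (query similarity)
  • 查詢相關的排序 (query-dependent ranking)
  • 排序學習 (learning to rank)
  • 資訊擷取 (information retrieval)
  • 排序模型 (ranking model)
    Keywords (English)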
  • information retrieval
  • Ranking model
  • model combination
  • query similarity
  • learning to rank
  • query dependent ranking
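    Table of contents
    Abstract (Chinese) i
    Abstract (English) ii
    Contents iii
    List of Figures v
    List of Tables vi
    Chapter 1 Introduction 1
    1.1 Research Background 1
    1.2 Problem Definition 4
    1.3 Research Objectives 5
    1.4 Thesis Organization 5
    Chapter 2 Literature Review 7
    2.1 Traditional Information Retrieval Methods 7
    2.2 Learning to Rank 10
    2.3 Ranking SVM 14
    2.4 Evaluation Measures 15
    2.4.1 Mean Average Precision (MAP) 15
    2.4.2 Normalized Discounted Cumulative Gain (NDCG) 17
    Chapter 3 Methodology 19
    3.1 Motivation 19
    3.2 Our Approach (Query Dependent Ranking, QDR) 21
    3.2.1 Overview 21
    3.2.2 Ranking Model Construction 23
    3.2.3 Query Representation 24
    3.2.4 Model Selection 28
    3.2.5 Model Combination 30
    Chapter 4 Experimental Results and Analysis 32
    4.1 Experimental Data 32
    4.2 Feature Selection 34
    4.3 Results and Analysis 39
    Chapter 5 Conclusions and Future Work 52
    5.1 Conclusions 52
    5.2 Future Research Directions 52
    References 53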
    References
    [1] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval. Addison Wesley, 1999.
    [2] C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender, “Learning to Rank Using Gradient Descent,” 22nd International Conference on Machine Learning, pages 89-96, 2005.
    [3] Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li, “Learning to Rank: From Pairwise Approach to Listwise Approach,” 24th International Conference on Machine Learning, pages 129-136, 2007.
    [4] Y. Freund, R. Iyer, R. E. Schapire, and Y. Singer, “An Efficient Boosting Algorithm for Combining Preferences,” Journal of Machine Learning Research, Vol. 4, pages 933-969, 2003.
    [5] X.-B. Geng, T.-Y. Liu, T. Qin, H. Li, and H.-Y. Shum, “Query-Dependent Ranking Using K-Nearest Neighbor,” 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 115-122, 2008.
    [6] R. Herbrich, T. Graepel, and K. Obermayer, “Large Margin Rank Boundaries for Ordinal Regression,” Advances in Large Margin Classifiers. MIT Press, 2000.
    [7] W. Hersh, C. Buckley, T. J. Leone, and D. Hickam, “OHSUMED: An Interactive Retrieval Evaluation and New Large Test Collection for Research,” 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 192-201, 1994.
    [8] K. Järvelin and J. Kekäläinen, “Cumulated Gain-based Evaluation of IR Techniques,” ACM Transactions on Information Systems, Vol. 20, No. 4, pages 422-446, 2002.
    [9] T. Joachims, “Optimizing Search Engines Using Clickthrough Data,” ACM Conference on Knowledge Discovery and Data Mining, pages 133-142, 2002.
    [10] J. Kleinberg, “Authoritative Sources in a Hyperlinked Environment,” Journal of the ACM, Vol. 46, No. 5, pages 604-622, 1999.
    [11] S.-J. Lee and C.-S. Ouyang, “A Neuro-Fuzzy System Modeling with Self-Constructing Rule Generation and Hybrid SVD-Based Learning,” IEEE Transactions on Fuzzy Systems, Vol. 11, No. 3, pages 341-353, 2003.
    [12] P. Li, C. Burges, and Q. Wu, “McRank: Learning to Rank Using Multiple Classification and Gradient Boosting,” 21st Annual Conference on Neural Information Processing Systems, pages 845-852, 2007.
    [13] R. Nallapati, “Discriminative Models for Information Retrieval,” 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 64-71, 2004.
    [14] L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank Citation Ranking: Bringing Order to the Web,” Technical report, Stanford University, 1998.
    [15] G. Salton and M. J. McGill, Introduction to Modern Information Retrieval. McGraw-Hill Book Company, 1983.
    [16] S. E. Robertson and K. S. Jones, “Relevance Weighting of Search Terms,” Journal of the American Society for Information Science, Vol. 27, No. 3, pages 129-146, 1976.
    [17] S. E. Robertson, “Overview of the Okapi Projects,” Journal of Documentation, Vol. 53, No. 1, pages 3-7, 1997.
    [18] M.-F. Tsai, T.-Y. Liu, T. Qin, H.-H. Chen, and W.-Y. Ma, “FRank: A Ranking Method with Fidelity Loss,” 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 383-390, 2007.
    [19] J. Xu and H. Li, “AdaRank: A Boosting Algorithm for Information Retrieval,” 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 391-398, 2007.
    [20] Y. Yue, T. Finley, F. Radlinski, and T. Joachims, “A Support Vector Method for Optimizing Average Precision,” 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 271-278, 2007.
    [21] C. Zhai and J. Lafferty, “A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval,” 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 334-342, 2001.
    [22] http://research.microsoft.com/en-us/people/tyliu/.
    [23] http://research.microsoft.com/en-us/um/beijing/projects/letor/index.html.
    [24] http://searchenginewatch.com/3632382.
    [25] http://svmlight.joachims.org/.
    [26] http://www.find.org.tw/0105/howmany/howmany_disp.asp?id=219.
    Oral defense committee
  • 錢炳全 - Committee chair
  • 李健興 - Committee member
  • 潘欣泰 - Committee member
  • 潘正祥 - Committee member
  • 李錫智 - Advisor
    Defense date: 2009-07-24    Submission date: 2009-08-28
