Thesis/Dissertation etd-0808111-134811: Detailed Record



Name: 江忠益 (Jung-Yi Jiang)    Email: jungyi213@gmail.com
Department: Department of Electrical Engineering (電機工程學系研究所)
Degree: Ph.D.    Graduation: Academic year 99, 2nd semester (spring 2011)
Title (Chinese): 文件資料維度縮減與多標籤分類方法之研究
Title (English): Feature Reduction and Multi-label Classification Approaches for Document Data
Files:
  • etd-0808111-134811.pdf
  • This electronic full text is authorized only for individual, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Access Rights: electronic thesis fully open on and off campus

Language / Pages: English / 122
Abstract (Chinese): This thesis proposes new feature reduction and multi-label classification methods for document data. In document processing, the content of a document is usually described with the vector space model. This representation makes the feature dimensionality of documents very large, which hinders document classification. To mitigate the effects of high dimensionality, we propose a feature clustering method for reducing document dimensionality and design an efficient method for document classification.
We use our self-constructing clustering method to group the features of high-dimensional document data and combine the resulting clusters by weighting, forming a new dataset of lower dimensionality. During clustering, each feature is represented by its probability distribution over the classes, and a membership function measures the similarity between features so that similar features are grouped together. The self-constructing clustering method works incrementally, processing each input item one at a time and only once, so it is much faster than traditional iterative methods. Complexity analysis and experimental comparisons show that the proposed method runs fast and achieves higher accuracy. Reducing dimensionality by feature clustering saves a large amount of storage space and shortens the training and testing time of classification. Experiments on real document data confirm that our method reduces document dimensionality faster and more effectively than other known methods and helps classifiers obtain good classification results.
This thesis also proposes a multi-label document classification method, where a multi-label document is one that can belong to several categories at the same time. Through fuzzy similarity computation, each document is represented as a vector of fuzzy similarities to the categories. The length of this vector equals the number of categories, which is far smaller than the document dimensionality, thereby achieving dimensionality reduction and speeding up subsequent processing. Once documents are represented as fuzzy-similarity vectors, the similarity between two documents can be measured by comparing how their similarity degrees are distributed over the categories. An incremental clustering method groups the fuzzy-similarity vectors, with each cluster representing one particular distribution. The least squares method then estimates the influence of each distribution on the multi-label categories, and finally a threshold for each category is obtained from the training samples and used to classify multi-label documents. Experiments on real document data confirm that our method classifies multi-label documents quickly and effectively.
Abstract (English): This thesis proposes novel approaches for feature reduction and multi-label classification of text datasets. In text processing, the bag-of-words model is commonly used: each document is modeled as a vector in a high-dimensional space, a representation often called the vector space model. The dimensionality of the document vector is usually huge, and such high dimensionality can be a severe obstacle for text processing algorithms. To improve their performance, we propose a feature clustering approach that reduces the dimensionality of document vectors, and we also propose an efficient algorithm for text classification.
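A minimal Python sketch of the vector-space (bag-of-words) representation described above may make the dimensionality problem concrete. The function name and the plain tf-idf weighting here are illustrative assumptions, not the thesis's exact formulation:

import math
from collections import Counter

def tfidf_vectors(docs):
    # docs: list of token lists; returns the vocabulary and one
    # tf-idf-weighted vector per document (dimension = vocabulary size)
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    df = Counter(w for d in docs for w in set(d))   # document frequencies
    n = len(docs)
    vectors = []
    for d in docs:
        vec = [0.0] * len(vocab)
        for w, f in Counter(d).items():
            vec[index[w]] = f * math.log(n / df[w])  # tf * idf
        vectors.append(vec)
    return vocab, vectors

Even a modest corpus yields vectors with tens of thousands of dimensions, which is exactly the obstacle the proposed feature clustering is meant to remove.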
Feature clustering is a powerful method for reducing the dimensionality of feature vectors in text classification. We propose a fuzzy similarity-based self-constructing algorithm for feature clustering. The words in the feature vector of a document set are grouped into clusters based on a similarity test; words that are similar to one another are placed in the same cluster. Each cluster is characterized by a membership function with a statistical mean and deviation. When all the words have been fed in, a desired number of clusters is formed automatically, and we then have one extracted feature for each cluster. The extracted feature corresponding to a cluster is a weighted combination of the words contained in that cluster. With this algorithm, the derived membership functions closely match and properly describe the real distribution of the training data. Moreover, the user need not specify the number of extracted features in advance, so trial and error for determining the appropriate number of extracted features is avoided. Experimental results show that our method runs faster and obtains better extracted features than other methods.
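The self-constructing idea can be illustrated with a short sketch: each word is represented by its probability distribution over the classes (as in the Chinese abstract), a Gaussian-style membership function scores similarity, and words are processed in a single incremental pass. The constants SIGMA0 and RHO, the Welford-style deviation update, and all names below are assumptions for illustration; the thesis's exact update rules are given in Chapter 4.

import math

SIGMA0 = 0.25   # initial deviation for a new cluster (assumed constant)
RHO = 0.1       # similarity threshold (assumed value)

class Cluster:
    def __init__(self, x):
        self.n = 1
        self.mean = list(x)
        self.m2 = [0.0] * len(x)      # running sum of squared deviations

    def sigma(self, k):
        # Welford-style running deviation, floored at SIGMA0; an
        # illustrative stand-in for the thesis's deviation update
        return max(math.sqrt(self.m2[k] / self.n), SIGMA0)

    def membership(self, x):
        # Gaussian membership with per-dimension mean and deviation
        return math.prod(
            math.exp(-((x[k] - self.mean[k]) / self.sigma(k)) ** 2)
            for k in range(len(x)))

    def add(self, x):
        # incremental (one-pass) update of mean and squared deviations
        self.n += 1
        for k, v in enumerate(x):
            delta = v - self.mean[k]
            self.mean[k] += delta / self.n
            self.m2[k] += delta * (v - self.mean[k])

def self_construct(patterns):
    # feed word patterns in one pass: a word joins the most similar
    # existing cluster if its membership exceeds RHO, else seeds a new
    # cluster, so the number of clusters emerges automatically
    clusters = []
    for x in patterns:
        best = max(clusters, key=lambda c: c.membership(x), default=None)
        if best is None or best.membership(x) <= RHO:
            clusters.append(Cluster(x))
        else:
            best.add(x)
    return clusters

Because each word is examined only once, the cost grows linearly in the number of words, which is the source of the speed advantage over iterative clustering claimed above.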
We also propose a fuzzy similarity clustering scheme for multi-label text categorization, in which a document can belong to more than one category. First, feature transformation is performed: an input document is transformed into a fuzzy-similarity vector. Next, the relevance degrees of the input document to a collection of clusters are calculated and then combined to obtain the relevance degree of the input document to each participating category. Finally, the input document is assigned to a category if the associated relevance degree exceeds a threshold. In text categorization, the number of terms involved is usually huge, so an automatic classification system may suffer from large memory requirements and poor efficiency; our scheme avoids these difficulties. Moreover, we allow the region a category covers to be a combination of several sub-regions that are not necessarily connected. The effectiveness of the proposed scheme is demonstrated by the results of several experiments.
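The last two steps of this pipeline, combining cluster relevances by least squares and hard-limiting against per-category thresholds, can be sketched as follows. This is a minimal sketch under stated assumptions: X is taken to be the matrix of cluster-similarity vectors produced by the clustering step, Y the binary label matrix, and the midpoint threshold rule is an illustrative stand-in for the training-based threshold selection of Section 5.4.

import numpy as np

def fit_least_squares(X, Y):
    # W maps cluster similarities to per-category relevance degrees
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def calibrate_thresholds(X, Y, W):
    # per category, pick a threshold separating positive from negative
    # training documents (simple midpoint rule; the thesis likewise
    # derives its thresholds from the training samples)
    scores = X @ W
    thresholds = np.zeros(Y.shape[1])
    for c in range(Y.shape[1]):
        pos = scores[Y[:, c] == 1, c]
        neg = scores[Y[:, c] == 0, c]
        thresholds[c] = (pos.min() + neg.max()) / 2 if len(pos) and len(neg) else 0.5
    return thresholds

def predict(x, W, thresholds):
    # hard-limiting: assign every category whose relevance degree
    # exceeds its threshold, so a document may receive several labels
    return (x @ W >= thresholds).astype(int)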
Keywords:
  • feature clustering
  • self-constructing clustering
  • multi-label document classification
  • text classification
  • dimension reduction
Table of Contents
Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
1  Introduction
  1.1  Text Classification
  1.2  Document Feature Reduction
  1.3  Multi-label Text Classification
  1.4  Overview
2  Document Feature Reduction
  2.1  Background
  2.2  Related Work
    2.2.1  Information Gain
    2.2.2  Incremental Orthogonal Centroid Algorithm
    2.2.3  Divisive Information-Theoretic Feature Clustering
3  Multi-label Text Categorization
  3.1  Background
    3.1.1  Problem Transformation
    3.1.2  Algorithm Adaptation
  3.2  Related Work
    3.2.1  Fuzzy Similarity Measure
    3.2.2  Rank-SVM
    3.2.3  ML-RBF
    3.2.4  ML-KNN
    3.2.5  BoosTexter
4  A Fuzzy Self-Constructing Feature Clustering Algorithm
  4.1  Self-Constructing Clustering
  4.2  Feature Extraction
  4.3  Text Classification
  4.4  An Example
5  A Novel Similarity-Based Scheme for Multi-Label Text Categorization
  5.1  Feature Transformation
  5.2  Cluster-similarity
  5.3  Category-similarity
  5.4  Hard-limiting
  5.5  Operation Phases
  5.6  An Example
6  Experimental Results
  6.1  Experimental Results for Document Feature Reduction
    6.1.1  20 Newsgroups Dataset
    6.1.2  Reuters Corpus Volume 1 (RCV1) Dataset
    6.1.3  Cade12 Dataset
  6.2  Experimental Results for Multi-label Classification
    6.2.1  WebKB Dataset
    6.2.2  Medical Dataset
    6.2.3  Yahoo Web Page Dataset
    6.2.4  RCV1 Dataset
7  Conclusion
Bibliography
Oral Defense Committee:
  • 吳志宏 - Convener
  • 侯俊良 - Committee Member
  • 歐陽振森 - Committee Member
  • 潘欣泰 - Committee Member
  • 蔡賢亮 - Committee Member
  • 賴智錦 - Committee Member
  • 李錫智 - Advisor

Defense Date: 2011-07-04    Submission Date: 2011-08-08


