Thesis/dissertation etd-0802100-144455: detailed information
Title page for etd-0802100-144455
Title
應用模糊法則歸納法探採分類知識
Using Fuzzy Rule Induction for Mining Classification Knowledge
Department
Year, semester
Language
Degree
Number of pages
70
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2000-07-28
Date of Submission
2000-08-02
Keywords
Classification Knowledge, Data Mining, Fuzzy Rule Induction, Knowledge Discovery
Statistics
This thesis/dissertation has been viewed 5807 times and downloaded 10726 times.
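Abstract (translated from the Chinese)
With the arrival of the computer age, large databases have become increasingly common, and businesses generate large volumes of data every day and store them in databases. Finding useful knowledge in these databases would help businesses with management work and daily operations. How to mine applicable knowledge from huge databases has therefore become an important research topic in recent years. Among the different types of knowledge, classification knowledge is one of the most widely applied. Most classification knowledge obtained by induction algorithms is expressed in crisp form. Human thinking, however, tends to be fuzzy. This research proposes a method for mining classification knowledge and expressing it as fuzzy rules. The method has five steps, from data preparation to rule pruning, and applies the RITIO algorithm for inductive learning. Class inference is carried out by the fuzzy reasoning mechanism proposed in this research. The proposed method was tested on several databases; the results show that the classification knowledge it produces has better classification ability, which verifies the applicability of the proposed method.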
Abstract
With the computerization of businesses, more and more data are generated and stored in databases for many business applications. Finding interesting patterns among those data may lead to useful knowledge that provides a competitive advantage in business. Knowledge discovery in databases has thus become an important issue in helping businesses acquire knowledge that assists managerial and operational work. Among the many types of knowledge, classification knowledge is widely used. Most classification rules learned by induction algorithms are in crisp form. A fuzzy linguistic representation of rules, however, is much closer to the way humans reason. The objective of this research is to propose a method for mining classification knowledge from a database and expressing it with fuzzy descriptions. The procedure contains five steps, from data preparation to rule pruning. A rule induction algorithm, RITIO, is employed to generate the classification rules. A fuzzy inference mechanism that includes fuzzy matching and output reasoning is specified to yield the output class. An experiment is conducted on several databases to show the advantages of this work. The good system performance observed justifies the proposed method, which can be easily implemented in various business applications involving classification tasks.
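The abstract names fuzzy matching and output reasoning without giving their details. The following is only a minimal sketch of how such a mechanism could look, not the thesis's actual procedure. The triangular membership functions, the min/max operators, and every attribute, region, rule, and class name are assumptions invented for illustration, and the rules are written by hand instead of being induced by RITIO.

```python
from typing import Dict, List, Tuple

# A triangular fuzzy region given as (left, peak, right). Assumed shape;
# the abstract does not say which membership functions the thesis uses.
Triangle = Tuple[float, float, float]


def triangular(x: float, tri: Triangle) -> float:
    """Membership degree of x in a triangular fuzzy region."""
    left, peak, right = tri
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy regions for two numeric attributes.
REGIONS: Dict[str, Dict[str, Triangle]] = {
    "age": {"young": (0, 0, 40), "middle": (20, 40, 60), "old": (40, 80, 80)},
    "income": {"low": (0, 0, 50), "high": (30, 100, 100)},
}

# Hypothetical fuzzy rules: (antecedent, class). In the thesis the rules
# come from inductive learning; here they are hand-written placeholders.
RULES: List[Tuple[Dict[str, str], str]] = [
    ({"age": "young", "income": "low"}, "reject"),
    ({"age": "middle", "income": "high"}, "approve"),
    ({"age": "old"}, "approve"),
]


def match_degree(example: Dict[str, float], antecedent: Dict[str, str]) -> float:
    """Fuzzy matching: the weakest condition bounds the rule (min operator)."""
    return min(
        triangular(example[attr], REGIONS[attr][label])
        for attr, label in antecedent.items()
    )


def classify(example: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Output reasoning: keep the best matching degree per class (max
    operator) and return the class with the highest degree."""
    scores: Dict[str, float] = {}
    for antecedent, cls in RULES:
        degree = match_degree(example, antecedent)
        scores[cls] = max(scores.get(cls, 0.0), degree)
    best = max(scores, key=lambda c: scores[c])
    return best, scores


if __name__ == "__main__":
    label, scores = classify({"age": 30, "income": 40})
    print(label, scores)
```

Taking the minimum over a rule's conditions and the maximum over rules of the same class is one common convention; other operators, such as product and sum, would fit the same structure.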
Table of Contents
Chapter 1 Introduction
1.1 OVERVIEW
1.2 MOTIVATION AND OBJECTIVE OF THE RESEARCH
1.3 ORGANIZATION OF THE THESIS
Chapter 2 Literature Review
2.1 KDD AND DATA MINING
2.2 KNOWLEDGE TYPES IN DATA MINING
2.3 INDUCTIVE LEARNING
2.4 METHODS OF ATTRIBUTE DISCRETIZATION
2.5 FUZZY SET CONCEPT
2.6 INDUCTIVE LEARNING WITH FUZZY SET THEORY
Chapter 3 Fuzzy Rule System
3.1 DATA PREPARATION
3.2 FUZZY REGION AND MEMBERSHIP FUNCTION DETERMINATION
3.3 GENERATING FUZZY EXAMPLES AND INDUCTIVE LEARNING
3.4 RULE PRUNING
3.5 FUZZY INFERENCE
3.6 MIXED TYPES OF ATTRIBUTES
Chapter 4 Experiments and Results
4.1 AN ILLUSTRATED EXAMPLE
4.2 DATA RESOURCES
4.3 EXPERIMENTS
Chapter 5 Conclusions
5.1 CONCLUDING REMARKS
5.2 FUTURE WORK
References
APPENDIX A
APPENDIX B



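Full text
This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available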


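Printed copies
Public information on printed theses is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: already released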
