Thesis/dissertation etd-0828112-142127: title page and detailed information
Title
具有時間顆粒階層之時序性資料挖掘
Temporal Data Mining with a Hierarchy of Time Granules
Department
Year, semester
Language
Degree
Number of pages
150
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2012-07-19
Date of Submission
2012-08-28
Keywords
data mining, association-rule mining, temporal association rules, item lifespan, hierarchy of time granules
Statistics
This thesis/dissertation has been viewed 5702 times and downloaded 96 times.
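Chinese Abstract (English translation)
The goal of data mining is to extract interesting and meaningful patterns from databases. In practical applications, each transaction records not only the items purchased but also the time at which it occurred. To handle data that carries such time information, temporal data mining techniques have been proposed for finding temporal association rules. Most existing studies find general temporal association rules by considering only the different exhibition periods of items. Such an approach, however, may lose meaningful information: an item that is not frequent over its whole exhibition period may still be frequent within part of it. To address this problem, a time concept hierarchy, which lets users define suitable time granules, is applied to temporal data mining. In this thesis, we pose the problem of finding temporal association rules with a hierarchy of time granules in a temporal database, and propose three algorithms corresponding to three definitions of an item's exhibition period. The first definition covers, for each item in a time granule, the span from the time of its first transaction to the end time; the second evaluates each item in a time granule from the time it starts being exhibited to the end time; the third uses each item's actual exhibition period in a time granule as the evaluation interval. Finally, the experimental evaluation compares the performance of the three proposed algorithms under the different exhibition-period definitions, and assesses how the mined rules differ with and without the hierarchy of time granules.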
Abstract
Data mining techniques are widely used to extract useful knowledge from existing databases. In real-world applications, a database usually records, in addition to the items bought in each transaction, the time at which the transaction occurred and the exhibition periods of the items. Temporal data mining techniques have been proposed to find temporal association rules in such time-stamped databases. Most existing studies consider only the different lifespans of items when finding general temporal association rules, which may neglect useful information. For example, an item that is not frequent over its whole exhibition period may still be frequent within part of it. To address this, a hierarchy of time is applied to temporal data mining, with suitable time granules defined by users. This thesis handles the problem of mining temporal association rules with a hierarchy of time granules from a temporal database, and proposes three novel mining algorithms, one for each of three item lifespan definitions. In the first definition, the lifespan of an item in a time granule runs from the item's first appearance time to the end time in the time granule. In the second, it runs from the publication time of the item to the end time in the time granule. In the third, it is measured by the item's entire exhibition period. Experimental results on a simulation dataset show the performance of the three proposed algorithms under the different item lifespan definitions, and compare the temporal association rules mined with and without the hierarchy of time granules under different parameter settings.
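The three lifespan definitions in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's TP-HTAR, TPP, or TPI algorithm. The toy transactions, the exhibition periods, and all function names are hypothetical. The sketch also assumes that "end time" means the last time point of the granule, and that frequency is measured as relative support, i.e. the fraction of transactions inside the lifespan that contain the item.

```python
from fractions import Fraction

# Hypothetical toy data: one transaction per day, as (day, items bought).
TRANSACTIONS = [
    (1, {"A"}), (2, {"A"}), (3, {"A", "B"}), (4, {"A"}),
    (5, {"A"}), (6, {"B"}), (7, {"A", "B"}), (8, {"B"}),
]
# Hypothetical exhibition (on-shelf) periods per item, as inclusive day ranges.
EXHIBITION = {"B": [(2, 3), (6, 8)]}


def first_appearance_lifespan(transactions, item, granule):
    """Definition 1 (assumed reading): first sale in the granule to granule end."""
    g_start, g_end = granule
    days = [t for t, items in transactions
            if g_start <= t <= g_end and item in items]
    return [(min(days), g_end)] if days else []


def publication_lifespan(exhibition, item, granule):
    """Definition 2 (assumed reading): publication time to granule end."""
    g_start, g_end = granule
    published = min(start for start, _ in exhibition[item])
    start = max(published, g_start)
    return [(start, g_end)] if start <= g_end else []


def exhibition_lifespan(exhibition, item, granule):
    """Definition 3 (assumed reading): on-shelf periods clipped to the granule."""
    g_start, g_end = granule
    clipped = [(max(s, g_start), min(e, g_end)) for s, e in exhibition[item]]
    return [(s, e) for s, e in clipped if s <= e]


def support(transactions, item, lifespan):
    """Fraction of the transactions inside the lifespan that contain the item."""
    in_span = [items for t, items in transactions
               if any(s <= t <= e for s, e in lifespan)]
    if not in_span:
        return Fraction(0)
    return Fraction(sum(item in items for items in in_span), len(in_span))


whole = (1, 8)    # coarse granule covering all eight days
second = (5, 8)   # finer granule one level down in a hypothetical hierarchy
for name, span in [
    ("whole granule", [whole]),
    ("first appearance", first_appearance_lifespan(TRANSACTIONS, "B", whole)),
    ("publication", publication_lifespan(EXHIBITION, "B", whole)),
    ("exhibition", exhibition_lifespan(EXHIBITION, "B", whole)),
    ("exhibition, finer granule", exhibition_lifespan(EXHIBITION, "B", second)),
]:
    print(name, support(TRANSACTIONS, "B", span))
```

On this toy data, item B reaches a support of 1/2 over the whole granule, 2/3, 4/7, and 4/5 under the three lifespan readings, and 1 in the finer granule. This mirrors the abstract's point that an item can be frequent within part of its exhibition period without being frequent overall.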
Table of Contents
Thesis Approval Form
Acknowledgements
Chinese Abstract
Abstract
CONTENTS
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Related Works
2.1 Association-Rule Mining
2.2 Hierarchy Data Mining
2.3 Temporal Association-Rule Mining
Chapter 3. Hierarchy Temporal Association Rule Mining with First Appearance Periods of Products
3.1 Problem Statement and Definitions
3.2 The Proposed Three-Phase Algorithm for Mining Hierarchical Temporal Association Rules (TP-HTAR)
3.3 An Example of TP-HTAR
3.4 The Proposed Three-Phase Mining Algorithm with a Predicting Strategy (TPP)
3.5 An Example of TPP
3.6 The Proposed Three-Phase Mining Algorithm with an Improved Strategy (TPI)
3.7 An Example of TPI
Chapter 4. Hierarchical Temporal Association Rule Mining with Start Exhibition Period Information of Products
4.1 Problem Statement and Definitions
4.2 The Proposed Three-Phase Mining Algorithm with an Improved Strategy (TPI)
4.3 An Example of TPI
Chapter 5. Hierarchical Temporal Association Rule Mining with Entire Lifespans of Products
5.1 Problem Statement and Definitions
5.2 The Proposed Three-Phase Mining Algorithm with an Improved Strategy (TPI)
5.3 An Example of TPI
Chapter 6. Experimental Evaluation
6.1 Experimental Datasets
6.2 Evaluation on Scanning Effect for the Three Strategies
6.3 Efficiency Evaluation
6.4 Evaluation on a Real Database, Foodmart
Chapter 7. Conclusions
References
VITA
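Full text
This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.
Thesis access permission: user-defined release time
Available:
Campus: available
Off-campus: available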


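Printed copies
Public information on printed theses is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed-thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available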
