Title page for thesis/dissertation etd-0527113-133645
Title
一個以MCountP-Tree來探勘空間資料集合中的最大空間共同位置樣式之方法
The MCountP-Tree for Mining Maximal Spatial Co-Location Patterns from Spatial Data Sets
Department
Year, semester
Language
Degree
Number of pages
92
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2013-06-19
Date of Submission
2013-06-27
Keywords
Spatial database, Spatial data mining, Spatial co-location rules, Spatial co-location patterns, Co-location rules
Statistics
The thesis/dissertation has been browsed 5699 times and downloaded 345 times.
Abstract
In recent years, geographic information system (GIS) databases have developed quickly
and now play a significant role in many applications. Efficiently mining maximal
co-location patterns from the explosively growing volume of spatial data is an important
issue in spatial data mining, with applications that include mobile service requests,
public health, and public safety. Most previous approaches (the full-join, the
partial-join, and the join-less approaches) are join-based and adopt an Apriori-like
strategy to mine maximal co-location patterns. However, the Apriori-like strategy has a
very high computation cost, because size-k prevalent co-locations can be generated only
after the size-(k - 1) prevalent co-locations have been found. To reduce the computation
cost of the join-based approaches, Lizhen Wang et al. proposed an order-clique approach
for mining maximal co-location patterns. This approach differs from the join-based
approaches because it first finds the candidate maximal co-locations and then uses tree
data structures, instead of the table instances used in the join-based approaches, to
mine the maximal co-location patterns. As a result, the order-clique approach performs
better than the join-based approaches. However, when the threshold increases, the
performance of the order-clique approach may degrade, because it uses no pruning
strategy. Therefore, in this thesis, we propose a new approach with a pruning strategy
to mine maximal co-location patterns. Our approach identifies the candidates of maximal
co-location patterns more precisely than the order-clique approach, because we apply a
pruning strategy to the size-2 candidates. We propose four tree data structures: the
CountP-tree, the MCountP-tree, the NeighborI-tree, and the CoLI-tree. The CountP-tree
prunes the size-2 candidates of the maximal co-location patterns, which differs from the
instance pruning used in the join-based approaches. The MCountP-tree yields the
candidates of the maximal co-location patterns; the number of candidates found by our
approach is smaller than the number found by the order-clique approach. The
NeighborI-tree records the neighbor relationships among all instances. The CoLI-tree is
built from the result of the MCountP-tree by referring to the NeighborI-tree, and it
decides the final result. Our simulation results show that the proposed approach is more
efficient than the order-clique approach whether the data set is sparse or dense.
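The idea of pruning size-2 candidates before forming candidate maximal co-locations can be illustrated with a minimal Python sketch. It does not reproduce the CountP-tree or MCountP-tree, whose layouts the abstract does not describe. It assumes the participation index commonly used in the co-location literature (the minimum, over the features of a pattern, of the fraction of that feature's instances that take part in the pattern), a Euclidean distance threshold as the neighbor relation, and candidates formed as the maximal cliques of the surviving feature pairs. All names (`neighbor_pairs`, `prevalent_pairs`, `maximal_cliques`, `min_prev`) and the sample data are hypothetical.

```python
from itertools import combinations
from math import dist

def neighbor_pairs(instances, d):
    """Pairs of instance ids with different features lying within distance d."""
    pairs = []
    for (i, (fa, pa)), (j, (fb, pb)) in combinations(instances.items(), 2):
        if fa != fb and dist(pa, pb) <= d:
            pairs.append((i, j))
    return pairs

def prevalent_pairs(instances, pairs, min_prev):
    """Keep size-2 feature sets whose participation index is >= min_prev."""
    totals = {}
    for f, _ in instances.values():
        totals[f] = totals.get(f, 0) + 1
    # participants[(A, B)][A] = A-instances that have at least one B neighbor
    participants = {}
    for i, j in pairs:
        fi, fj = instances[i][0], instances[j][0]
        key = tuple(sorted((fi, fj)))
        slot = participants.setdefault(key, {fi: set(), fj: set()})
        slot[fi].add(i)
        slot[fj].add(j)
    kept = {}
    for key, slot in participants.items():
        index = min(len(slot[f]) / totals[f] for f in key)
        if index >= min_prev:  # pruning step: drop weak size-2 candidates
            kept[key] = index
    return kept

def maximal_cliques(edges):
    """Bron-Kerbosch enumeration of maximal cliques in the feature graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(tuple(sorted(r)))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(found)

# Hypothetical data: instance id -> (feature, (x, y))
instances = {
    "A1": ("A", (0.0, 0.0)), "A2": ("A", (5.0, 5.0)),
    "B1": ("B", (0.5, 0.0)), "B2": ("B", (5.0, 5.5)),
    "C1": ("C", (0.0, 0.5)), "C2": ("C", (9.0, 9.0)),
    "D1": ("D", (5.5, 5.0)), "D2": ("D", (20.0, 20.0)),
    "D3": ("D", (30.0, 30.0)), "D4": ("D", (40.0, 40.0)),
}
pairs = neighbor_pairs(instances, d=1.0)
strict = prevalent_pairs(instances, pairs, min_prev=0.5)
loose = prevalent_pairs(instances, pairs, min_prev=0.2)
print(maximal_cliques(strict))  # candidates after pruning at 0.5
print(maximal_cliques(loose))   # candidates after pruning at 0.2
```

In this toy data set, feature D takes part in neighbor relations with only one of its four instances, so a stricter `min_prev` removes every pair that contains D and fewer candidates remain to be verified against the instance-level neighbor relations.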
Table of Contents
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
1. Introduction
1.1 Spatial Association Rules
1.2 Spatial Co-Location Patterns
1.3 Applications
1.4 Related Works
1.5 Motivation
1.6 Organization of the Thesis
2. A Survey of Approaches for Mining Spatial Co-Location Patterns
2.1 The Full Join Approach to Mining Co-Location Patterns
2.2 The Joinless Approach to Mining Co-Location Patterns
2.3 An Order-Clique-Based Approach to Mining Maximal Co-Location Patterns
2.3.1 Generating Candidate Maximal Co-Locations
2.3.2 Identifying Co-Location Table Instances
3. The Spatial Co-location Patterns Approach
3.1 Data Structure
3.2 The Input of the Spatial Database
3.3 The Processing of the Proposal Approach
3.4 A Comparison
4. Performance
4.1 The Performance Model
4.2 Experiment Results
5. Conclusion
5.1 Summary
5.2 Future Work
BIBLIOGRAPHY
References
[1] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules
in Large Databases,” Proc. of the 20th Int. Conf. on Very Large Data Bases,
pp. 487–499, 1994.
[2] M. Celik, J. M. Kang, and S. Shekhar, “Zonal Co-Location Pattern Discovery
with Dynamic Parameters,” Proc. of the 7th IEEE Int. Conf. on Data Mining,
pp. 433–438, 2007.
[3] B. R. Dai and M. Y. Lin, “Efficiently Mining Dynamic Zonal Co-Location Patterns
Based on Maximal Co-Locations,” Proc. of IEEE 11th Int. Conf. on Data
Mining Workshops, pp. 861–868, 2011.
[4] W. Ding, C. Eick, J. Wang, and X. Yuan, “A Framework for Regional Association
Rule Mining in Spatial Datasets,” Proc. of the 6th Int. Conf. on Data Mining,
pp. 851–856, 2006.
[5] G. Fang, J. Xiong, X. L. Du, and X. B. Tang, “Frequent Neighboring Class Set
Mining,” Proc. of the 7th Int. Conf. on Fuzzy Systems and Knowledge Discovery,
pp. 1442–1445, 2010.
[6] T. Hu, S. Y. Sung, H. Xiong, and Q. Fu, “Discovery of Maximum Length Frequent
Itemsets,” Information Sciences, Vol. 178, No. 1, pp. 69–87, Jan. 2008.
[7] Y. Huang, S. Shekhar, and H. Xiong, “Discovering Colocation Patterns from
Spatial Data Sets: A General Approach,” IEEE Trans. on Knowledge and Data
Engineering, Vol. 16, No. 12, pp. 1472–1485, Dec. 2004.
[8] Y. Huang and P. Zhang, “On the Relationships Between Clustering and Spatial
Co-Location Pattern Mining,” Proc. of the 18th IEEE Int. Conf. on Tools with
Artificial Intelligence, pp. 513–522, 2006.
[9] K. S. Kim, Y. Kim, and U. Kim, “Maximal Cliques Generating Algorithm for
Spatial Co-Location Pattern Mining,” Secure and Trust Computing, Data Management
and Applications, pp. 241–250, 2011.
[10] Y. Morimoto, “Mining Frequent Neighboring Class Sets in Spatial Databases,”
Proc. of the 7th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data
Mining, pp. 353–358, 2001.
[11] S. Shekhar and Y. Huang, “Discovering Spatial Co-Location Patterns: A Summary
of Results,” Proc. of the 7th Int. Symposium on Advances in Spatial and
Temporal Databases, pp. 236–256, 2001.
[12] S. Shekhar, P. Zhang, Y. Huang, and R. R. Vatsavai, “Trends in Spatial Data
Mining,” Data Mining: Next Generation Challenges and Future Directions,
AAAI/MIT Press, pp. 357–380, 2004.
[13] F. Verhein and G. Al-Naymat, “Fast Mining of Complex Spatial Co-Location
Patterns Using GLIMIT,” Proc. of the 7th IEEE Int. Conf. on Data Mining
Workshops, pp. 679–684, 2007.
[14] Y. Wan, J. Zhou, and F. Bian, “CODEM: A Novel Spatial Co-Location and
De-location Patterns Mining Algorithm,” Proc. of the 5th Int. Conf. on Fuzzy
Systems and Knowledge Discovery, pp. 576–580, 2008.
[15] L. Wang, Y. Bao, J. Lu, and J. Yip, “A New Join-less Approach for Co-Location
Pattern Mining,” Proc. of the 8th IEEE Int. Conf. on Computer and Information
Technology (CIT), pp. 197–202, 2008.
[16] L. Wang, K. Xie, T. Chen, and X. Ma, “Efficient Discovery of Multilevel Spatial
Association Rules Using Partitions,” Information and Software Technology, Vol. 47,
No. 13, pp. 829–840, Oct. 2005.
[17] L. Wang, L. Zhou, J. Lu, and J. Yip, “An Order-Clique-Based Approach for Mining
Maximal Co-Locations,” Information Sciences, Vol. 179, No. 19, pp. 3370–3382,
Sept. 2009.
[18] J. S. Yoo and M. Bow, “Mining Top-k Closed Co-Location Patterns,” Proc. of
IEEE Int. Conf. on Spatial Data Mining and Geographical Knowledge Services,
pp. 100–105, 2011.
[19] J. S. Yoo and M. Bow, “Mining Maximal Co-Located Event Sets,” Proc. of the
15th Pacific-Asia Conf. on Advances in Knowledge Discovery and Data Mining,
pp. 351–362, 2011.
[20] J. S. Yoo and M. Bow, “Mining Spatial Colocation Patterns: A Different Framework,”
Data Mining and Knowledge Discovery, Vol. 24, No. 1, pp. 159–194, Jan. 2012.
[21] J. S. Yoo and J. Hwang, “A Framework for Discovering Spatio-Temporal Cohesive
Networks,” Proc. of the 12th Pacific-Asia Conf. on Advances in Knowledge
Discovery and Data Mining, pp. 1056–1061, 2008.
[22] J. S. Yoo and S. Shekhar, “A Joinless Approach for Mining Spatial Colocation
Patterns,” IEEE Trans. on Knowledge and Data Engineering, Vol. 18, No. 10,
pp. 1323–1337, Oct. 2006.
[23] J. S. Yoo, S. Shekhar, J. Smith, and J. P. Kumquat, “A Partial Join Approach
for Mining Co-Location Patterns,” Proc. of the 12th Annual ACM Int. Workshop
on Geographic Information Systems, pp. 241–249, 2004.
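Fulltext
This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: user-defined availability date
Available:
Campus: available
Off-campus: available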


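Printed copies
Public information about printed copies is relatively complete from academic year 102 (ROC calendar) onward. To look up public information about printed copies from academic year 101 or earlier, please contact the printed thesis service counter of the library and information office. We apologize for any inconvenience.
Available: available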
