博碩士論文 etd-0525118-085901 詳細資訊
Title page for etd-0525118-085901
論文名稱
Title
一個以不重疊方式於增量資料庫中挖掘空間共同樣式的方法
A Non-overlapping Approach to Mining Spatial Co-location Patterns in the Incremental Spatial Database
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
79
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2018-06-29
繳交日期
Date of Submission
2018-07-03
關鍵字
Keywords
空間共同位置樣式、空間共同位置規則、空間資料庫、空間資料探勘、增量資料庫
Spatial Database, Incremental Database, Spatial Co-location Patterns, Spatial Co-location Rules, Spatial Data Mining
統計
Statistics
本論文已被瀏覽 5661 次,被下載 0
The thesis/dissertation has been browsed 5661 times and downloaded 0 times.
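Abstract (Chinese)
With the explosive growth of information in recent years, research on big data and related fields has become an unavoidable topic. Mining the information of interest from large volumes of data and putting it to use is both important and necessary. Finding spatial co-location patterns among neighboring objects in an incremental database is an especially interesting topic worth investigating. An incremental database refers to the situation in which the whole database grows after data are added to or removed from the original database over time. Because real-world data change over time, it can be applied widely in many fields, including location-based services (LBS), environmental ecology, and even business behavior patterns. In this thesis, we consider only the case of data insertion. Many spatial co-location mining algorithms target traditional spatial databases and need not consider the candidate instances generated in the process or the updating of their participation values. Yoo et al. proposed the EUCOLOC algorithm to mine spatial co-location patterns in an incremental database. This method may record duplicate data in both the original database and the incremental database after new data are added, and it generates many candidate instances. It then deletes some candidate instances by checking whether their subsets exist and, in the process, updates the participation index of each co-location pattern. However, we found that their method not only requires more storage space for the points in the database and their relationships, but also generates many unnecessary candidate instances (that is, candidate instances in which no incremental data participate). In this thesis, we propose the Non-Overlapping algorithm to update and mine co-location patterns. The Non-Overlapping algorithm avoids recording the same data in both the old and the new database; in other words, no data recorded in the incremental database are also recorded in the original database. The proposed algorithm has several advantages. First, we rearrange the relations according to an asterisk-annotated priority, which uses less space to store the data and also avoids generating non-incremental candidate instances. Second, we check the relations among the subsets before generating size-k candidate instances, which lets us avoid generating non-clique instances entirely. Our method therefore avoids generating non-incremental candidate instances and non-clique instances, and it also avoids storing duplicate data. Our experimental results show that the Non-Overlapping algorithm requires less time and generates fewer candidate instances than the EUCOLOC algorithm.

(Keywords: Incremental Database, Spatial Co-location Patterns, Spatial Co-location Rules, Spatial Database, Spatial Data Mining)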
Abstract
With the explosive growth of information in recent years, big data has become an unavoidable research topic, and mining the information of interest from large volumes of data and putting it to use has become important and necessary. Finding spatial co-location patterns, that is, sets of spatial features whose instances frequently appear near one another, in an incremental database is an interesting and essential problem. An incremental database is one in which the whole database grows after data are inserted into or deleted from the original database over time. Because real-world data change over time, mining such databases has applications in many areas, including location-based services (LBS), environmental ecology, and business behavior patterns. In this thesis, we consider only the insertion of data. Many spatial co-location pattern mining approaches target traditional spatial databases, so they do not need to handle the candidate instances generated as the data change or to update the participation index. Yoo et al. proposed the EUCOLOC algorithm to mine co-location patterns in an incremental database. When new data are inserted, this method may record duplicate data in both the original database and the incremental database. It generates many candidate instances and then deletes some of them by checking whether their subsets exist, updating the participation index and the co-location patterns in the process. However, this method not only needs a large amount of storage for the points in the database and their relationships, but also generates many unnecessary candidate instances (i.e., non-incremental candidate instances).
In this thesis, we propose a non-overlapping approach to mining co-location patterns in an incremental database. The non-overlapping algorithm avoids recording duplicate edges in the original and incremental databases; in other words, no edge recorded in the incremental database is also recorded in the original database. First, we rearrange the relations according to an asterisk-annotation priority, which uses less storage and also avoids generating non-incremental candidate instances. Second, we check the relations among the subsets before generating size-k candidate instances, which avoids generating non-clique instances completely. Our method thus avoids generating non-incremental and non-clique candidate instances and avoids storing duplicate data, resulting in better performance for spatial co-location pattern mining.

(Keywords: Incremental Database, Spatial Co-location Patterns, Spatial Co-location
Rules, Spatial Database, Spatial Data Mining)
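
The ideas in the abstract can be illustrated with a small Python sketch. It is not the thesis's Non-Overlapping algorithm (the asterisk-annotation priority is not described in this record), and it is not EUCOLOC. Under simple assumptions it only shows (1) storing each neighbor edge exactly once, either with the original data or with the inserted data, (2) checking the clique relation before extending a candidate instance, and (3) computing the participation index by its usual definition: the minimum, over a pattern's features, of the fraction of that feature's instances that appear in some row instance. The point layout, the distance threshold `d`, and all function names are hypothetical. For brevity the sketch enumerates all clique instances and then filters the incremental ones, whereas the thesis aims to avoid generating non-incremental candidates in the first place.

```python
from itertools import combinations
from math import dist

# Hypothetical data: point id -> (feature, x, y).
original = {"A1": ("A", 0, 0), "B1": ("B", 1, 0), "C1": ("C", 0, 1), "A2": ("A", 5, 5)}
inserted = {"B2": ("B", 0.5, 0.5), "B3": ("B", 5, 6)}
d = 1.5  # assumed neighbor distance threshold


def neighbor_edges(points, d, touching=None):
    """Neighbor pairs between different features within distance d.

    If `touching` is given, keep only edges with at least one endpoint in it.
    """
    edges = set()
    for (a, pa), (b, pb) in combinations(points.items(), 2):
        if touching is not None and a not in touching and b not in touching:
            continue
        if pa[0] != pb[0] and dist(pa[1:], pb[1:]) <= d:
            edges.add(frozenset((a, b)))
    return edges


def row_instances(pattern, points, edges):
    """Clique instances of `pattern`, grown one feature at a time.

    A point is appended only if it neighbors every point already chosen,
    so non-clique instances are never generated.
    """
    by_feature = {f: [i for i, p in points.items() if p[0] == f] for f in pattern}
    partial = [(i,) for i in by_feature[pattern[0]]]
    for f in pattern[1:]:
        partial = [inst + (j,) for inst in partial for j in by_feature[f]
                   if all(frozenset((i, j)) in edges for i in inst)]
    return partial


def participation_index(pattern, instances, points):
    """Minimum participation ratio over the features of `pattern`."""
    ratios = []
    for pos, f in enumerate(pattern):
        total = sum(1 for p in points.values() if p[0] == f)
        participating = {inst[pos] for inst in instances}
        ratios.append(len(participating) / total)
    return min(ratios)


# Edges among original points are computed once; after the insertion only
# edges touching a new point are added, so no edge is stored twice.
old_edges = neighbor_edges(original, d)
updated = {**original, **inserted}
inc_edges = neighbor_edges(updated, d, touching=set(inserted))
all_edges = old_edges | inc_edges

abc = ("A", "B", "C")
abc_instances = row_instances(abc, updated, all_edges)
# Instances containing at least one inserted point are the incremental ones.
abc_incremental = [inst for inst in abc_instances if set(inst) & set(inserted)]
abc_pi = participation_index(abc, abc_instances, updated)
print(abc_incremental, abc_pi)
```

In this toy layout the only incremental instance of the pattern {A, B, C} is the one that uses the inserted point B2; the instance made up solely of original points is still a valid row instance but is not incremental. The all-pairs distance check is chosen for readability only; a practical implementation would likely use a spatial index.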
目次 Table of Contents
[THESIS VALIDATION LETTER+i]
[ACKNOWLEDGEMENTS+ii]
[ABSTRACT (CHINESE)+iii]
[ABSTRACT (ENGLISH)+iv]
[LIST OF FIGURES+vii]
[LIST OF TABLES+x]
[1. Introduction+1]
[1.1 Spatial Association Rules+1]
[1.2 Spatial Co-location Pattern Mining+3]
[1.3 Applications+6]
[1.4 Related Works+7]
[1.5 Motivation+9]
[1.6 Organization of the Thesis+11]
[2. A Survey of Spatial Co-Location Patterns Mining Approaches+12]
[2.1 The Join-Based Approach to Mining Co-Location Patterns+12]
[2.2 The Partial Join and the Joinless Approach to Mining Co-Location Patterns+15]
[2.2.1 The Partial Join Approach+15]
[2.2.2 The Joinless Approach+15]
[2.3 MAXColoc Algorithm to Mining Maximal Co-located Event Sets+18]
[2.3.1 Candidate Generation+18]
[2.3.2 Candidate Pruning+19]
[2.3.3 Candidate Instance Filtering+20]
[2.4 A Sparse-Graph and Condensed Tree-Based Approach to Mining Maximal Co-Location Patterns+21]
[2.4.1 Maximal Co-Locations Candidate Generation+23]
[2.4.2 Condensed Instance Tree Construction+24]
[2.5 Effectively Updating Co-location Patterns in Evolving Spatial Database+26]
[2.5.1 Incremental Co-location Patterns Mining+26]
[2.5.2 Generating Candidate Instances and Updating Co-location Patterns+28]
[3. A Non-Overlapping Approach to Mining Spatial Co-location Patterns in the Incremental Spatial Database+30]
[3.1 The Problem Statement of the Spatial Database+30]
[3.2 The Preprocessing Step for the Input of the Spatial Database+32]
[3.3 The Proposed Approach+36]
[3.4 A Comparison+49]
[4. Performance+52]
[4.1 The Performance Model+52]
[4.2 Experiment Results+53]
[5. Conclusion+63]
[5.1 Summary+63]
[5.2 Future Work+64]
[BIBLIOGRAPHY+65]
參考文獻 References
[1] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules
in Large Databases,” Proc. of the 20th Int. Conf. on Very Large Data Bases,
pp. 487–499, 1994.
[2] X. Bao, L. Wang, and J. Zhao, “Mining Top-k-Size Maximal Co-location Patterns,”
Proc. of the Int. Conf. on Computer, Information and Telecommunication
Systems, pp. 1–6, 2016.
[3] D. Eppstein, M. Löffler, and D. Strash, “Listing All Maximal Cliques in Sparse
Graphs in Near-Optimal Time,” Proc. of the 21st Int. Symposium on Algorithms
and Computation, pp. 403–414, 2010.
[4] J. He, Q. He, F. Qian, and Q. Chen, “Incremental Maintenance of Discovered
Spatial Colocation Patterns,” Proc. IEEE Int. Conf. on Data Mining Workshops,
ICDM Workshops 2008, pp. 399–407, 2008.
[5] Y. Huang, J. Pei, and H. Xiong, “Mining Colocation Patterns with Rare Events
from Spatial Data Sets,” GeoInformatica, Vol. 10, No. 3, pp. 239–260, Sept. 2006.
[6] Y. Huang, S. Shekhar, and H. Xiong, “Discovering Colocation Patterns from
Spatial Data Sets: A General Approach,” IEEE Trans. on Knowledge and Data
Engineering, Vol. 16, No. 12, pp. 1472–1485, Dec. 2004.
[7] Y. Huang and P. Zhang, “On the Relationships between Clustering and Spatial
Co-location Pattern Mining,” Proc. of the 18th IEEE Int. Conf. on Tools with
Artificial Intelligence, pp. 513–520, 2006.
[8] K. Koperski and J. Han, “Discovery of Spatial Association Rules in Geographic
Information Databases,” Proc. of the 4th Int. Symposium on Advances
in Spatial Databases, pp. 47–66, 1995.
[9] S. Kwan Kim, Y. Kim, and U. Kim, “Maximal Cliques Generating Algorithm
for Spatial Co-location Pattern Mining,” Proc. of the 8th FIRA Int. Conf. on
Secure and Trust Computing, Data Management and Applications, pp. 241–250, 2011.
[10] J. Lu, L. Wang, Y. Fang, and X. Bao, “A Novel Method on Incremental Mining
of Spatial Co-locations,” Proc. of the 3rd IEEE Int. Conf. on Big Data and Smart
Computing, pp. 69–76, Jan. 2016.
[11] R. Nehri and M. Nagori, “Spatial Co-location Patterns Mining,” International
Journal of Computer Applications, Vol. 93, No. 12, pp. 21–25, May 2014.
[12] S. Shekhar and Y. Huang, “Discovering Spatial Co-location Patterns: A Summary
of Results,” Proc. of the 7th Int. Symposium on Advances in Spatial and
Temporal Databases, pp. 236–256, 2001.
[13] L. Wang, Y. Bao, J. Lu, and J. Yip, “A New Join-less Approach for Colocation
Pattern Mining,” Proc. of the 8th IEEE Int. Conf. on Computer and Information
Technology, CIT 2008, pp. 197–202, 2008.
[14] L. Wang, L. Zhou, J. Lu, and J. Yip, “An Order-Clique-Based Approach for
Mining Maximal Co-locations,” Information Sciences, Vol. 179, No. 19, pp. 3370–
3382, Sept. 2009.
[15] S. Yang, L. Wang, X. Bao, and J. Lu, “A Framework for Mining Spatial High
Utility Co-Location Patterns,” Proc. of the 12th Int. Conf. on Fuzzy Systems
and Knowledge Discovery, pp. 595–601, 2015.
[16] X. Yao, L. Peng, L. Yang, and T. Chi, “A Fast Space-Saving Algorithm for Maximal
Co-location Pattern Mining,” Expert Systems with Applications, Vol. 63,
pp. 310–323, Nov. 2016.
[17] J. S. Yoo, D. Boulware, and D. Kimmey, “A Parallel Spatial Co-location Mining
Algorithm Based on MapReduce,” Proc. of IEEE Int. Congress on Big Data,
BigData Congress, pp. 25–31, 2014.
[18] J. S. Yoo and M. Bow, “Finding N-most Prevalent Colocated Event Sets,” Proc.
of the 11th Int. Conf. on Data Warehousing and Knowledge Discovery, pp. 415–427, 2009.
[19] J. S. Yoo and M. Bow, “Mining Maximal Co-Located Event Sets,” Proc. of the
15th Pacific-Asia Conf. on Advances in Knowledge Discovery and Data Mining,
pp. 351–362, 2011.
[20] J. S. Yoo and M. Bow, “Mining Top-k Closed Co-location Patterns,” Proc. of
IEEE Int. Conf. on Spatial Data Mining and Geographical Knowledge Services, pp. 100–105, 2011.
[21] J. S. Yoo and M. Bow, “Mining Spatial Colocation Patterns: A Different Framework,”
Data Mining and Knowledge Discovery, Vol. 24, No. 1, pp. 159–194, Jan. 2012.
[22] J. S. Yoo and S. Shekhar, “A Joinless Approach for Mining Spatial Colocation
Patterns,” IEEE Trans. on Knowledge and Data Engineering, Vol. 18, No. 10,
pp. 1323–1337, Oct. 2006.
[23] J. S. Yoo, S. Shekhar, and M. Celik, “A Joinless Approach for Colocation Pattern
Mining: A Summary of Results,” Proc. of IEEE Int. Conf. on Data Mining,
ICDM, pp. 813–816, 2005.
[24] J. S. Yoo, S. Shekhar, J. Smith, and J. P. Kumquat, “A Partial Join Approach
for Mining Co-location Patterns,” Proc. of the 12th Annual ACM Int. Workshop
on Geographic Information Systems, pp. 241–249, 2004.
[25] J. S. Yoo and H. Vasudevan, “Effectively Updating Co-location Patterns in
Evolving Spatial Databases,” Proc. of the 6th Int. Conf. on Pervasive Patterns
and Applications, pp. 96–99, 2014.
[26] W. Yu, “Spatial Co-location Pattern Mining for Location-Based Services in Road
Networks,” Expert Systems with Applications, Vol. 46, pp. 324–335, March 2016.
[27] R. L. Zala, B. B. Mehta, and M. R. Zala, “A Survey on Spatial Co-location
Patterns Discovery from Spatial Datasets,” International Journal of Computer
Trends and Technology, Vol. 7, No. 3, pp. 137–142, Jan. 2014.
電子全文 Fulltext
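This electronic full text is licensed to users only for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it, so as to avoid breaking the law.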
論文使用權限 Thesis access permission: 自定論文開放時間 user-defined
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
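Public information on printed theses is relatively complete from academic year 102 onward. To inquire about public information on printed theses from academic year 101 or earlier, please contact the printed thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience.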
開放時間 available 已公開 available
