論文使用權限 Thesis access permission: 自定論文開放時間 user-defined
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available
論文名稱 Title |
在空間資料庫中以個數順序實例樹來挖掘最大共同樣式的方法 The Count-Ordered Instances-Tree for Discovering Maximal Co-Location Patterns in Spatial Databases |
||
系所名稱 Department |
|||
畢業學年期 Year, semester |
語文別 Language |
||
學位類別 Degree |
頁數 Number of pages |
83 |
|
研究生 Author |
|||
指導教授 Advisor |
|||
召集委員 Convenor |
|||
口試委員 Advisory Committee |
|||
口試日期 Date of Exam |
2017-06-16 |
繳交日期 Date of Submission |
2017-06-23 |
關鍵字 Keywords |
空間共同位置規則、空間資料探勘、空間資料庫、最大共同樣式、空間共同位置樣式 Spatial Data Mining, Spatial Database, Spatial Co-location Rules, Maximal Co-location Patterns, Spatial Co-location Patterns |
||
統計 Statistics |
本論文已被瀏覽 5646 次,被下載 31 次 This thesis has been browsed 5646 times and downloaded 31 times. |
中文摘要 |
在最近幾年,資訊量快速地增加,如何能有效地利用蒐集到的大量資料,從中找出潛在但是對我們有幫助的資訊這將是很重要且必須的。而尋找頻繁出現在鄰近空間的共同樣式,便是空間探勘中一種令人感興趣的議題。它可廣泛地應用於許多領域,包括手機服務和交通管理。許多探勘空間共同的演算法(the full-join, the partial-join, the join-less)都採用join-based 演算法,即是Apriori-like的方法,產生長度為k的共同樣式前,必須先探勘出長度為(k-1)的共同樣式。Xiaojing Yao等學者提出一種sparse-graph and condensed tree-based (簡稱SGCT)的方法來探勘出最大共同位置樣式,這個方法不同於先前的那些方法。它會先利用無方向圖先探勘出最大共同位置樣式(Maximal co-location patterns)的候選集,再用condensed-tree結構來儲存空間中候選集的實例,這方法改善了過去使用表格或是建樹的方式來探勘出候選集,因此SGCT的效能及儲存空間優於其他兩種找最大共同樣式的演算法order-clique-based和MAXColoc。但是,在資料量越來越大的時候,SGCT可能會儲存到非常多節點的樹,因為他們的方法中建樹的順序是依字母排序。在此論文中,我們提出了一種刪除的新策略來過濾沒有超過門檻值的候選集。再則,我們利用9D-SPA中的公式將每個事件(event)之間的關係用不同的唯一值來表示,以利於我們在儲存的表格中方便查詢所需要的資訊。我們用個數順序實例樹來記錄空間中找到的候選集間彼此的關係。我們的方法總是可以比SGCT方法使用更少的節點來找到候選集的實例,主要是因為我們在產生個數順序實例樹時,有先依照彼此關係數的排列來排序建樹的先後。從我們的實驗結果,顯示出我們的方法不論是在密度高或是密度低的空間資料庫中做探勘,所花的時間及儲存都會較SGCT方法少。 |
Abstract |
In recent years, the volume of information has grown at a remarkable rate, so it is important to use the collected data effectively and to discover the latent but useful information in it. Finding spatial co-location patterns, that is, sets of events that frequently appear near one another in space, is an interesting problem in spatial data mining, with applications in many areas, including mobile phone services and traffic management. Many co-location mining algorithms (the full-join, the partial-join, and the join-less) are join-based, Apriori-like methods: the size-(k-1) prevalent co-locations must be mined before the size-k prevalent co-locations can be generated. Xiaojing Yao et al. proposed the sparse-graph and condensed tree-based (SGCT) algorithm for mining maximal co-location patterns, which differs from these earlier approaches. It first uses an undirected graph to mine the candidate maximal co-location patterns, and then uses a condensed-tree structure to store the instance cliques of the candidates in the spatial database. This improves on earlier methods that used tables or trees to discover the candidate sets, so SGCT performs better and requires less storage than two other maximal co-location algorithms, the order-clique-based approach and MAXColoc. However, as the amount of data grows, the SGCT algorithm may store a tree with a very large number of nodes, because it always builds the tree in alphabetical order. In this thesis, we propose a new strategy that considers the number of instances of each event. In our approach, we delete the candidates whose participation index is less than the defined threshold before we construct the tree for mining. 
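The threshold test mentioned above can be illustrated with the participation index as it is commonly defined in the co-location literature: for each event in a candidate, take the fraction of that event's instances that appear in at least one row instance of the candidate, and use the minimum of these fractions. The sketch below is only an illustration under that assumption. The function names, the tuple-based row layout, and the `min_prev` parameter are hypothetical, and the abstract does not say how the thesis evaluates the index before the tree is built.

```python
from typing import Dict, List, Sequence, Tuple

Candidate = Tuple[str, ...]   # event types, e.g. ("A", "B")
Row = Tuple[int, ...]         # one instance id per event, same order as the candidate


def participation_index(candidate: Candidate,
                        row_instances: Sequence[Row],
                        event_counts: Dict[str, int]) -> float:
    """Minimum, over the events of the candidate, of the participation ratio."""
    ratios = []
    for pos, event in enumerate(candidate):
        # Distinct instances of this event that take part in some row instance.
        distinct = {row[pos] for row in row_instances}
        ratios.append(len(distinct) / event_counts[event])
    return min(ratios)


def prune(candidates: List[Candidate],
          rows_by_candidate: Dict[Candidate, List[Row]],
          event_counts: Dict[str, int],
          min_prev: float) -> List[Candidate]:
    """Keep only the candidates whose participation index reaches min_prev."""
    return [c for c in candidates
            if participation_index(c, rows_by_candidate.get(c, []), event_counts) >= min_prev]


# Toy data (hypothetical): event A has 4 instances, event B has 2.
counts = {"A": 4, "B": 2}
rows = {("A", "B"): [(1, 1), (2, 1), (3, 2)]}
print(participation_index(("A", "B"), rows[("A", "B")], counts))  # 0.75
print(prune([("A", "B")], rows, counts, min_prev=0.8))            # []
```

In the toy data, three of the four instances of A and both instances of B participate, so the index is min(3/4, 2/2) = 0.75 and the candidate is dropped at a threshold of 0.8.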
Moreover, we apply the formula used in 9D-SPA to derive a unique value for each event-pair relation and use it as the key of a hash function, which makes it convenient to check through the hash structure whether an instance of an event pair exists. We propose a Count-Ordered Instances-tree to record the relations among the candidate sets. Our method finds the instances of the candidate cliques using fewer nodes than the SGCT algorithm, mainly because, when building the Count-Ordered Instances-tree, we sort the instances in increasing order of their number of relations. Our experimental results show that our approach takes less time and less storage space than the SGCT method on both dense and sparse spatial datasets. |
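Why the build order affects the number of tree nodes can be seen with a plain prefix tree. The sketch below is not the thesis's Count-Ordered Instances-tree, whose exact construction is not given in the abstract. It assumes, purely for illustration, that each row instance is inserted as a path of (event, instance id) nodes and that "count order" means events sorted by their number of distinct instances, ascending. All names here are hypothetical.

```python
from typing import Dict, List, Sequence

Row = Dict[str, int]  # event type -> instance id, e.g. {"A": 1, "B": 1}


def count_tree_nodes(rows: Sequence[Row], order: Sequence[str]) -> int:
    """Insert each row as a path following `order`; return the number of nodes created."""
    root: dict = {}
    nodes = 0
    for row in rows:
        node = root
        for event in order:
            key = (event, row[event])
            if key not in node:      # a new branch is needed at this level
                node[key] = {}
                nodes += 1
            node = node[key]
    return nodes


def count_order(rows: Sequence[Row]) -> List[str]:
    """Events sorted by their number of distinct instances, fewest first."""
    events = sorted(rows[0])
    distinct = {e: len({r[e] for r in rows}) for e in events}
    return sorted(events, key=lambda e: (distinct[e], e))


# Toy data (hypothetical): three A instances all neighbour the same B instance.
rows = [{"A": 1, "B": 1}, {"A": 2, "B": 1}, {"A": 3, "B": 1}]
print(count_tree_nodes(rows, sorted(rows[0])))    # alphabetical A, B -> 6
print(count_tree_nodes(rows, count_order(rows)))  # count order B, A  -> 4
```

In this toy case, placing the event with fewer distinct instances nearer the root lets the three rows share one B node, so the tree has 4 nodes instead of 6. This demonstrates only the prefix-sharing effect on one small input and says nothing about the general case.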
目次 Table of Contents |
THESIS VALIDATION LETTER ... i
ACKNOWLEDGEMENTS ... ii
ABSTRACT (CHINESE) ... iii
ABSTRACT (ENGLISH) ... iv
LIST OF FIGURES ... vii
LIST OF TABLES ... x
1. Introduction ... 1
1.1 Spatial Association Rules ... 1
1.2 Spatial Co-location Pattern Mining ... 3
1.3 Applications ... 5
1.4 Related Works ... 7
1.5 Motivation ... 9
1.6 Organization of the Thesis ... 10
2. A Survey of Approaches for Mining Spatial Co-Location Patterns ... 11
2.1 The Join-Based Approach to Mining Co-Location Patterns ... 11
2.2 The Partial Join and the Joinless Approach to Mining Co-Location Patterns ... 14
2.2.1 The Partial Join Approach ... 14
2.2.2 The Joinless Approach ... 14
2.3 An Order-Clique-Based Approach to Mining Maximal Co-Location Patterns ... 17
2.3.1 Generating Candidate Maximal Co-Locations ... 17
2.3.2 Identifying Co-Location Table Instances ... 18
2.4 A Sparse-Graph and Condensed Tree-Based Approach to Mining Maximal Co-Location Patterns ... 21
2.4.1 Generation of Candidate Maximal Co-Locations ... 23
2.4.2 Condensed Instance Tree Construction ... 24
3. The Count-Ordered Instances-Tree for Mining Spatial Co-location Patterns ... 26
3.1 The Preprocessing Step for the Input of the Spatial Database ... 26
3.2 The Proposed Method ... 31
3.3 A Comparison ... 49
4. Performance ... 51
4.1 The Performance Model ... 51
4.2 Experiment Results ... 52
5. Conclusion ... 65
5.1 Summary ... 65
5.2 Future Work ... 66
BIBLIOGRAPHY ... 67 |
參考文獻 References |
[1] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules in Large Databases,” Proc. of the 20th Int. Conf. on Very Large Data Bases, pp. 487–499, 1994.
[2] X. Bao, L. Wang, and J. Zhao, “Mining Top-k-Size Maximal Co-Location Patterns,” Int. Conf. on Computer, Information and Telecommunication Systems, pp. 1–6, 2016.
[3] M. Celik, J. M. Kang, and S. Shekhar, “Zonal Co-Location Pattern Discovery with Dynamic Parameters,” Proc. of the 7th IEEE Int. Conf. on Data Mining, pp. 433–438, 2007.
[4] B. R. Dai and M. Y. Lin, “Efficiently Mining Dynamic Zonal Co-Location Patterns Based on Maximal Co-Locations,” Proc. of the 11th IEEE Int. Conf. on Data Mining Workshops, pp. 861–868, 2011.
[5] W. Ding, C. Eick, J. Wang, and X. Yuan, “A Framework for Regional Association Rule Mining in Spatial Datasets,” Proc. of the 6th Int. Conf. on Data Mining, pp. 851–856, 2006.
[6] D. Eppstein, M. Löffler, and D. Strash, “Listing All Maximal Cliques in Sparse Graphs in Near-Optimal Time,” Proc. of the 21st Int. Symposium on Algorithms and Computation, pp. 403–414, 2010.
[7] G. Fang, J. Xiong, X. L. Du, and X. B. Tang, “Frequent Neighboring Class Set Mining,” Proc. of the 7th Int. Conf. on Fuzzy Systems and Knowledge Discovery, pp. 1442–1445, 2010.
[8] T. Hu, S. Y. Sung, H. Xiong, and Q. Fu, “Discovery of Maximum Length Frequent Itemsets,” Information Sciences, Vol. 178, No. 1, pp. 69–87, Jan. 2008.
[9] P. W. Huang and C. H. Lee, “Image Database Design Based on 9D-SPA Representation for Spatial Relations,” IEEE Trans. on Knowledge and Data Engineering, Vol. 16, No. 12, pp. 1486–1496, Dec. 2004.
[10] Y. Huang, S. Shekhar, and H. Xiong, “Discovering Colocation Patterns from Spatial Data Sets: A General Approach,” IEEE Trans. on Knowledge and Data Engineering, Vol. 16, No. 12, pp. 1472–1485, Dec. 2004.
[11] Y. Huang and P. Zhang, “On the Relationships between Clustering and Spatial Co-Location Pattern Mining,” Proc. of the 18th IEEE Int. Conf. on Tools with Artificial Intelligence, pp. 513–522, 2006.
[12] K. S. Kim, Y. Kim, and U. Kim, “Maximal Cliques Generating Algorithm for Spatial Co-Location Pattern Mining,” Proc. of the 8th FIRA Int. Conf. on Secure and Trust Computing, Data Management and Applications, pp. 241–250, 2011.
[13] K. Koperski and J. Han, “Discovery of Spatial Association Rules in Geographic Information Databases,” Proc. of the 4th Int. Symposium on Advances in Spatial Databases, pp. 47–66, 1995.
[14] Y. Morimoto, “Mining Frequent Neighboring Class Sets in Spatial Databases,” Proc. of the 7th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 353–358, 2001.
[15] S. Shekhar and Y. Huang, “Discovering Spatial Co-Location Patterns: A Summary of Results,” Proc. of the 7th Int. Symposium on Advances in Spatial and Temporal Databases, pp. 236–256, 2001.
[16] F. Verhein and G. Al-Naymat, “Fast Mining of Complex Spatial Co-Location Patterns Using GLIMIT,” Proc. of the 7th IEEE Int. Conf. on Data Mining Workshops, pp. 679–684, 2007.
[17] Y. Wan, J. Zhou, and F. Bian, “CODEM: A Novel Spatial Co-Location and De-location Patterns Mining Algorithm,” Proc. of the 5th Int. Conf. on Fuzzy Systems and Knowledge Discovery, pp. 576–580, 2008.
[18] L. Wang, Y. Bao, J. Lu, and J. Yip, “A New Join-less Approach for Co-Location Pattern Mining,” Proc. of the 8th IEEE Int. Conf. on Computer and Information Technology, pp. 197–202, 2008.
[19] L. Wang, K. Xie, T. Chen, and X. Ma, “Efficient Discovery of Multilevel Spatial Association Rules Using Partitions,” Information Software Technology, Vol. 47, No. 13, pp. 829–840, Oct. 2005.
[20] L. Wang, L. Zhou, J. Lu, and J. Yip, “An Order-Clique-Based Approach for Mining Maximal Co-Locations,” Information Sciences, Vol. 179, No. 19, pp. 3370–3382, Sept. 2009.
[21] S. Yang, L. Wang, X. Bao, and J. Lu, “A Framework for Mining Spatial High Utility Co-Location Patterns,” Proc. of the 12th Int. Conf. on Fuzzy Systems and Knowledge Discovery, pp. 595–601, 2015.
[22] X. Yao, L. Peng, L. Yang, and T. Chi, “A Fast Space-Saving Algorithm for Maximal Co-Location Pattern Mining,” Expert Systems with Applications, Vol. 63, pp. 310–323, Nov. 2016.
[23] J. S. Yoo, D. Boulware, and D. Kimmey, “A Parallel Spatial Co-Location Mining Algorithm Based on MapReduce,” Proc. of IEEE Int. Congress on Big Data, pp. 25–31, 2014.
[24] J. S. Yoo and M. Bow, “Mining Top-k Closed Co-Location Patterns,” Proc. of IEEE Int. Conf. on Spatial Data Mining and Geographical Knowledge Services, pp. 100–105, 2011.
[25] J. S. Yoo and M. Bow, “Mining Maximal Co-Located Event Sets,” Proc. of the 15th Pacific-Asia Conf. on Advances in Knowledge Discovery and Data Mining, pp. 351–362, 2011.
[26] J. S. Yoo and M. Bow, “Mining Spatial Colocation Patterns: A Different Framework,” Data Mining and Knowledge Discovery, Vol. 24, No. 1, pp. 159–194, Jan. 2012.
[27] J. S. Yoo and S. Shekhar, “A Joinless Approach for Mining Spatial Colocation Patterns,” IEEE Trans. on Knowledge and Data Engineering, Vol. 18, No. 10, pp. 1323–1337, Oct. 2006.
[28] J. S. Yoo, S. Shekhar, J. Smith, and J. P. Kumquat, “A Partial Join Approach for Mining Co-Location Patterns,” Proc. of the 12th Annual ACM Int. Workshop on Geographic Information Systems, pp. 241–249, 2004.
[29] W. Yu, “Spatial Co-Location Pattern Mining for Location-Based Services in Road Networks,” Expert Systems with Applications, Vol. 46, pp. 324–335, March 2016. |
電子全文 Fulltext |
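This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law. |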
紙本論文 Printed copies |
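Availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. 開放時間 Available: 已公開 available |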