博碩士論文 etd-0522117-094303 詳細資訊
Title page for etd-0522117-094303
論文名稱
Title
在空間資料庫中以個數順序實例樹來挖掘最大共同樣式的方法
The Count-Ordered Instances-Tree for Discovering Maximal Co-Location Patterns in Spatial Databases
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
83
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2017-06-16
繳交日期
Date of Submission
2017-06-23
關鍵字
Keywords
空間共同位置規則、空間資料探勘、空間資料庫、最大共同樣式、空間共同位置樣式
Spatial Data Mining, Spatial Database, Spatial Co-location Rules, Maximal Co-location Patterns, Spatial Co-location Patterns
統計
Statistics
本論文已被瀏覽 5646 次,被下載 31
This thesis/dissertation has been viewed 5646 times and downloaded 31 times.
中文摘要
在最近幾年,資訊量快速地增加,如何能有效地利用蒐集到的大量資料,從中找出潛在但是對我們有幫助的資訊這將是很重要且必須的。而尋找頻繁出現在鄰近空間的共同樣式,便是空間探勘中一種令人感興趣的議題。它可廣泛地應用於許多領域,包括手機服務和交通管理。許多探勘空間共同的演算法(the full-join, the partial-join, the join-less)都採用join-based 演算法,即是Apriori-like的方法,產生長度為k的共同樣式前,必須先探勘出長度為(k-1)的共同樣式。Xiaojing Yao等學者提出一種sparse-graph and condensed tree-based (簡稱SGCT)的方法來探勘出最大共同位置樣式,這個方法不同於先前的那些方法。它會先利用無方向圖先探勘出最大共同位置樣式(Maximal co-location patterns)的候選集,再用condensed-tree結構來儲存空間中候選集的實例,這方法改善了過去使用表格或是建樹的方式來探勘出候選集,因此SGCT的效能及儲存空間優於其他兩種找最大共同樣式的演算法order-clique-based和MAXColoc。但是,在資料量越來越大的時候,SGCT可能會儲存到非常多節點的樹,因為他們的方法中建樹的順序是依字母排序。在此論文中,我們提出了一種刪除的新策略來過濾沒有超過門檻值的候選集。再則,我們利用9D-SPA中的公式將每個事件(event)之間的關係用不同的唯一值來表示,以利於我們在儲存的表格中方便查詢所需要的資訊。我們用個數順序實例樹來記錄空間中找到的候選集間彼此的關係。我們的方法總是可以比SGCT方法使用更少的節點來找到候選集的實例,主要是因為我們在產生個數順序實例樹時,有先依照彼此關係數的排列來排序建樹的先後。從我們的實驗結果,顯示出我們的方法不論是在密度高或是密度低的空間資料庫中做探勘,所花的時間及儲存都會較SGCT方法少。
Abstract
In recent years, the volume of collected data has grown rapidly, so it is
important to use that data efficiently and to discover the latent but useful
information it contains. Finding spatial co-location patterns, which appear
frequently in nearby space, is an interesting problem in spatial data mining.
Such patterns are widely used in many areas, including mobile phone services
and traffic management. Many co-location mining algorithms, such as the
full-join, the partial-join, and the join-less approaches, are join-based and
follow an Apriori-like method: the prevalent size-(k-1) co-locations must be
mined before the prevalent size-k co-locations can be generated.
Xiaojing Yao et al. proposed the sparse-graph and condensed tree-based (SGCT)
algorithm for mining maximal co-location patterns, which differs from the
approaches mentioned above. It first uses an undirected graph to mine the
candidate maximal co-locations, and then uses a condensed-tree structure to
store the instance cliques of those candidates in the spatial database. This
improves on earlier methods, which used tables or trees to discover the
candidate sets. As a result, SGCT performs better and requires less storage
than two other algorithms for mining maximal co-location patterns: the
order-clique-based algorithm and MAXColoc. However, as the amount of data
grows, the SGCT algorithm may store a large number of nodes while generating
the tree, because it always builds the tree in alphabetical order. In this
thesis, we propose a new strategy that considers the number of instances of
each event. In our approach, we delete every candidate whose participation
index is less than the given threshold before we construct the tree for
mining. Moreover, we apply the formula used in 9D-SPA to derive a unique value
for each event-pair relation and use it as the key of a hash function, so that
a lookup in the hash structure tells us whether an instance of an event pair
exists. We propose the Count-Ordered Instances-tree to record the relations
among the candidates. Our method finds the instances of the candidate cliques
with fewer nodes than the SGCT algorithm. The main reason is that, when
building the Count-Ordered Instances-tree, we sort the instances by their
number of relations in increasing order. Our experimental results show that
our approach takes less time and less storage space than the SGCT method on
both dense and sparse spatial datasets.
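
The candidate-filtering step described in the abstract can be illustrated with the participation index as it is commonly defined in the co-location literature: the participation ratio of an event type in a pattern is the fraction of its instances that appear in at least one row instance of the pattern, and the participation index is the minimum of these ratios over the pattern's event types. The following Python sketch is only an illustration under that definition. The function names, the data layout, and the toy numbers are assumptions made for this example and are not taken from the thesis, whose own filter is applied before the tree is built and may be computed differently.

```python
from collections import defaultdict


def participation_index(pattern, row_instances, feature_counts):
    """Minimum participation ratio over the event types of `pattern`.

    pattern        -- tuple of event types, e.g. ("A", "B")
    row_instances  -- list of tuples of instance ids, aligned with `pattern`
    feature_counts -- total number of instances of each event type
    """
    participating = defaultdict(set)
    for row in row_instances:
        for feature, instance_id in zip(pattern, row):
            participating[feature].add(instance_id)
    # Ratio = distinct participating instances / all instances of that type.
    return min(len(participating[f]) / feature_counts[f] for f in pattern)


def prune_candidates(candidates, feature_counts, min_prev):
    """Keep only candidates whose participation index reaches `min_prev`."""
    kept = {}
    for pattern, rows in candidates.items():
        pi = participation_index(pattern, rows, feature_counts)
        if pi >= min_prev:
            kept[pattern] = pi
    return kept


if __name__ == "__main__":
    # Hypothetical toy data: 4 instances of A, 5 of B, 2 of C.
    counts = {"A": 4, "B": 5, "C": 2}
    candidates = {
        ("A", "B"): [(1, 1), (2, 1), (3, 4)],  # ratios: A 3/4, B 2/5
        ("A", "C"): [(1, 1), (4, 2)],          # ratios: A 2/4, C 2/2
    }
    print(prune_candidates(candidates, counts, min_prev=0.5))
```

With a threshold of 0.5, the candidate ("A", "B") is dropped because only two of the five B instances participate, while ("A", "C") is kept.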
目次 Table of Contents
THESIS VALIDATION LETTER
ACKNOWLEDGEMENTS
ABSTRACT (CHINESE)
ABSTRACT (ENGLISH)
LIST OF FIGURES
LIST OF TABLES
1. Introduction
1.1 Spatial Association Rules
1.2 Spatial Co-location Pattern Mining
1.3 Applications
1.4 Related Works
1.5 Motivation
1.6 Organization of the Thesis
2. A Survey of Approaches for Mining Spatial Co-Location Patterns
2.1 The Join-Based Approach to Mining Co-Location Patterns
2.2 The Partial Join and the Joinless Approach to Mining Co-Location Patterns
2.2.1 The Partial Join Approach
2.2.2 The Joinless Approach
2.3 An Order-Clique-Based Approach to Mining Maximal Co-Location Patterns
2.3.1 Generating Candidate Maximal Co-Locations
2.3.2 Identifying Co-Location Table Instances
2.4 A Sparse-Graph and Condensed Tree-Based Approach to Mining Maximal Co-Location Patterns
2.4.1 Generation of Candidate Maximal Co-Locations
2.4.2 Condensed Instance Tree Construction
3. The Count-Ordered Instances-Tree for Mining Spatial Co-location Patterns
3.1 The Preprocessing Step for the Input of the Spatial Database
3.2 The Proposed Method
3.3 A Comparison
4. Performance
4.1 The Performance Model
4.2 Experiment Results
5. Conclusion
5.1 Summary
5.2 Future Work
BIBLIOGRAPHY
參考文獻 References
[1] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules in Large Databases,” Proc. of the 20th Int. Conf. on Very Large Data Bases, pp. 487–499, 1994.
[2] X. Bao, L. Wang, and J. Zhao, “Mining Top-k-Size Maximal Co-Location Patterns,” Int. Conf. on Computer, Information and Telecommunication Systems, pp. 1–6, 2016.
[3] M. Celik, J. M. Kang, and S. Shekhar, “Zonal Co-Location Pattern Discovery with Dynamic Parameters,” Proc. of the 7th IEEE Int. Conf. on Data Mining, pp. 433–438, 2007.
[4] B. R. Dai and M. Y. Lin, “Efficiently Mining Dynamic Zonal Co-Location Patterns Based on Maximal Co-Locations,” Proc. of the 11th IEEE Int. Conf. on Data Mining Workshops, pp. 861–868, 2011.
[5] W. Ding, C. Eick, J. Wang, and X. Yuan, “A Framework for Regional Association Rule Mining in Spatial Datasets,” Proc. of the 6th Int. Conf. on Data Mining, pp. 851–856, 2006.
[6] D. Eppstein, M. Löffler, and D. Strash, “Listing All Maximal Cliques in Sparse Graphs in Near-Optimal Time,” Proc. of the 21st Int. Symposium on Algorithms and Computation, pp. 403–414, 2010.
[7] G. Fang, J. Xiong, X. L. Du, and X. B. Tang, “Frequent Neighboring Class Set Mining,” Proc. of the 7th Int. Conf. on Fuzzy Systems and Knowledge Discovery, pp. 1442–1445, 2010.
[8] T. Hu, S. Y. Sung, H. Xiong, and Q. Fu, “Discovery of Maximum Length Frequent Itemsets,” Information Sciences, Vol. 178, No. 1, pp. 69–87, Jan. 2008.
[9] P. W. Huang and C. H. Lee, “Image Database Design Based on 9D-SPA Representation for Spatial Relations,” IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 12, pp. 1486–1496, Dec. 2004.
[10] Y. Huang, S. Shekhar, and H. Xiong, “Discovering Colocation Patterns from Spatial Data Sets: A General Approach,” IEEE Trans. on Knowledge and Data Engineering, Vol. 16, No. 12, pp. 1472–1485, Dec. 2004.
[11] Y. Huang and P. Zhang, “On the Relationships between Clustering and Spatial Co-Location Pattern Mining,” Proc. of the 18th IEEE Int. Conf. on Tools with Artificial Intelligence, pp. 513–522, 2006.
[12] K. S. Kim, Y. Kim, and U. Kim, “Maximal Cliques Generating Algorithm for Spatial Co-Location Pattern Mining,” Proc. of the 8th FIRA Int. Conf. on Secure and Trust Computing, Data Management and Applications, pp. 241–250, 2011.
[13] K. Koperski and J. Han, “Discovery of Spatial Association Rules in Geographic Information Databases,” Proc. of the 4th Int. Symposium on Advances in Spatial Databases, pp. 47–66, 1995.
[14] Y. Morimoto, “Mining Frequent Neighboring Class Sets in Spatial Databases,” Proc. of the 7th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 353–358, 2001.
[15] S. Shekhar and Y. Huang, “Discovering Spatial Co-Location Patterns: A Summary of Results,” Proc. of the 7th Int. Symposium on Advances in Spatial and Temporal Databases, pp. 236–256, 2001.
[16] F. Verhein and G. Al-Naymat, “Fast Mining of Complex Spatial Co-Location Patterns Using GLIMIT,” Proc. of the 7th IEEE Int. Conf. on Data Mining Workshops, pp. 679–684, 2007.
[17] Y. Wan, J. Zhou, and F. Bian, “CODEM: A Novel Spatial Co-Location and De-location Patterns Mining Algorithm,” Proc. of the 5th Int. Conf. on Fuzzy Systems and Knowledge Discovery, pp. 576–580, 2008.
[18] L. Wang, Y. Bao, J. Lu, and J. Yip, “A New Join-less Approach for Co-Location Pattern Mining,” Proc. of the 8th IEEE Int. Conf. on Computer and Information Technology, pp. 197–202, 2008.
[19] L. Wang, K. Xie, T. Chen, and X. Ma, “Efficient Discovery of Multilevel Spatial Association Rules Using Partitions,” Information and Software Technology, Vol. 47, No. 13, pp. 829–840, Oct. 2005.
[20] L. Wang, L. Zhou, J. Lu, and J. Yip, “An Order-Clique-Based Approach for Mining Maximal Co-Locations,” Information Sciences, Vol. 179, No. 19, pp. 3370–3382, Sept. 2009.
[21] S. Yang, L. Wang, X. Bao, and J. Lu, “A Framework for Mining Spatial High Utility Co-Location Patterns,” Proc. of the 12th Int. Conf. on Fuzzy Systems and Knowledge Discovery, pp. 595–601, 2015.
[22] X. Yao, L. Peng, L. Yang, and T. Chi, “A Fast Space-Saving Algorithm for Maximal Co-Location Pattern Mining,” Expert Systems with Applications, Vol. 63, pp. 310–323, Nov. 2016.
[23] J. S. Yoo, D. Boulware, and D. Kimmey, “A Parallel Spatial Co-Location Mining Algorithm Based on MapReduce,” Proc. of IEEE Int. Congress on Big Data, pp. 25–31, 2014.
[24] J. S. Yoo and M. Bow, “Mining Top-k Closed Co-Location Patterns,” Proc. of IEEE Int. Conf. on Spatial Data Mining and Geographical Knowledge Services, pp. 100–105, 2011.
[25] J. S. Yoo and M. Bow, “Mining Maximal Co-Located Event Sets,” Proc. of the 15th Pacific-Asia Conf. on Advances in Knowledge Discovery and Data Mining, pp. 351–362, 2011.
[26] J. S. Yoo and M. Bow, “Mining Spatial Colocation Patterns: A Different Framework,” Data Mining and Knowledge Discovery, Vol. 24, No. 1, pp. 159–194, Jan. 2012.
[27] J. S. Yoo and S. Shekhar, “A Joinless Approach for Mining Spatial Colocation Patterns,” IEEE Trans. on Knowledge and Data Engineering, Vol. 18, No. 10, pp. 1323–1337, Oct. 2006.
[28] J. S. Yoo, S. Shekhar, J. Smith, and J. P. Kumquat, “A Partial Join Approach for Mining Co-Location Patterns,” Proc. of the 12th Annual ACM Int. Workshop on Geographic Information Systems, pp. 241–249, 2004.
[29] W. Yu, “Spatial Co-Location Pattern Mining for Location-Based Services in Road Networks,” Expert Systems with Applications, Vol. 46, pp. 324–335, March 2016.
電子全文 Fulltext
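This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.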
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
開放時間 available 已公開 available
