國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,高效率重複樣式探勘演算法,An Efficient Algorithm for Mining Repeating Patterns

論文名稱 Title	高效率重複樣式探勘演算法 An Efficient Algorithm for Mining Repeating Patterns
系所名稱 Department	資訊工程學系 Department of Computer Science and Engineering
畢業學年期 Year, semester	104 學年度第 2 學期 The spring semester of Academic Year 104	語文別 Language	英文 English
學位類別 Degree	碩士 Master	頁數 Number of pages	53
研究生 Author	廖志峰 Zhi-Feng Liao
指導教授 Advisor	洪宗貝 Tzung-Pei Hong
召集委員 Convenor	林文揚 Wen-yang Lin
口試委員 Advisory Committee	江明朝, 陳俊豪 Ming-chao Chiang; Chun Hao Chen
口試日期 Date of Exam	2016-07-13	繳交日期 Date of Submission	2016-08-18
關鍵字 Keywords	重複樣式索引、後綴樹、重複樣式、快速重複樣式探勘、資料探勘 data mining, fast mining of repeating patterns, repeating pattern, suffix tree, repeating-pattern-index
統計 Statistics	本論文已被瀏覽 5754 次，被下載 44 次 The thesis/dissertation has been browsed 5754 times, has been downloaded 44 times.

中文摘要
重複樣式是由一群可辨別的重複元素所組成，在現實生活中，其實有很多的應用，如在音樂序列和醫學領域的序列裡，會含有價值的重複樣式。在許多應用中重複樣式會隱藏在序列裡被視為隱性的知識，因此過去幾十年中，如何有效地獲取有價值的重複樣式一直是一個熱門議題。近幾年來，雖然提出了一些相關的研究來處理這一類的問題，但其表現仍無法獲得需要處理大量數據用戶的滿意。針對這個問題，在本文中我們提出了一個高效率的演算法，名為快速重複樣式探勘，其利用我們所設計的快速樣式索引來展現重複樣式探勘的效能。由於快速樣式索引可以顯現出每個樣式的位置資訊，因此能夠有效提供快速重複樣式的探勘。而一個給定序列只需掃描該序列一次就可以發現重複樣式，並不需要重複的掃描。最後，實驗結果也證明我們演算法的執行性能比其他的演算法還要好。
Abstract
A repeating pattern is composed of identifiable elements that repeat in a predictable manner. In our real life, there are lots of applications such as musical and medical sequences containing valuable repeating patterns. The repeating patterns hidden in sequences might be viewed as implicit knowledge for identification of objects. Hence, how to efficiently and effectively retrieve the valuable repeating patterns has been a hot issue in the last decades. Although a number of studies were proposed to deal with this issue, the performance cannot still earn users' satisfactions if large data are processed. To aim at this issue, in this thesis, we propose an efficient algorithm named Fast Mining of Repeating Patterns (FMRP) that achieves high performance of mining repeating patterns by our designed Quick-Pattern-Index (QPI). This index can provide the proposed FMRP algorithm with an effective support because it contains the information of pattern positions. Without scanning a given sequence iteratively, the repeating patterns can be discovered by only one scan of the sequence. The experimental results reveal that our proposed algorithm performs better than the compared method in finding the repeating patterns.

目次 Table of Contents
論文審定書 i 致謝 ii 摘要 iii Abstract iv Contents v List of Figures vii List of Tables viii CHAPTER 1 Introduction 1 1.1 Background 1 1.2 Motivation 4 1.3 Contribution 5 1.4 Thesis Organization 5 CHAPTER 2 Related Work 7 2.1 RP-tree Approach 7 2.2 Using Suffix Tree to Search Non-Trivial Repeating Patterns 14 2.3 Applications of Repeating Patterns 17 CHAPTER 3 The Proposed Method 20 3.1 Overview 20 3.2 Illustrative Examples 24 CHAPTER 4 Empirical Study 29 4.1 Experimental Settings for Data Generations 29 4.2 Experimental Results on Different Data Settings 30 4.3 Experimental Comparisons between Suffix Tree and the Proposed Method 36 CHAPTER 5 Conclusion and Future Work 37 5.1 Conclusion 37 5.2 Future Work 38 References 39

參考文獻 References
[1] W. Aguilar, Y. Frauel, F. Escolano, M. E. Martinez-Perez, A. Espinosa-Romero, and M. A. Lozano, “A Robust Graph Transformation Matching for Non-Rigid Registration,” Image and Vision Computing, Vol. 27, Issue 7, pp. 897-910, 2009. [2] S. Blackburn and D. DeRoure, “A Tool for Content-based Navigation of Music,” The Sixth ACM International Conference on Multimedia, pp. 361-368, 1998. [3] V. Bakhmutova, V. D. Gusev, and T. N. Tikova, “The Search for Adaptations in Song Melodies,” Computer Music Journal, Vol. 21, No. 1, pp. 58-67, 1997. [4] M. A. Bartsch, and G. H. Wakefield, “To Catch a Chorus: Using Chroma-Based Representation for Audio Thumb-nailing,” The 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics, pp. 15-19, 2001. [5] C. Le Brese, J. J. Zou, and Brian Uy, “An Improved ASIFT Algorithm for Matching Repeating Patterns,” The 17th IEEE International Conference on Image Processing, pp. 2949-2952, 2010. [6] K. R. Castleman, Digital Image Processing, Prentice-Hall, 1979. [7] E. Chew, “Modeling Tonality: Applications to Music Cognition,” The Twenty-Third Annual Conference of the Cognitive Science, pp. 206-211, 2001. [8] J. C. C. Chen, and A. L. P. Chen, “Query by Rhythm: An Approach for Song Retrieval in Music Databases,” The 8th IEEE International Workshop on Research Issues in Data Engineering, pp. 139-146, 1998. [9] M. Cooper, and J. Foote, “Automatic Music Summarization via Similarity Analysis,” The 2002 IEEE International Conference on Music Information Retrieval, pp. 81-85, 2002. [10] M. T. Chen, and J. Seiferas, “Efficient and Elegant Subword-Tree Construction,” Combinatorial Algorithms on Words, Springer, pp. 97-107, 1984. [11] A. L. P. Chen, M. Chang, J. Chen, J. L. Hsu, C.H. Hsu, and S. Y. S. Hua, “Query by Music Segments: An Efficient Approach for Song Retrieval,” The 2000 IEEE International Conference on Multimedia and Expo, Vol. 2, pp. 873-876, 2000. [12] S. Friedman, and I. Stamos, “Online Detection of Repeated Structures in Point Clouds of Urban Scenes for Compression and Registration,” International Journal of Computer Vision, Vol. 102, Issue 1-3, pp. 112-128, 2013. [13] D. Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Press Syndicate of the University of Cambridge, 1997. [14] L. V. Gool, G. Zeng, P. Wonka, and P. Muller, “Image-Based Procedural Modeling of Facades,” The ACM SIGGRAPH Conference on Computer Graphics, pp. 63-130, 2007. [15] A. Ghias, J. Logan, D. Chamberlin, and B. C. Smith, “Query by Humming: Musical Information Retrieval in an Audio Database,” The Third ACM International Conference on Multimedia, pp. 231-236, 1995. [16] B. J. Han, E. Hwang, and S. Rho, “An Efficient Voice Transcription Scheme for Music Retrieval,” The 2007 IEEE International Conference on Multimedia and Ubiquitous Engineering , pp. 28-26, 2007. [17] J. L. Hsu, C. C. Liu, and L. P. Chen, “Discovering Non-Trivial Repeating Patterns in Music Data,” IEEE Transactions on Multimedia, Vol. 3, No. 3, pp. 311-325, 2001. [18] C. L. Krumbansl, Cognitive Foundations of Musical Pitch, Press of Oxford University, 1990. [19] D. G. Lowe, “Object Recognition from Local Scale-Invariant Features,” The Seventh IEEE International Conference on Computer Vision, Vol. 2, pp. 1150-1157, 1999. [20] J. Liu, E. Psarakis, and I. Stamos, “Automatic Kronecker Product Model Based Detection of Repeated Patterns in 2D Urban Images,” The 2013 IEEE International Conference on Computer Vision, pp. 401-408, 2013. [21] C. C. Liu, J. L. Hsu, and L. P. A. Chen “Efficient Theme and Non-Trivial Repeating Pattern Discovering in Music Databases,” The 15th IEEE International Conference on Data Engineering, pp. 14-21, 1999. [22] Y. L. Lo, and W. L. Li “Linear Time for Discovering Non-trivial Repeating Patterns in Music Databases,” The 2004 IEEE International Conference on Multimedia and Expo, Vol. 1, pp. 293-296, 2004. [23] E. McCreight, “A Space-Economical Suffix Tree Construction Algorithm,” Journal of the ACM, Vol. 23, No. 2, pp. 262-271, 1976. [24] Y. F. Ma, L. Lu, H. J. Zhang, and M .J. Li, “A User Attention Model for Video Summarization,” The Tenth ACM International Conference on Multimedia, pp. 533-542, 2002. [25] J. M. Morel, and G. Yu, “ASIFT: A New Framework for Fully Affine Invariant Image Comparison,” SIAM Journal on Imaging Sciences, Vol. 2, Issue 2, pp. 438-469, 2009. [26] M. A. Bartsch, and G. H. Wakefield, “Audio Thumb-Nailing of Popular Music Using Chroma-based representations,” IEEE Transactions on Multimedia, Vol. 7, Issue 1, pp. 96-104, 2005. [27] S. Pfeiffer, S. Fischer, and W. Effelsberg, “Automatic Audio Content Analysis,” The Fourth ACM International Conference on Multimedia, pp. 21-30, 1996. [28] Y. Rui, T. S. Huang, and S. Mehrotra, “Constructing Table-of-Content for Videos,” Multimedia Systems, Vol. 7, Issue 5, pp. 359-368, 1999. [29] A. Singh, (2014). Ukkonen’s suffix tree construction. Retrieved from http://www.geeksforgeeks.org/ukkonens-suffix-tree-construction-part-6/. [30] O. Teboul, I. Kokkinos, L. Simon, P. Koutsourakis, and N. Paragios, “Shape Grammar Parsing via Reinforcement Learning,” The 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2273-2280, 2011. [31] E. Ukkonen, “On-Line Construction of Suffix Tree,” Algorithmica, Vol. 14, Issue 3, pp. 249-260, 1995. [32] P. Weiner, “Linear Pattern Matching Algorithms,” The 1973 IEEE Annual Symposium on Switching and Automata Theory, pp. 1-11, 1973. [33] D. William, and E. Brown, Theoretical Foundations of Music, Wadsworth Publishing Company, 1978. [34] Z. Wei, and J. Kosecka, “Generalized RANSAC Framework for Relaxed Correspondence Problems,” The Third IEEE International Symposium on 3D Data Processing, Visualization, and Transmission, pp. 14-16, 2006. [35] M. Wang, L. Lu, and H. H. Zhang “Repeating Pattern Discovery from Acoustic Musical Signals,” The 2004 IEEE International Conference on Multimedia and Expo, Vol. 3, pp. 2019-2022, 2004. [36] R. G. Xiao, Y. Y. Wang, H. Pan, and F. Wu, “Automatic Video Summarization by Spatio-Temporal Analysis and Non-Trivial Repeating Pattern Detection,” The 2008 IEEE Congress on Image and Signal Processing, pp. 555-559, 2008. [37] P. Zhao, T. Fang, J. Xiao, H. Zhang, Q. Zhao, and L. Quan, “Rectilinear Parsing of Architecture in Urban Environment,” The 2010 IEEE Computer Vision and Pattern Recognition, pp. 342-349, 2010.

電子全文 Fulltext
本電子全文僅授權使用者為學術研究之目的，進行個人非營利性質之檢索、閱讀、列印。請遵守中華民國著作權法之相關規定，切勿任意重製、散佈、改作、轉貼、播送，以免觸法。論文使用權限 Thesis access permission：自定論文開放時間 user define 開放時間 Available：校內 Campus：已公開 available 校外 Off-campus：已公開 available etd-0703116-150911.pdf
紙本論文 Printed copies
紙本論文的公開資訊在102學年度以後相對較為完整。如果需要查詢101學年度以前的紙本論文公開資訊，請聯繫圖資處紙本論文服務櫃台。如有不便之處敬請見諒。開放時間 available 已公開 available

QR Code

國立中山大學圖書與資訊處 │ 諮詢服務：2452 論文審查小組 │ 服務信箱 │ 系統開發維運：圖資處知識創新組

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2452 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS