National Sun Yat-sen University, thesis/dissertation, An Algorithm for Time Series Classification Using Dissimilarity Space

Title	An Algorithm for Time Series Classification Using Dissimilarity Space (original Chinese title: 使用差異空間之時間序列分類演算法)
Department	Department of Computer Science and Engineering
Year, semester	Fall semester of Academic Year 106 (ROC calendar)	Language	English
Degree	Master	Number of pages	65
Author	Shu-Kai Hung (洪書凱)
Advisor	Chang-Biau Yang (楊昌彪)
Convenor	Mei-Hui Guo (郭美惠)
Advisory Committee	Sun-Yuan Hsieh (謝孫源); Kuo-Si Huang (黃國璽); Tzung-Pei Hong (洪宗貝)
Date of Exam	2017-08-31	Date of Submission	2017-09-07
Keywords	SVM, Time series, Dynamic time warping, Dissimilarity space, Ensemble
Statistics	This thesis/dissertation has been browsed 5724 times and downloaded 81 times.

Abstract
The time series classification problem aims to assign a new time series to a suitable class and to give it the corresponding class label. This problem has been studied for decades. In this thesis, by combining the concepts of dissimilarity space and DTW barycenter averaging, we build feature vectors retrieved from the training set. Since seven distance functions are used to measure the distance between two time series, we obtain seven types of feature vectors, and with them we build seven SVM (support vector machine) classifiers. In addition, to improve the classification accuracy, we use the behavior-knowledge space (BKS) method to construct ensemble classifiers, each of which combines three of the seven SVM classifiers. Our experimental datasets were downloaded from the UCR website. As the experimental results show, a single SVM classifier achieves about 3% higher accuracy than the result of Jain and Spiegel [16], who also used the dissimilarity space. For the ensemble classifiers, the two best combinations are DTWW-LCS-ED and DTW-LCS-ED. The ensemble classifiers further improve the classification accuracy.
Keywords: Time series, Dynamic time warping, Dissimilarity space, SVM, Ensemble
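The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the squared point cost inside DTW, the choice of prototypes (the thesis derives its feature vectors with the help of DTW barycenter averaging, which is not reproduced here), and the plurality-vote fallback for unseen BKS cells are all assumptions made for illustration. The function names `dtw_distance`, `dissimilarity_features`, `bks_fit`, and `bks_predict` are hypothetical. SVM training on the resulting feature vectors is omitted.

```python
from collections import Counter, defaultdict
from math import inf, sqrt


def dtw_distance(a, b):
    """Plain dynamic time warping distance (no warping window).

    Assumption: squared difference as the point cost, square root at the end.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return sqrt(cost[n][m])


def dissimilarity_features(series, prototypes, distance=dtw_distance):
    """Embed a series as its vector of distances to the given prototypes.

    Assumption: prototypes are simply supplied by the caller.
    """
    return [distance(series, p) for p in prototypes]


def bks_fit(decisions, labels):
    """Build a behavior-knowledge space table.

    decisions: one tuple of base-classifier outputs per training sample.
    labels:    the true class of each training sample.
    """
    table = defaultdict(Counter)
    for votes, label in zip(decisions, labels):
        table[tuple(votes)][label] += 1
    return table


def bks_predict(table, votes):
    """Return the most frequent true class seen for this decision tuple."""
    cell = table.get(tuple(votes))
    if cell:
        return cell.most_common(1)[0][0]
    # Assumption: unseen decision tuples fall back to a plurality vote.
    return Counter(votes).most_common(1)[0][0]
```

In use, each of the seven distance functions would be passed as `distance` to produce one feature vector type, a separate classifier would be trained per type, and `bks_fit` would be given the decision tuples of three chosen classifiers on the training set.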

Table of Contents
THESIS AUTHORIZATION FORM
THANKS
ABSTRACT (Chinese)
ABSTRACT (English)
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS
Chapter 1. Introduction
Chapter 2. Preliminaries
    2.1 The Distance Measurements
        2.1.1 The Euclidean Distance
        2.1.2 Dynamic Time Warping Distance
        2.1.3 The Derivative Dynamic Time Warping Distance
        2.1.4 Dynamic Time Warping Distance with Warping Window
        2.1.5 Longest Common Subsequence
        2.1.6 Longest Common Subsequence with at Least Length k
        2.1.7 Variable Gap Longest Common Subsequence
    2.2 The Threshold for LCS-like Algorithms
    2.3 DTW Barycenter Averaging
    2.4 The Dissimilarity Space of Time Series
        2.4.1 The Classifier Learning in Dissimilarity Space
    2.5 Time Series Classification
        2.5.1 The Representation-Based Classification
    2.6 The Behavior Knowledge Space Method
Chapter 3. Our Algorithm
Chapter 4. Experimental Results
    4.1 The Experimental Datasets
    4.2 The Performance Comparison
    4.3 Classification with the Behavior Knowledge Space Method
Chapter 5. Conclusion
BIBLIOGRAPHY

References
[1] A. Bagnall, J. Lines, J. Hills, and A. Bostrom, "Time-series classification with COTE: the collective of transformation-based ensembles," IEEE Transactions on Knowledge and Data Engineering, Vol. 27, No. 9, pp. 2522-2535, 2015.
[2] R. Bellman and R. Kalaba, "On adaptive control processes," IRE Transactions on Automatic Control, Vol. 4, No. 2, pp. 1-9, 1959.
[3] G. Benson, A. Levy, S. Maimoni, D. Noifeld, and B. R. Shalom, "LCSk: a refined similarity measure," Theoretical Computer Science, Vol. 638, pp. 11-26, 2016.
[4] K. Buza, A. Nanopoulos, and L. Schmidt-Thieme, "Time-series classification based on individualised error prediction," Computational Science and Engineering (CSE), 2010 IEEE 13th International Conference on, pp. 48-54, IEEE, 2010.
[5] C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology (TIST), Vol. 2, No. 3, p. 27, 2011.
[6] P.-E. Danielsson, "Euclidean distance mapping," Computer Graphics and Image Processing, Vol. 14, No. 3, pp. 227-248, 1980.
[7] H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, and E. Keogh, "Querying and mining of time series data: experimental comparison of representations and distance measures," Proceedings of the VLDB Endowment, Vol. 1, No. 2, pp. 1542-1552, 2008.
[8] G.-C. Guo, K.-S. Huang, and C.-B. Yang, "Time series classification based on the longest common subsequence similarity and ensemble learning," Proceedings of Symposium on Digital Life Technologies Symposium, DLT2016, Pingtung, Taiwan, pp. 84-91, June 2016.
[9] J. D. Hamilton, Time Series Analysis, Vol. 2. Princeton University Press, Princeton, 1994.
[10] J. A. Hartigan and M. A. Wong, "Algorithm AS 136: A k-means clustering algorithm," Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 28, No. 1, pp. 100-108, 1979.
[11] D. S. Hirschberg, "Algorithms for the longest common subsequence problem," Journal of the ACM (JACM), Vol. 24, No. 4, pp. 664-675, 1977.
[12] C.-Y. Hor, C.-B. Yang, C.-H. Chang, C.-T. Tseng, and H.-H. Chen, "A tool preference selection method for RNA secondary structure prediction by SVM with statistical tests," Evolutionary Bioinformatics, Vol. 9, pp. 163-184, Apr. 2013.
[13] C.-Y. Hor, C.-B. Yang, C.-H. Chang, C.-T. Tseng, and H.-H. Chen, "Flexible dynamic time warping for time series classification," Procedia Computer Science 51, International Conference On Computational Science, ICCS 2015, Vol. 51, pp. 2838-2842, June 2015.
[14] Z. Huang, "Extensions to the k-means algorithm for clustering large data sets with categorical values," Data Mining and Knowledge Discovery, Vol. 2, No. 3, pp. 283-304, 1998.
[15] F. Itakura, "Minimum prediction residual principle applied to speech recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 23, No. 1, pp. 67-72, 1975.
[16] B. J. Jain and S. Spiegel, "Time series classification in dissimilarity spaces," Proceedings of 1st International Workshop on Advanced Analytics and Learning on Temporal Data, Porto, Portugal, 2015.
[17] N. Kapoor, A. Gandhi, A. Chaurasia, et al., "Central neurocytoma in the vermis of the cerebellum," Indian Journal of Pathology and Microbiology, Vol. 52, No. 1, p. 108, 2009.
[18] E. Keogh, Q. Zhu, B. Hu, Y. Hao, X. Xi, L. Wei, and C. A. Ratanamahatana, "The UCR time series classification/clustering homepage." http://www.cs.ucr.edu/~eamonn/time_series_data/, 2011.
[19] E. Keogh and S. Kasetty, "On the need for time series data mining benchmarks: a survey and empirical demonstration," Data Mining and Knowledge Discovery, Vol. 7, No. 4, pp. 349-371, 2003.
[20] E. J. Keogh and M. J. Pazzani, "An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback," Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, Vol. 98, New York, USA, pp. 239-243, 1998.
[21] E. J. Keogh and M. J. Pazzani, "Derivative dynamic time warping," Proceedings of the First SIAM International Conference on Data Mining, Vol. 1, Chicago, IL, USA, pp. 5-7, Apr. 2001.
[22] L. Livi, A. Rizzi, and A. Sadeghian, "Optimized dissimilarity space embedding for labeled graphs," Information Sciences, Vol. 266, pp. 47-64, 2014.
[23] E. Pekalska, R. P. Duin, and P. Paclik, "Prototype selection for dissimilarity-based classifiers," Pattern Recognition, Vol. 39, No. 2, pp. 189-208, 2006.
[24] Y.-H. Peng and C.-B. Yang, "Finding the gapped longest common subsequence by incremental suffix maximum queries," Information and Computation, Vol. 237, pp. 95-100, Oct. 2014.
[25] F. Petitjean, A. Ketterlin, and P. Gancarski, "A global averaging method for dynamic time warping, with applications to clustering," Pattern Recognition, Vol. 44, No. 3, pp. 678-693, 2011.
[26] S. Rani and G. Sikka, "Recent techniques of clustering of time series data: A survey," International Journal of Computer Applications, Vol. 52, No. 15, pp. 1-9, Aug. 2012.
[27] S. Raudys and F. Roli, "The behavior knowledge space fusion method: Analysis of generalization error and strategies for performance improvement," Multiple Classifier Systems, pp. 160-160, 2003.
[28] K. Riesen and H. Bunke, "Graph classification based on vector space embedding," International Journal of Pattern Recognition and Artificial Intelligence, Vol. 23, No. 06, pp. 1053-1081, 2009.
[29] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 26, No. 1, pp. 43-49, 1978.
[30] J. A. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Processing Letters, Vol. 9, No. 3, pp. 293-300, 1999.
[31] Y. Ueki, Diptarama, M. Kurihara, Y. Matsuoka, K. Narisawa, R. Yoshinaka, H. Bannai, S. Inenaga, and A. Shinohara, "Longest common subsequence in at least k length order-isomorphic substrings," International Conference on Current Trends in Theory and Practice of Informatics, pp. 363-374, Springer, 2017.
[32] Z. Xing, J. Pei, and E. Keogh, "A brief survey on sequence classification," ACM SIGKDD Explorations Newsletter, Vol. 12, No. 1, pp. 40-48, June 2010.

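Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: user-defined availability
Available: Campus: available; Off-campus: available
etd-0807117-143059.pdf
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. To inquire about printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available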

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us: 2452, Thesis Format Review Team, Mail │ Development and operations: Knowledge Innovation Division, LIS