博碩士論文 etd-0807117-143059 詳細資訊
Title page for etd-0807117-143059
論文名稱
Title
使用差異空間之時間序列分類演算法
An Algorithm for Time Series Classification Using Dissimilarity Space
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
65
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2017-08-31
繳交日期
Date of Submission
2017-09-07
關鍵字
Keywords
差異空間、動態時間扭曲、整體學習、時間序列、支持向量
SVM, Time series, Dynamic time warping, Dissimilarity space, Ensemble
統計
Statistics
本論文已被瀏覽 5724 次,被下載 81
This thesis/dissertation has been viewed 5724 times and downloaded 81 times.
中文摘要
時間序列分類問題的目的是將一個新的時間序列歸入合適的類別,並給予相應的類別標籤。這個問題已經被研究了數十年。在本論文中,我們結合差異空間與DTW重心平均的概念,由訓練集建構特徵向量。由於使用七種距離函數來測量兩個時間序列之間的距離,我們得到七種類型的特徵向量,並以這七種特徵向量建構七個SVM(支持向量機)分類器。
另外,為了提高分類精確度,我們使用行為知識空間(BKS)方法來建構整體分類器,七個SVM分類器中每三個的組合各構成一個整體分類器。我們的實驗資料集自UCR網站下載。實驗結果顯示,僅使用單一SVM分類器時,我們的精確度比同樣使用差異空間的Jain與Spiegel的結果提高約3%。對於整體分類器,最佳的兩種組合是DTWW-LCS-ED與DTW-LCS-ED。整體分類器進一步提高了分類精確度。

關鍵詞 : 時間序列、動態時間扭曲、差異空間、支持向量、整體學習
Abstract

The time series classification problem is to assign a new time series to a suitable class and give it the corresponding class label. This problem has been studied for decades. In this thesis, we combine the concepts of dissimilarity space and DTW barycenter averaging to build feature vectors from the training set. Because seven distance functions are used to measure the distance between two time series, we obtain seven types of feature vectors. With these seven types of feature vectors, we build seven SVM (support vector machine) classifiers.
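The dissimilarity-space idea can be illustrated with a minimal sketch: a series is mapped to the vector of its distances to a few prototype series, and that vector is what a classifier such as an SVM would consume. The prototypes, the helper names (`euclidean`, `dtw`, `embed`), and the optional window parameter below are assumptions for illustration only. The thesis's actual prototype construction (which involves DTW barycenter averaging) and its exact distance definitions are not given on this page.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b, window=None):
    """DTW distance; `window` is an optional band width in samples."""
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide so the end cell is reachable.
    w = max(n, m) if window is None else max(window, abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def embed(series, prototypes, dist):
    """Map one series to its vector of distances to the prototypes."""
    return [dist(series, p) for p in prototypes]


# Hypothetical hand-picked prototypes, one per class (not produced by DBA).
prototypes = [[0.0, 1.0, 2.0, 1.0, 0.0], [2.0, 1.0, 0.0, 1.0, 2.0]]
query = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # a time-shifted copy of the first prototype
features = embed(query, prototypes, dtw)
```

Swapping `dtw` for another distance function yields a different feature vector for the same series, which is how seven distance functions would lead to seven feature types.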
In addition, to improve the classification accuracy, we use the behavior-knowledge space (BKS) method to construct ensemble classifiers, one for every combination of three of the seven SVM classifiers. Our experimental datasets were downloaded from the UCR website. The experimental results show that, with a single SVM classifier, we obtain about a 3% improvement in accuracy over the result of Jain and Spiegel, who also used the dissimilarity space.
For the ensemble classifiers, the two best combinations are DTWW-LCS-ED and DTW-LCS-ED. The ensemble classifiers further improve the classification accuracy.
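As a rough illustration of the behavior-knowledge space method, the sketch below builds a lookup table keyed by the tuple of base-classifier outputs and predicts the most frequent true label stored in the matching cell. The toy labels, the function names (`fit_bks`, `predict_bks`), and the majority-vote fallback for unseen cells are assumptions, not details taken from the thesis. The abbreviations DDTW, LCSk, and VGLCS are likewise inferred from the section titles in the table of contents rather than confirmed names.

```python
from collections import Counter, defaultdict
from itertools import combinations


def fit_bks(base_predictions, true_labels):
    """Build the BKS table: one cell per tuple of base-classifier outputs."""
    table = defaultdict(Counter)
    for outputs, label in zip(base_predictions, true_labels):
        table[tuple(outputs)][label] += 1
    return table


def predict_bks(table, outputs):
    """Return the most frequent true label recorded in the matching cell.

    Falls back to a majority vote over the base outputs when the cell
    was never seen while fitting (an assumed, commonly used fallback).
    """
    cell = table.get(tuple(outputs))
    if cell:
        return cell.most_common(1)[0][0]
    return Counter(outputs).most_common(1)[0][0]


# Hypothetical outputs of three base classifiers, one tuple per sample.
val_outputs = [("a", "a", "b"), ("a", "a", "b"), ("a", "a", "b"), ("b", "a", "b")]
val_labels = ["b", "b", "a", "a"]
table = fit_bks(val_outputs, val_labels)

seen = predict_bks(table, ("a", "a", "b"))    # cell recorded labels b, b, a
unseen = predict_bks(table, ("b", "b", "a"))  # no such cell: majority vote

# Every combination of three out of seven base classifiers (assumed names).
names = ["ED", "DTW", "DDTW", "DTWW", "LCS", "LCSk", "VGLCS"]
triples = list(combinations(names, 3))
```

In this toy example the cell for `("a", "a", "b")` predicts `b` even though a plain majority vote over the three outputs would predict `a`. That is the kind of systematic disagreement BKS is meant to exploit.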


Keywords : Time series, Dynamic time warping, Dissimilarity space, SVM, Ensemble
目次 Table of Contents
THESIS AUTHORIZATION FORM
THANKS
ABSTRACT
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS
Chapter 1. Introduction
Chapter 2. Preliminaries
2.1 The Distance Measurements
2.1.1 The Euclidean Distance
2.1.2 Dynamic Time Warping Distance
2.1.3 The Derivative Dynamic Time Warping Distance
2.1.4 Dynamic Time Warping Distance with Warping Window
2.1.5 Longest Common Subsequence
2.1.6 Longest Common Subsequence with at Least Length k
2.1.7 Variable Gap Longest Common Subsequence
2.2 The Threshold for LCS-like Algorithms
2.3 DTW Barycenter Averaging
2.4 The Dissimilarity Space of Time Series
2.4.1 The Classifier Learning in Dissimilarity Space
2.5 Time Series Classification
2.5.1 The Representation-Based Classification
2.6 The Behavior Knowledge Space Method
Chapter 3. Our Algorithm
Chapter 4. Experimental Results
4.1 The Experimental Datasets
4.2 The Performance Comparison
4.3 Classification with the Behavior Knowledge Space Method
Chapter 5. Conclusion
BIBLIOGRAPHY
參考文獻 References
[1] A. Bagnall, J. Lines, J. Hills, and A. Bostrom, "Time-series classification with
COTE: the collective of transformation-based ensembles," IEEE Transactions
on Knowledge and Data Engineering, Vol. 27, No. 9, pp. 2522-2535, 2015.
[2] R. Bellman and R. Kalaba, "On adaptive control processes," IRE Transactions
on Automatic Control, Vol. 4, No. 2, pp. 1-9, 1959.
[3] G. Benson, A. Levy, S. Maimoni, D. Noifeld, and B. R. Shalom, "LCSk: a
refined similarity measure," Theoretical Computer Science, Vol. 638, pp. 11-26,
2016.
[4] K. Buza, A. Nanopoulos, and L. Schmidt-Thieme, "Time-series classification
based on individualised error prediction," Computational Science and
Engineering (CSE), 2010 IEEE 13th International Conference on, pp. 48-54, IEEE,
2010.
[5] C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines,"
ACM Transactions on Intelligent Systems and Technology (TIST), Vol. 2, No. 3,
p. 27, 2011.
[6] P.-E. Danielsson, "Euclidean distance mapping," Computer Graphics and Image
Processing, Vol. 14, No. 3, pp. 227-248, 1980.
[7] H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, and E. Keogh, "Querying
and mining of time series data: experimental comparison of representations
and distance measures," Proceedings of the VLDB Endowment, Vol. 1, No. 2,
pp. 1542-1552, 2008.
[8] G.-C. Guo, K.-S. Huang, and C.-B. Yang, "Time series classification based on
the longest common subsequence similarity and ensemble learning,"
Proceedings of Symposium on Digital Life Technologies Symposium, DLT2016,
Pingtung, Taiwan, pp. 84-91, June 2016.
[9] J. D. Hamilton, Time Series Analysis, Vol. 2. Princeton University Press,
Princeton, 1994.
[10] J. A. Hartigan and M. A. Wong, "Algorithm AS 136: A k-means clustering
algorithm," Journal of the Royal Statistical Society. Series C (Applied Statistics),
Vol. 28, No. 1, pp. 100-108, 1979.
[11] D. S. Hirschberg, "Algorithms for the longest common subsequence problem,"
Journal of the ACM (JACM), Vol. 24, No. 4, pp. 664-675, 1977.
[12] C.-Y. Ho, C.-B. Yang, C.-H. Chang, C.-T. Tseng, and H.-H. Chen, "A tool
preference selection method for RNA secondary structure prediction by SVM with
statistical tests," Evolutionary Bioinformatics, Vol. 9, pp. 163-184, Apr. 2013.
[13] C.-Y. Hor, C.-B. Yang, C.-H. Chang, C.-T. Tseng, and H.-H. Chen, "Flexible
dynamic time warping for time series classification," Procedia Computer Science
51, International Conference On Computational Science, ICCS 2015, Vol. 51,
pp. 2838-2842, June 2015.
[14] Z. Huang, "Extensions to the k-means algorithm for clustering large data sets
with categorical values," Data Mining and Knowledge Discovery, Vol. 2, No. 3,
pp. 283-304, 1998.
[15] F. Itakura, "Minimum prediction residual principle applied to speech
recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 23,
No. 1, pp. 67-72, 1975.
[16] B. J. Jain and S. Spiegel, "Time series classification in dissimilarity spaces,"
Proceedings of 1st International Workshop on Advanced Analytics and Learning
on Temporal Data, Porto, Portugal, 2015.
[17] N. Kapoor, A. Gandhi, A. Chaurasia, et al., "Central neurocytoma in the vermis
of the cerebellum," Indian Journal of Pathology and Microbiology, Vol. 52,
No. 1, p. 108, 2009.
[18] E. Keogh, Q. Zhu, B. Hu, Y. Hao, X. Xi, L. Wei, and C. A. Ratanamahatana,
"The UCR time series classification/clustering homepage."
http://www.cs.ucr.edu/~eamonn/time_series_data/, 2011.
[19] E. Keogh and S. Kasetty, "On the need for time series data mining benchmarks:
a survey and empirical demonstration," Data Mining and Knowledge Discovery,
Vol. 7, No. 4, pp. 349-371, 2003.
[20] E. J. Keogh and M. J. Pazzani, "An enhanced representation of time series
which allows fast and accurate classification, clustering and relevance
feedback," Proceedings of the Fourth International Conference on Knowledge
Discovery and Data Mining, Vol. 98, New York, USA, pp. 239-243, 1998.
[21] E. J. Keogh and M. J. Pazzani, "Derivative dynamic time warping," Proceedings
of the First SIAM International Conference on Data Mining, Vol. 1, Chicago,
IL, USA, pp. 5-7, Apr. 2001.
[22] L. Livi, A. Rizzi, and A. Sadeghian, "Optimized dissimilarity space embedding
for labeled graphs," Information Sciences, Vol. 266, pp. 47-64, 2014.
[23] E. Pekalska, R. P. Duin, and P. Paclik, "Prototype selection for
dissimilarity-based classifiers," Pattern Recognition, Vol. 39, No. 2, pp. 189-208, 2006.
[24] Y.-H. Peng and C.-B. Yang, "Finding the gapped longest common
subsequence by incremental suffix maximum queries," Information and Computation,
Vol. 237, pp. 95-100, Oct. 2014.
[25] F. Petitjean, A. Ketterlin, and P. Gancarski, "A global averaging method for
dynamic time warping, with applications to clustering," Pattern Recognition,
Vol. 44, No. 3, pp. 678-693, 2011.
[26] S. Rani and G. Sikka, "Recent techniques of clustering of time series data:
A survey," International Journal of Computer Applications, Vol. 52, No. 15,
pp. 1-9, Aug. 2012.
[27] S. Raudys and F. Roli, "The behavior knowledge space fusion method: Analysis
of generalization error and strategies for performance improvement," Multiple
Classifier Systems, pp. 160-160, 2003.
[28] K. Riesen and H. Bunke, "Graph classification based on vector space
embedding," International Journal of Pattern Recognition and Artificial Intelligence,
Vol. 23, No. 06, pp. 1053-1081, 2009.
[29] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for
spoken word recognition," IEEE Transactions on Acoustics, Speech, and Signal
Processing, Vol. 26, No. 1, pp. 43-49, 1978.
[30] J. A. Suykens and J. Vandewalle, "Least squares support vector machine
classifiers," Neural Processing Letters, Vol. 9, No. 3, pp. 293-300, 1999.
[31] Y. Ueki, Diptarama, M. Kurihara, Y. Matsuoka, K. Narisawa, R. Yoshinaka,
H. Bannai, S. Inenaga, and A. Shinohara, "Longest common subsequence in
at least k length order-isomorphic substrings," International Conference on
Current Trends in Theory and Practice of Informatics, pp. 363-374, Springer,
2017.
[32] Z. Xing, J. Pei, and E. Keogh, "A brief survey on sequence classification," ACM
SIGKDD Explorations Newsletter, Vol. 12, No. 1, pp. 40-48, June 2010.
電子全文 Fulltext
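The electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.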
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
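Public-access information for printed theses is relatively complete from academic year 102 onward. For public-access information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.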
開放時間 available 已公開 available
