博碩士論文 etd-0801115-220910 詳細資訊
Title page for etd-0801115-220910
論文名稱
Title
相似序列的融合最長共同子序列之快速方法
Efficient Merged Longest Common Subsequence Algorithms for Similar Sequences
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
48
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2015-08-20
繳交日期
Date of Submission
2015-09-02
關鍵字
Keywords
支配、相似、區塊融合最長共同子序列、融合最長共同子序列、最長共同子序列
Block-merged Longest Common Subsequence, dominate, Merged Longest Common Subsequence, Longest Common Subsequence, similar
統計
Statistics
本論文已被瀏覽 5704 次,被下載 327
This thesis has been viewed 5704 times and downloaded 327 times.
中文摘要
給予A跟B這對來源序列和一條目標序列T,融合最長共同子序列 (MLCS) 這問題是找出E(A, B)跟T的最長共同子序列 (LCS),E(A, B)可以經由融合A跟B序列得出。在這篇論文中,我們首先提出一個O(L(r-L+1)m)時間複雜度的演算法來解決融合最長共同子序列的問題,其中r跟L分別表示T序列的長度跟MLCS序列的長度,m表示A跟B兩序列較短的序列長度。從時間複雜度來看,當T跟E(A, B)非常相似時,我們的演算法會很有效率。在另一方面,對於不相似的情況也會很有效率。我們的演算法只要經過稍微的修改,也可以用O(L(r-L+1)m)的時間複雜度來解決融合最長共同子序列的變形問題 (區塊融合最長共同子序列)。實驗結果顯示當序列的相似度非常高,我們的演算法會比目前已知的MLCS跟BMLCS演算法快。
Abstract
Given a pair of merging sequences A and B and a target sequence T, the merged longest common subsequence (MLCS) problem is to find a longest common subsequence (LCS) of E(A, B) and T, where E(A, B) is obtained by merging two subsequences of A and B. In this thesis, we first propose an algorithm that solves the MLCS problem in O(L(r-L+1)m) time, where r denotes the length of T, L denotes the length of the MLCS, and m denotes the length of the shorter of A and B. The time complexity shows that our algorithm is extremely efficient when T and E(A, B) are very similar; it is also efficient when the similarity is extremely low. With slight modification, our algorithm also solves a variant of the merged LCS problem, the block-merged LCS (BMLCS) problem, in O(L(r-L+1)m) time. Experimental results show that our algorithms are faster than previously published MLCS and BMLCS algorithms for sequences with high similarity.
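To make the problem statement in the abstract concrete, the following is a minimal sketch of the straightforward cubic-time dynamic program for the MLCS length. It is not the O(L(r-L+1)m) algorithm proposed in the thesis, which this record does not describe in detail. The function name `mlcs_length` and its parameters are hypothetical, chosen only for illustration.

```python
def mlcs_length(a: str, b: str, t: str) -> int:
    """Length of the merged LCS of merging sequences a, b and target t.

    Baseline O(r * m * n) dynamic program (r = len(t), m = len(a),
    n = len(b)); an illustrative sketch, NOT the thesis's algorithm.
    """
    m, n = len(a), len(b)
    # prev[i][j]: MLCS length for a[:i], b[:j] and the prefix of t
    # processed so far (initially the empty prefix).
    prev = [[0] * (n + 1) for _ in range(m + 1)]
    for ch in t:
        cur = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            for j in range(n + 1):
                best = prev[i][j]  # leave the current character of t unmatched
                if i > 0:
                    best = max(best, cur[i - 1][j])  # ignore a[i-1]
                    if a[i - 1] == ch:
                        # match the current character of t with a[i-1]
                        best = max(best, prev[i - 1][j] + 1)
                if j > 0:
                    best = max(best, cur[i][j - 1])  # ignore b[j-1]
                    if b[j - 1] == ch:
                        # match the current character of t with b[j-1]
                        best = max(best, prev[i][j - 1] + 1)
                cur[i][j] = best
        prev = cur
    return prev[m][n]


if __name__ == "__main__":
    # "acbd" is itself an interleaving of "ab" and "cd".
    print(mlcs_length("ab", "cd", "acbd"))  # 4
```

As a rough reading of the stated bound, the factor L(r-L+1) equals r both when L = r and when L = 1, and it is largest near L = r/2, which is consistent with the abstract's claim of efficiency at very high and very low similarity.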
目次 Table of Contents
論文審定書

DISSERTATION VERIFICATION FORM

謝誌

摘要

ABSTRACT

LIST OF FIGURES

LIST OF TABLES

Chapter 1. Introduction

Chapter 2. Preliminaries

2.1 The Longest Common Subsequence Problem
2.2 The LCS Algorithm of Nakatsu and Yajima
2.3 The Merged Longest Common Subsequence Problem
2.4 The Block-Merged Longest Common Subsequence Problem

Chapter 3. Our Merged LCS and Block-Merged LCS Algorithms

3.1 Our Merged LCS Algorithm
3.2 Our Block-Merged LCS Algorithm

Chapter 4. Experimental Results

Chapter 5. Conclusions

BIBLIOGRAPHY
電子全文 Fulltext
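This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.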
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
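Public-availability information for printed theses is relatively complete from academic year 102 onward. To look up public-availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.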
開放時間 available 已公開 available
