國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,區塊編輯距離及相關問題之有效率演算法 ,Efficient Algorithms for the Block Edit Distance and Related Problems

論文名稱 Title	區塊編輯距離及相關問題之有效率演算法 Efficient Algorithms for the Block Edit Distance and Related Problems
系所名稱 Department	資訊工程學系 Department of Computer Science and Engineering
畢業學年期 Year, semester	98 學年度第 2 學期 The spring semester of Academic Year 98	語文別 Language	英文 English
學位類別 Degree	博士 Ph.D.	頁數 Number of pages	140
研究生 Author	安興彥 Hsing-Yen Ann
指導教授 Advisor	楊昌彪 Chang-Biau Yang
召集委員 Convenor	李家同 R. C. T. Lee
口試委員 Advisory Committee	趙坤茂, 蔣榮先, 王有禮, 洪宗貝, 張貿翔, 何錦文 Kun-Mao Chao; Jung-Hsien Chiang; Yue-Li Wang; Tzung-Pei Hong; M. S. Chang; Chin-Wen Ho
口試日期 Date of Exam	2010-04-28	繳交日期 Date of Submission	2010-05-18
關鍵字 Keywords	動態規劃、最長共同子序列、行程長度編碼、相似度、區塊編輯距離、演算法設計 run-length encoding, longest common subsequence, dynamic programming, similarity, design of algorithms, block edit distance
統計 Statistics	本論文已被瀏覽 5683 次，被下載 1425 次 The thesis/dissertation has been browsed 5683 times, has been downloaded 1425 times.

中文摘要
序列相似度計算始終是計算機領域的一個重要基礎，在過去數十年來，已被廣泛地探討。近年來，由於硬體計算能力的進步以及大量的生物資料的出現，又再度引起大家的重視。在本論文中，我們著重於計算兩序列之間的編輯距離，此處之編輯操作，除包含傳統字元操作外，亦包含區塊操作。前人之研究顯示，若可用之操作包含遞迴式區塊搬移，此問題將變成難題 (NP-hard)。在本論文中，我們針對在多項式時間內可求得最佳解之簡化版編輯距離問題作探討。包括：針對行程長度編碼後的字串計算最長共同子序列 (LCS of RLE strings)、針對行程長度編碼後的字串計算限制的最長共同子序列 (constrained LCS of RLE strings)。此外，本論文中亦針對某些簡化版區塊編輯距離問題提出多項式時間之演算法。給定兩條序列X與Y，其原始長度為n與m，行程長度編碼後的長度為N與M。我們提出一個淺顯易懂的簡易演算法，可在O(NM+min{p_1, p_2}) 時間內計算出X與Y之間的最長共同子序列，其中 p_1 與 p_2 分別為所有配對區塊之下邊界與右邊界的元素數量。此演算法改進了前人所提出的 O(min{nM, Nm}) 時間演算法。若與另一類演算法來比較，此簡易演算法在部分情況下，也可勝過前人之 O(NM log NM) 與 O((N+M+q) log (N+M+q)) 的結果，其中 q 為所有配對區塊的數量。其次，本論文提出一個有效率演算法來解出限制的最長共同子序列。給定兩條序列X與Y以及一個限制序列P，限制的最長共同子序列問題為求出X與Y的某條子序列Z，使得P為Z的子序列，且Z為所有符合之序列中最長的。X、Y與P之原始長度分別為n、m與r，行程長度編碼後的長度分別為N、M與R。在本論文中，我們提出 O(NMr+min{q_1r+q_4, q_2r+q_5}) 時間的演算法，其中 q_1 與 q_2 分別為所有的部分配對長方體之南面牆壁與東面牆壁的元素數量，q_4 與 q_5 分別為所有的完全配對長方體底部之西角柱與北角柱的元素數量。當行程長度編碼有不錯之壓縮率時，本論文之演算法明顯優於前人提出之動態規劃演算法以及Hunt-Szymanski形式演算法。最後，本論文探討包含插入字元、刪除字元、複製區塊與刪除區塊的區塊編輯距離的變化形問題，並且按照不同的評估函數，定義三種變化型P(EIS, C)、P(EI, L)與P(EI, N)。我們提出有效率的方法，當經由適當的前處理後，此三種問題可以分別使用 O(nm)、O(nm log m) 與 O(nm^2) 時間的動態規劃演算法解出。
Abstract
Computing the similarity of two strings or sequences is one of the most fundamental problems in computer science, and it has been widely studied for several decades. In the last decade, it has regained researchers' attention because of improvements in hardware computing power and the availability of huge amounts of biological data. In this dissertation, we focus on computing the edit distance between two sequences when block-edit operations are allowed in addition to character-edit operations. Previous research shows that this problem is NP-hard if recursive block moves are allowed. Since we are interested in solving edit problems optimally in polynomial time, we consider simplified versions of the edit distance problem. We first focus on the longest common subsequence (LCS) of run-length encoded (RLE) strings, where the runs can be seen as a class of simplified blocks. Then, we add constraints to the problem, i.e., we find the constrained LCS (CLCS) of RLE strings. In addition, we show that problems involving block-edit operations can still be solved optimally in polynomial time if some restrictions are applied. Let X and Y be two sequences of lengths n and m, respectively, and let N and M be the numbers of runs in the corresponding RLE forms of X and Y, respectively. First, we propose a simple algorithm for computing the LCS of X and Y in O(NM + min{p_1, p_2}) time, where p_1 and p_2 denote the numbers of elements on the bottom and right boundaries of the matched blocks, respectively. This new algorithm improves on the previously known time bound O(min{nM, Nm}) and, in some cases, outperforms the time bounds O(NM log NM) and O((N+M+q) log (N+M+q)), where q denotes the number of matched blocks.
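To make the notation concrete, here is a minimal sketch that run-length encodes two strings and computes their LCS length with the textbook O(nm) dynamic program on the uncompressed strings. It is a baseline only and is not the O(NM + min{p_1, p_2}) algorithm of the dissertation. The function names `rle` and `lcs_length` and the sample strings are illustrative assumptions.

```python
from itertools import groupby


def rle(s):
    """Run-length encode s into a list of (character, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def lcs_length(x, y):
    """Textbook LCS length in O(nm) time, keeping only two DP rows."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Matching characters extend the diagonal predecessor.
                cur.append(prev[j - 1] + 1)
            else:
                # Otherwise take the better of dropping a or dropping b.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


# Hypothetical sample input: n = 9, m = 8, while N = M = 3 runs.
x, y = "aaabbbbcc", "aabbbccc"
print(rle(x))
print(rle(y))
print(lcs_length(x, y))
```

When the strings compress well, N and M are much smaller than n and m, which is the setting in which bounds expressed in terms of N and M pay off.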
Next, we give an efficient algorithm for solving the CLCS problem, which is to find a common subsequence Z of X and Y such that a given constrained sequence P is a subsequence of Z and the length of Z is maximized. Suppose X, Y and P are all in RLE format, with lengths n, m and r, respectively, and let N, M and R be the numbers of runs in X, Y and P, respectively. We show that, with RLE, the CLCS problem can be solved in O(NMr + min{q_1 r + q_4, q_2 r + q_5}) time, where q_1 and q_2 denote the numbers of elements on the south and east boundaries of the partially matched blocks on the first layer, respectively, and q_4 and q_5 denote the numbers of elements of the west and north pillars in the bottom boundaries of all fully matched cuboids in the DP lattice, respectively. When the input strings have good compression ratios, our algorithm clearly outperforms the previously known DP algorithms and the Hunt-Szymanski-like algorithms. Finally, we consider variations of the block edit distance problem that involve character insertions, character deletions, block copies and block deletions for two given sequences X and Y. Three variations, P(EIS, C), P(EI, L) and P(EI, N), are defined with different measuring functions. We then show that, with some preprocessing, the minimum block edit distances of these three variations can be obtained by dynamic programming in O(nm), O(nm log m) and O(nm^2) time, respectively, where n and m are the lengths of X and Y.
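The CLCS problem defined above can be illustrated with a minimal sketch of the well-known O(nmr) dynamic program on uncompressed strings. This is the kind of baseline DP the abstract compares against, not the RLE-based algorithm of the dissertation. The function name `clcs_length`, the table layout and the convention of returning `None` when no feasible Z exists are assumptions made for this example.

```python
def clcs_length(x, y, p):
    """Length of a longest common subsequence of x and y that contains p
    as a subsequence, or None if no such common subsequence exists."""
    neg_inf = float("-inf")
    n, m, r = len(x), len(y), len(p)
    # table[k][i][j]: best length for x[:i], y[:j] constrained by p[:k].
    # With k > 0, an empty prefix of x or y cannot contain p[:k].
    table = [
        [[0 if k == 0 else neg_inf for _ in range(m + 1)] for _ in range(n + 1)]
        for k in range(r + 1)
    ]
    for k in range(r + 1):
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if x[i - 1] == y[j - 1]:
                    if k > 0 and x[i - 1] == p[k - 1]:
                        # The match also consumes the last constraint symbol.
                        table[k][i][j] = table[k - 1][i - 1][j - 1] + 1
                    else:
                        table[k][i][j] = table[k][i - 1][j - 1] + 1
                else:
                    table[k][i][j] = max(table[k][i - 1][j], table[k][i][j - 1])
    best = table[r][n][m]
    return None if best == neg_inf else best


# Hypothetical sample input: the unconstrained LCS of "aab" and "baa" is "aa",
# but requiring P = "b" leaves only the common subsequence "b".
print(clcs_length("aab", "baa", ""))
print(clcs_length("aab", "baa", "b"))
```

The table has (n+1)(m+1)(r+1) cells, which is the cost that an RLE-based method aims to reduce when the inputs contain long runs.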

目次 Table of Contents
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
ABSTRACT
1 Introduction
  1.1 String Matching
  1.2 Sequence Comparison
  1.3 Organization
2 Preliminaries
  2.1 Hamming Distance
  2.2 String Matching with k Mismatches
  2.3 The Longest Common Subsequence Problem
    2.3.1 Original Dynamic Programming Algorithm for the LCS Problem
    2.3.2 Solving the LCS Problem in Linear Space
    2.3.3 Hunt-Szymanski Algorithm for LCS Problem
  2.4 The Edit Distance Problem
    2.4.1 Levenshtein Distance Problem
    2.4.2 The Local Edit Distance Problem
    2.4.3 The Substring Edit Distance Problem
  2.5 The Constrained Longest Common Subsequence Problem
  2.6 The Run-length Encoding Scheme
  2.7 Suffix Tree
  2.8 Lowest Common Ancestor (LCA) and Longest Common Prefix (LCP)
  2.9 The Range Minimum (Maximum) Query Problem
3 Algorithms for Computing LCS of RLE Strings
  3.1 Related Works
    3.1.1 Bunke and Csirik’s Algorithm
    3.1.2 Apostolico et al.’s Algorithm
    3.1.3 Liu et al.’s Algorithm
    3.1.4 Mitchell’s Algorithm
  3.2 Our Method
    3.2.1 The Basic Idea
    3.2.2 The Algorithm
    3.2.3 How Fast Is It
  3.3 Summary
4 Algorithms for Computing Constrained LCS of RLE Strings
  4.1 The Properties of the Constrained LCS Problem for RLE Strings
  4.2 A Simple Algorithm: Algorithm CLCS1
  4.3 An Improved Algorithm: Algorithm CLCS2
  4.4 A Further Improved Algorithm: Algorithm CLCS3
  4.5 Summary
5 The Block Edit Distance Problems
  5.1 Motivation
  5.2 Related Works
    5.2.1 The NP-Hardness of Some Block Edit Problems
    5.2.2 The Polynomial-Time Optimization Algorithms for Some Block Edit Problems
    5.2.3 Ukkonen’s Restricted Block Edit Problem
  5.3 Block Edit Operations
  5.4 Our Block Edit Problems and Algorithms
    5.4.1 Problem 1 – P(EIS, C)
    5.4.2 Our Algorithm for P(EIS, C)
    5.4.3 Problem 2 – P(EI, L)
    5.4.4 Our Algorithm for P(EI, L)
    5.4.5 Problem 3 – P(EI, N)
    5.4.6 Our Algorithm for P(EI, N)
  5.5 Summary
6 Conclusions


電子全文 Fulltext
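This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: available on campus immediately; off campus withheld for one year. Available: Campus: available; Off-campus: available. etd-0518110-225424.pdf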
紙本論文 Printed copies
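Availability information for printed copies is relatively complete from Academic Year 102 onward. For availability information on printed copies from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: available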
