博碩士論文 etd-0518110-225424 詳細資訊
Title page for etd-0518110-225424
論文名稱
Title
區塊編輯距離及相關問題之有效率演算法
Efficient Algorithms for the Block Edit Distance and Related Problems
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
140
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2010-04-28
繳交日期
Date of Submission
2010-05-18
關鍵字
Keywords
動態規劃、最長共同子序列、行程長度編碼、相似度、區塊編輯距離、演算法設計
run-length encoding, longest common subsequence, dynamic programming, similarity, design of algorithms, block edit distance
統計
Statistics
本論文已被瀏覽 5673 次,被下載 1425
This thesis/dissertation has been viewed 5673 times and downloaded 1425 times.
中文摘要
序列相似度計算始終是計算機領域的一個重要基礎,在過去數十年來,已被廣泛地探討。近年來,由於硬體計算能力的進步以及大量的生物資料的出現,又再度引起大家的重視。在本論文中,我們著重於計算兩序列之間的編輯距離,此處之編輯操作,除包含傳統字元操作外,亦包含區塊操作。前人之研究顯示,若可用之操作包含遞迴式區塊搬移,此問題將變成難題 (NP-hard)。在本論文中,我們針對在多項式時間內可求得最佳解之簡化版編輯距離問題作探討。包括:針對行程長度編碼後的字串計算最長共同子序列 (LCS of RLE strings)、針對行程長度編碼後的字串計算限制的最長共同子序列 (constrained LCS of RLE strings)。此外,本論文中亦針對某些簡化版區塊編輯距離問題提出多項式時間之演算法。

給定兩條序列X與Y,其原始長度為n與m,行程長度編碼後的長度為N與M。我們提出一個淺顯易懂的簡易演算法,可在O(NM+min{p_1, p_2}) 時間內計算出X與Y之間的最長共同子序列,其中 p_1 與 p_2 分別為所有配對區塊之下邊界與右邊界的元素數量。此演算法改進了前人所提出的 O(min{nM, Nm}) 時間演算法。若與另一類演算法來比較,此簡易演算法在部分情況下,也可勝過前人之 O(NM log NM) 與 O((N+M+q) log (N+M+q)) 的結果,其中 q 為所有配對區塊的數量。

其次,本論文提出一個有效率演算法來解出限制的最長共同子序列。給定兩條序列X與Y以及一個限制序列P,限制的最長共同子序列問題為求出X與Y的某條子序列Z,使得P為Z的子序列,且Z為所有符合之序列中最長的。X、Y與P之原始長度分別為n、m與r,行程長度編碼後的長度分別為N、M與R。在本論文中,我們提出 O(NMr+min{q_1r+q_4, q_2r+q_5}) 時間的演算法,其中 q_1 與 q_2 分別為所有的部分配對長方體之南面牆壁與東面牆壁的元素數量,q_4 與 q_5 分別為所有的完全配對長方體底部之西角柱與北角柱的元素數量。當行程長度編碼有不錯之壓縮率時,本論文之演算法明顯優於前人提出之動態規劃演算法以及Hunt-Szymanski形式演算法。

最後,本論文探討包含插入字元、刪除字元、複製區塊與刪除區塊的區塊編輯距離的變化形問題,並且按照不同的評估函數,定義三種變化型P(EIS, C)、P(EI, L)與P(EI, N)。我們提出有效率的方法,當經由適當的前處理後,此三種問題可以分別使用 O(nm)、O(nm log m) 與 O(nm^2) 時間的動態規劃演算法解出。
Abstract
Computing the similarity of two strings or sequences is one of the most important fundamental problems in computer science, and it has been widely studied for several decades.
In the last decade, it has regained researchers' attention because of improvements in hardware computing power and the availability of huge amounts of biological data.
In this dissertation, we focus on computing the edit distance between two sequences where block-edit operations are allowed in addition to character-edit operations.
Previous research shows that this problem is NP-hard if recursive block moves are allowed.
Since we are interested in solving editing problems optimally in polynomial time, we consider simplified versions of the edit distance problem.
We first focus on the longest common subsequence (LCS) of run-length encoded (RLE) strings, where the runs can be seen as a class of simplified blocks.
Then, we apply constraints to the problem, that is, we find the constrained LCS (CLCS) of RLE strings.
In addition, we show that problems involving block-edit operations can still be solved optimally in polynomial time if some restrictions are applied.

Let X and Y be two sequences of lengths n and m, respectively.
Also, let N and M be the numbers of runs in the corresponding RLE forms of X and Y, respectively.
We first propose a simple algorithm for computing the LCS of X and Y in O(NM + min{ p_1, p_2 }) time, where p_1 and p_2 denote the numbers of elements in the bottom and right boundaries of the matched blocks, respectively.
This new algorithm improves the previously known time bound O(min{nM, Nm}) and, in some cases, outperforms the time bounds O(NM log NM) and O((N+M+q) log (N+M+q)), where q denotes the number of matched blocks.
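
To make the notation concrete, the following is a minimal Python sketch of run-length encoding together with the textbook O(nm) dynamic program for the LCS length. It is a baseline for comparison only, not the O(NM + min{ p_1, p_2 }) algorithm proposed in the dissertation. The function names rle, matched_block_count and lcs_length are hypothetical, and counting a matched block as any pair of runs (one from X, one from Y) with the same character is an assumption about the definition of q.

```python
from itertools import groupby


def rle(s):
    """Run-length encode s as a list of (character, run_length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def matched_block_count(x, y):
    """Count run pairs with equal characters (assumed meaning of q)."""
    return sum(1 for cx, _ in rle(x) for cy, _ in rle(y) if cx == cy)


def lcs_length(x, y):
    """Textbook O(nm) dynamic program for the LCS length, keeping two rows."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, 1):
            if a == b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


x, y = "aaabbbcc", "aabbbbc"   # n = 8, m = 7
print(rle(x))                   # N = 3 runs
print(rle(y))                   # M = 3 runs
print(matched_block_count(x, y))
print(lcs_length(x, y))
```

When the inputs compress well, N and M are much smaller than n and m, which is the regime in which bounds expressed in N and M pay off.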

Next, we give an efficient algorithm for solving the CLCS problem, which is to find a common subsequence Z of X and Y such that a given constraint sequence P is a subsequence of Z and the length of Z is maximized.
Suppose X, Y and P are all in RLE format, and the lengths of X, Y and P are n, m and r, respectively.
Let N, M and R be the numbers of runs in X, Y, and P, respectively.
We show that by RLE, the CLCS problem can be solved in O(NMr + min{q_1 r + q_4, q_2 r + q_5 }) time, where q_1 and q_2 denote the numbers of elements in the south and east boundaries of the partially matched blocks on the first layer, respectively, and q_4 and q_5 denote the numbers of elements of the west and north pillars in the bottom boundaries of all fully matched cuboids in the DP lattice, respectively.
When the input strings have good compression ratios, our work clearly outperforms the previously known DP algorithms and the Hunt-Szymanski-like algorithms.
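
For reference, the CLCS problem on uncompressed strings can be solved by a standard three-dimensional dynamic program in O(nmr) time. The sketch below illustrates that baseline recurrence only; it is not the RLE-based algorithm of the dissertation. The function name clcs_length and the choice to return None when no feasible Z exists are assumptions made for illustration.

```python
NEG_INF = float("-inf")


def clcs_length(x, y, p):
    """Length of the longest common subsequence of x and y that contains p
    as a subsequence, or None if no such common subsequence exists."""
    n, m, r = len(x), len(y), len(p)
    # table[i][j][k]: best length for x[:i], y[:j] with p[:k] already embedded.
    table = [[[NEG_INF] * (r + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][0][0] = 0
    for j in range(m + 1):
        table[0][j][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for k in range(r + 1):
                if x[i - 1] == y[j - 1]:
                    if k > 0 and x[i - 1] == p[k - 1]:
                        # The match also consumes one constraint character.
                        table[i][j][k] = table[i - 1][j - 1][k - 1] + 1
                    else:
                        table[i][j][k] = table[i - 1][j - 1][k] + 1
                else:
                    table[i][j][k] = max(table[i - 1][j][k], table[i][j - 1][k])
    best = table[n][m][r]
    return None if best == NEG_INF else best


print(clcs_length("aaabbbcc", "aabbbbc", "bc"))
print(clcs_length("aaabbbcc", "aabbbbc", "ca"))
```

The table has (n+1)(m+1)(r+1) entries, each filled in constant time, which gives the O(nmr) bound that the RLE-based approach aims to improve on for highly compressible inputs.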

Finally, we consider variations of the block edit distance problem that involve character insertions, character deletions, block copies and block deletions, for two given sequences X and Y.
We define three variations with different measuring functions: P(EIS, C), P(EI, L) and P(EI, N).
Then we show that with some preprocessing, the minimum block edit distances of these three variations can be obtained by dynamic programming in O(nm), O(nm log m) and O(nm^2) time, respectively, where n and m are the lengths of X and Y.
目次 Table of Contents
LIST OF FIGURES
LIST OF TABLES
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
ABSTRACT
1 Introduction
1.1 String Matching
1.2 Sequence Comparison
1.3 Organization
2 Preliminaries
2.1 Hamming Distance
2.2 String Matching with k Mismatches
2.3 The Longest Common Subsequence Problem
2.3.1 Original Dynamic Programming Algorithm for the LCS Problem
2.3.2 Solving the LCS Problem in Linear Space
2.3.3 Hunt-Szymanski Algorithm for LCS Problem
2.4 The Edit Distance Problem
2.4.1 Levenshtein Distance Problem
2.4.2 The Local Edit Distance Problem
2.4.3 The Substring Edit Distance Problem
2.5 The Constrained Longest Common Subsequence Problem
2.6 The Run-length Encoding Scheme
2.7 Suffix Tree
2.8 Lowest Common Ancestor (LCA) and Longest Common Prefix (LCP)
2.9 The Range Minimum (Maximum) Query Problem
3 Algorithms for Computing LCS of RLE Strings
3.1 Related Works
3.1.1 Bunke and Csirik’s Algorithm
3.1.2 Apostolico et al.’s Algorithm
3.1.3 Liu et al.’s Algorithm
3.1.4 Mitchell’s Algorithm
3.2 Our Method
3.2.1 The Basic Idea
3.2.2 The Algorithm
3.2.3 How Fast Is It
3.3 Summary
4 Algorithms for Computing Constrained LCS of RLE Strings
4.1 The Properties of the Constrained LCS Problem for RLE Strings
4.2 A Simple Algorithm: Algorithm CLCS1
4.3 An Improved Algorithm: Algorithm CLCS2
4.4 A Further Improved Algorithm: Algorithm CLCS3
4.5 Summary
5 The Block Edit Distance Problems
5.1 Motivation
5.2 Related Works
5.2.1 The NP-Hardness of Some Block Edit Problems
5.2.2 The Polynomial-Time Optimization Algorithms for Some Block Edit Problems
5.2.3 Ukkonen’s Restricted Block Edit Problem
5.3 Block Edit Operations
5.4 Our Block Edit Problems and Algorithms
5.4.1 Problem 1 – P(EIS, C)
5.4.2 Our Algorithm for P(EIS, C)
5.4.3 Problem 2 – P(EI, L)
5.4.4 Our Algorithm for P(EI, L)
5.4.5 Problem 3 – P(EI, N)
5.4.6 Our Algorithm for P(EI, N)
5.5 Summary
6 Conclusions

電子全文 Fulltext
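This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.
論文使用權限 Thesis access permission: available on campus immediately; available off campus after one year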
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
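Public availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 Available: 已公開 available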
