博碩士論文 etd-0912105-124156 詳細資訊
Title page for etd-0912105-124156
論文名稱
Title
最長共同子序列與相關問題之回顧
A Survey of the Longest Common Subsequence Problem and Its Related Problems
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
90
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2005-06-29
繳交日期
Date of Submission
2005-09-12
關鍵字
Keywords
最長共同子序列
LCS, Longest Common Subsequence
統計
Statistics
本論文已被瀏覽 5765 次,被下載 1895
This thesis has been viewed 5765 times and downloaded 1895 times.
中文摘要
在電腦最佳化問題及生物領域中,最長共同子序列都是一個非常典型
的問題。在過去數十年中,非常多相關的論文被發表及討論。在這一
篇論文中,我們將做一個完整的相關回顧,包括最長共同子序列以及
其相關問題,並且在每一個問題中介紹一些有效率的演算法。在每個
問題的演算法中,我們都會給相關的時間複雜度及空間複雜度。
Abstract
The longest common subsequence (LCS) problem is a classical problem in both
combinatorial optimization and computational biology. During the past few decades,
a considerable number of studies have focused on LCS and its related problems.
In this thesis, we present a complete survey of LCS and its related problems,
and review some efficient algorithms for solving them. We also give the
definition of each problem, present some theorems, and state the time
complexity and space complexity of the algorithms for solving these problems.
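As a point of reference for the time and space complexity mentioned in the abstract, the following is a minimal sketch of the textbook dynamic-programming computation of the LCS length. It is not taken from the thesis; the function name `lcs_length` and the two-row layout are illustrative assumptions. For strings of lengths m and n it runs in O(mn) time and keeps only two rows of the table, so it uses O(min(m, n)) extra space.

```python
def lcs_length(a: str, b: str) -> int:
    """Return the length of a longest common subsequence of a and b.

    Illustrative textbook dynamic programming, not the thesis's own code.
    """
    # Keep the shorter string along the row so a row has min(m, n) + 1 cells.
    if len(b) > len(a):
        a, b = b, a

    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]  # LCS with the empty prefix of b is always 0
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the LCS of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last character of one string or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```

This sketch returns only the length. Recovering an actual subsequence needs either the full table or a divide-and-conquer scheme such as the one associated with Hirschberg, which the table of contents lists in Section 2.3.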
目次 Table of Contents
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
Chapter 1. Introduction
Chapter 2. The Longest Common Subsequence Problem
2.1 The Longest Common Subsequence Problem
2.2 Hunt and Szymanski's Algorithm
2.3 Hirschberg's Algorithm
2.4 Some Algorithms of LCS
2.5 Linear Space Algorithms
Chapter 3. Related Problems of LCS
3.1 The Cyclic String Correction Problem
3.2 The Longest Increasing Subsequence Problem
3.3 The Constrained Longest Common Subsequence Problem
Chapter 4. The Edit Distance Problem
4.1 The Edit Distance Problem
4.2 Other Operations in the Edit Distance Problem
Chapter 5. Sequence Alignment
5.1 Global Alignment
5.2 Local Alignment
5.3 Affine Gap Penalty
5.4 Multiple Sequences Alignment
Chapter 6. Conclusion
BIBLIOGRAPHY
電子全文 Fulltext
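This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
論文使用權限 Thesis access permission: available on campus immediately; available off campus after one year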
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
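Public-access information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the library and information office. We apologize for any inconvenience.
開放時間 Available: 已公開 available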
