博碩士論文 etd-0821103-204917 詳細資訊
Title page for etd-0821103-204917
論文名稱
Title
利用二級結構預測蛋白質三級結構
Protein Structure Prediction Based on Secondary Structure Alignment
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
45
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2003-06-20
繳交日期
Date of Submission
2003-08-21
關鍵字
Keywords
蛋白質、對齊、預測、結構、二級結構
prediction, structure, alignment, protein, secondary structure
統計
Statistics
本論文已被瀏覽 5712 次,被下載 1726
This thesis has been viewed 5712 times and downloaded 1726 times.
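中文摘要 Chinese Abstract
Sequence alignment is a fundamental method in computational biology. Its basic function is to gauge how far apart sequences are in evolutionary terms from the degree of difference between the biological sequences; the result can then be put to use in further applications.
Traditionally, protein sequence alignment uses only information from the primary structure of a protein and does not consider the importance of secondary structure. The focus of this thesis is to make sequence alignment more meaningful by adding secondary structure information. Moreover, in homology modeling, the decisive step is finding a similar known structure, which then serves as the basis for predicting the unknown protein structure. This thesis aims to improve on traditional sequence alignment methods so as to find proteins whose primary structure similarity is low but whose secondary structure similarity is high, for use as templates in structure prediction.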
Abstract
Sequence alignment is a basic but powerful technique in molecular biology. Macromolecular sequences (DNA, RNA and protein sequences) can be aligned according to chosen criteria. The goal of sequence alignment is to find the similarities and differences among the input sequences, and different purposes call for different algorithms.
In this thesis, we present a new algorithm that aligns sequences while taking secondary structure into account. Traditionally, a sequence alignment algorithm considers only the primary structure, that is, the amino acid chain. When information about protein secondary structure, such as alpha helices and beta sheets, is also used, the sensitivity of pairwise alignment can be improved.
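The idea in the abstract can be illustrated with a minimal sketch: a standard Needleman-Wunsch global alignment whose substitution score adds a secondary-structure term to the usual residue term. This is not the thesis's algorithm, whose details are not given on this page. The function name `align_score`, the per-residue structure labels (`H` for helix, `E` for strand, `C` for coil), and all score values are assumptions chosen for illustration only.

```python
def align_score(seq_a, ss_a, seq_b, ss_b,
                match=2, mismatch=-1, gap=-2,
                ss_match=1, ss_mismatch=-1):
    """Global alignment score with an added secondary-structure term.

    seq_a, seq_b: amino acid strings (primary structure).
    ss_a, ss_b:   per-residue secondary-structure labels, same lengths
                  as the sequences (hypothetical labels: H, E, C).
    All score values are illustrative assumptions.
    """
    if len(seq_a) != len(ss_a) or len(seq_b) != len(ss_b):
        raise ValueError("each sequence needs one structure label per residue")

    n, m = len(seq_a), len(seq_b)
    # dp[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap          # seq_a prefix aligned against gaps only
    for j in range(1, m + 1):
        dp[0][j] = j * gap          # seq_b prefix aligned against gaps only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            residue = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            structure = ss_match if ss_a[i - 1] == ss_b[j - 1] else ss_mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + residue + structure,  # align the two residues
                dp[i - 1][j] + gap,                      # gap in seq_b
                dp[i][j - 1] + gap,                      # gap in seq_a
            )
    return dp[n][m]
```

With these assumed scores, two sequences that share no residues but have identical structure labels score higher than they would under residue scoring alone (set `ss_match=0, ss_mismatch=0` to compare), which is the kind of low-sequence-similarity, high-structure-similarity pair the abstract describes as a candidate template.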
目次 Table of Contents
ABSTRACT
Chapter 1. Introduction
1.1 DNA
1.2 RNA
1.3 Protein
1.4 Sequence Alignment
Chapter 2. Preliminaries
2.1 Dynamic Programming
2.2 Scoring Functions
2.3 Scoring Matrices
2.4 DNA Sequence Alignment
2.5 Protein Sequence Alignment
Chapter 3. Protein Structure Prediction
3.1 Determination of Structures of Proteins
3.2 Prediction of Protein Structures
Chapter 4. Our Method
4.1 Secondary Structure
4.2 A Novel Sequence Alignment Algorithm
Chapter 5. Conclusions
Appendixes
A. Score Matrix of Secondary Structure
BIBLIOGRAPHY
電子全文 Fulltext
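This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.
論文使用權限 Thesis access permission: available on campus immediately; available off campus after one year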
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
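Public information on printed theses is relatively complete from academic year 102 onward. For public information on printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 Available: 已公開 available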
