博碩士論文 etd-0728104-125530 詳細資訊
Title page for etd-0728104-125530
論文名稱
Title
蛋白質摺疊預測之基因演算法
Protein Folding Prediction with Genetic Algorithms
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
35
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2004-07-09
繳交日期
Date of Submission
2004-07-28
關鍵字
Keywords
摺疊、預測、基因演算法、蛋白質結構、二級結構
prediction, folding, secondary structure, genetic algorithm, protein structure
統計
Statistics
本論文已被瀏覽 5667 次,被下載 2318 次
This thesis/dissertation has been viewed 5667 times and downloaded 2318 times.
中文摘要
蛋白質其生物上之功能是決定於其三維空間的結構,這是眾所皆知的性質。因此,解決蛋白質結構的問題是研究蛋白質很重要的其中一項工作。然而,關於蛋白質如何摺疊成其三維空間的結構目前仍是沒有很明確的定論,因此預測蛋白質結構是一個非常具有挑戰性的任務。
本論文提出一個建構於晶格模型的基因演算法,以預測所謂的目標蛋白質之三維空間的結構,而假設其蛋白質序列與二級結構是已知的。
親疏水性模型是最簡化且最受歡迎的蛋白質摺疊模型之一。其考慮蛋白質結構中,胺基酸之間疏水性與疏水性的相互作用;但是,這些模型所預測的結構仍然不夠好。因此,我們認為還有其他特性應該考慮,例如二級結構、電荷與雙硫鍵。也就是說,在我們的基因演算法的適應性函式裡,除了考慮到疏水性成對的數量外,同時也考慮到每一個胺基酸是位在哪種二級結構。而既然一開始我們對於蛋白質如何摺疊沒有頭緒,所以事實上晶格模型是為了幫助我們得到目標蛋白質的一個初步之摺疊構形。
從預測結果與其真實結構的RMSD值之比較來看,這些額外的特性對於預測結構的確有所改進。
Abstract
It is well known that the biological function of a protein depends on its 3D structure. Determining protein structures is therefore one of the most important tasks in the study of proteins. However, protein structure prediction remains very challenging, because there is still no clear understanding of how a protein folds into its 3D structure.
In this thesis, we propose a genetic algorithm (GA) based on the lattice model to predict the 3D structure of a protein of unknown structure, called the target protein, whose primary sequence and secondary structure elements (SSEs) are assumed to be known.
The hydrophobic-hydrophilic model (HP model) is one of the simplest and most popular protein folding models. Methods based on it consider the hydrophobic-hydrophobic interactions within a protein structure, but their prediction results are still not satisfactory. We therefore suggest that other features should also be considered, such as SSEs, charges, and disulfide bonds. Specifically, the fitness function of our GA considers not only the number of hydrophobic-hydrophobic pairs, but also the kind of SSE to which each amino acid belongs. Since we initially have no idea how the target protein folds, the lattice model serves to give us a rough folding of it.
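As an illustration of the hydrophobic-hydrophobic term mentioned above, the following minimal sketch counts H-H contacts for a conformation on a 2D square lattice. The move-string encoding (`U`, `D`, `L`, `R`), the choice of a 2D lattice, and the function names `decode`, `is_self_avoiding`, and `hh_contacts` are assumptions made for this example; the abstract does not specify the thesis's lattice, encoding, or exact fitness function, and the SSE-dependent term is not reproduced here.

```python
from typing import List, Tuple

Coord = Tuple[int, int]

# Unit steps on a 2D square lattice (an assumed encoding, not taken from the thesis).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves: str) -> List[Coord]:
    """Turn a move string into lattice coordinates, starting at the origin."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def is_self_avoiding(coords: List[Coord]) -> bool:
    """A valid conformation places at most one residue on each lattice site."""
    return len(set(coords)) == len(coords)


def hh_contacts(sequence: str, coords: List[Coord]) -> int:
    """Count H-H pairs that are lattice neighbours but not chain neighbours."""
    position = {c: i for i, c in enumerate(coords)}
    count = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only to the right and upward so that each pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = position.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                count += 1
    return count
```

For example, the sequence `HPPH` folded with the moves `RUL` forms a unit square, and `hh_contacts("HPPH", decode("RUL"))` counts the single contact between the first and last residues. A GA of the kind described could use such a count as one term of its fitness, with further terms for the other features.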
By comparing our predictions with the real structures in terms of root-mean-square deviation (RMSD), we show that these additional features do improve prediction accuracy.
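For reference, the RMSD between two structures with the same number of corresponding points can be sketched as below. This is a minimal example, assuming the two coordinate lists are already superimposed and matched point by point; the abstract does not state whether or how the thesis superimposes structures before measuring RMSD, and the function name `rmsd` is chosen for this example only.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two equally long point lists.

    Assumes the structures are already superimposed; no alignment is done here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(a, b):
        # Squared Euclidean distance between corresponding points.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(a))
```

A lower value means the predicted coordinates lie closer to the reference coordinates; identical structures give 0.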
目次 Table of Contents
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
Chapter 1. Introduction
Chapter 2. Preliminaries
2.1 Amino Acids in Proteins
2.2 Levels of Protein Structures
2.3 The Hydrophobic-hydrophilic Model
2.4 Genetic Algorithm
Chapter 3. Protein Structure Prediction Methods
3.1 Strategies of Protein Structure Prediction
3.2 Representations of Protein Sequences on the Lattice Model
3.3 Previous PSP Methods for the HP Model
Chapter 4. A New Method Based on the Lattice Model
4.1 Motivation
4.2 Overall Steps of the Algorithm
4.3 The Fitness Function
4.4 Example
Chapter 5. Experimental Results
Chapter 6. Conclusions
BIBLIOGRAPHY
電子全文 Fulltext
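This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.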
論文使用權限 Thesis access permission: 校內立即公開,校外一年後公開 (available on campus immediately; available off campus after one year)
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
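Availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service counter of the library and information office (圖資處). We apologize for any inconvenience.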
開放時間 Available: 已公開 available
