博碩士論文 etd-0907110-202928 詳細資訊
Title page for etd-0907110-202928
論文名稱
Title
利用SVM改進蛋白質全原子結構預測之方法
Improvement of Protein All-atom Prediction with SVM
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
68
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2010-07-07
繳交日期
Date of Submission
2010-09-07
關鍵字
Keywords
骨架、預測、方法選取、蛋白質
prediction, tool preference, backbone, protein
統計
Statistics
本論文已被瀏覽 5748 次,被下載 1
This thesis/dissertation has been browsed 5748 times and downloaded 1 time.
中文摘要
有許多研究致力於蛋白質骨幹結構重建問題 (PBRP) ,像是Adcock的方法、MaxSprout、SABBAC、還有我們實驗室學長姐之前的研究。在之前的研究中, 王仁暉學長利用homology modeling的方式解決這個問題,之後張小燕學姐基於 AMBER能量力場進一步微調氧原子的位置。我們比較CASP7和8當中的結果, 發現某一些蛋白質序列在使用張學姐的方法時預測的比較準確,而某一些使用 SABBAC預測比較準確。因此,我們提出一個想法,利用SVM,在真正進行預 測之前先替蛋白質序列選擇預測結果較準確的方法。我們設計了幾個步驟選出 能夠分類較適合方法的特徵。實驗的資料集是在CASP7和8中僅包含20種標準胺 基酸的30和24條蛋白質序列。而實驗結果顯示,對於之前的研究來說,在 CASP7和8中,RMSD值還可以再進步7.39%和2.94%。只要有軟體設計來解決同 樣的問題,我們的方法可以應用在各方面的預測上。
Abstract
Many studies have addressed the all-atom protein backbone reconstruction problem (PBRP), including Adcock’s method, MaxSprout, SABBAC, and Chang’s method. In previous work, Wang et al. approached this problem with homology modeling. Chang et al. then improved Wang’s results by refining the positions of the oxygen atoms based on the AMBER force field. Comparing the results of Chang et al. and SABBAC v1.2 on CASP7 and CASP8, we find that some proteins are predicted more accurately by Chang’s method and others by SABBAC. Based on SVM, we propose a tool preference classification method that determines, before the actual prediction, which tool is likely to be the better one for predicting the structure of a target protein. We design a series of steps to select better feature sets for the SVM. Our method is tested on the proteins in the CASP7 and CASP8 datasets that contain only the 20 standard amino acids, which amount to 30 and 24 protein sequences, respectively. The experimental results show that our method achieves RMSD improvements of 7.39% and 2.94% over Chang’s results on CASP7 and CASP8, respectively. Our method can also be applied to other effective prediction methods, including those developed in the future.
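The selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (`rmsd`, `select_by_preference`, `relative_improvement`) are hypothetical, the numbers are made up, the classifier output is supplied as a plain list of booleans instead of coming from a trained SVM, and "improvement" is assumed here to mean the relative reduction in mean RMSD against the baseline tool, which the abstract does not define.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) atom positions.

    Assumes the two structures are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def select_by_preference(rmsd_tool_a, rmsd_tool_b, prefer_b):
    """Per target, keep tool B's RMSD if the classifier prefers B, else tool A's."""
    return [b if use_b else a
            for a, b, use_b in zip(rmsd_tool_a, rmsd_tool_b, prefer_b)]


def relative_improvement(baseline, selected):
    """Percentage reduction in mean RMSD relative to the baseline (assumed definition)."""
    base_mean = sum(baseline) / len(baseline)
    sel_mean = sum(selected) / len(selected)
    return 100.0 * (base_mean - sel_mean) / base_mean


# Made-up per-target RMSD values for two tools (illustration only).
rmsd_a = [0.50, 0.40, 0.60, 0.50]      # baseline tool
rmsd_b = [0.45, 0.50, 0.50, 0.60]      # alternative tool
prefer_b = [True, False, True, False]  # stand-in for the classifier's predictions

selected = select_by_preference(rmsd_a, rmsd_b, prefer_b)
improvement = relative_improvement(rmsd_a, selected)
print(f"mean RMSD improvement over baseline: {improvement:.2f}%")
```

With a perfect classifier, as in this toy input, the selected RMSD for every target is the smaller of the two tools' values, so the mean can never be worse than the baseline. A misclassified target would instead raise it.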
目次 Table of Contents
ABSTRACT ..... 0
Chapter 1. Introduction ..... 1
Chapter 2. Preliminaries ..... 3
2.1 Proteins ..... 3
2.1.1 Properties of Proteins ..... 3
2.1.2 Amino Acids and Peptides ..... 3
2.1.3 Levels of Protein Structures ..... 4
2.2 Root Mean Square Deviation ..... 8
2.3 Previous Works and Motivation ..... 10
2.4 Support Vector Machine (SVM) ..... 14
Chapter 3. The Tool Preference Classification Method ..... 18
3.1 New Feature Extraction ..... 18
3.2 Step 0: Feature Set Selection and Feature Set Reorganization ..... 22
3.3 Step 1: The Training Set Filtering ..... 22
3.4 Step 2: Training Set Weighting ..... 26
3.5 Step 3: SVM Classification ..... 26
Chapter 4. Experimental Results: Part I ..... 28
4.1 Feature Set Selection: Part I ..... 29
4.2 Training Set Filtering: Part I ..... 29
4.3 Training Set Weighting Step: Part I ..... 32
4.4 The Self-Test and Independent Test: Part I ..... 33
Chapter 5. Experimental Results: Part II ..... 36
5.1 Feature Set Selection: Part II ..... 36
5.2 Training Set Filtering: Part II ..... 36
5.3 Training Set Weighting: Part II ..... 39
5.4 The Self-Test and Independent Test: Part II ..... 41
Chapter 6. Conclusion ..... 43
BIBLIOGRAPHY ..... 49
參考文獻 References
[1] S. A. Adcock, “Peptide backbone reconstruction using dead-end elimination and a knowledge-based forcefield,” Journal of Computational Chemistry, Vol. 25, pp. 16–27, 2004.
[2] A. Anand, G. Pugalenthi, and P. N. Suganthan, “Predicting protein structural class by SVM with class-wise optimized features and decision probabilities,” Journal of Theoretical Biology, Vol. 253, pp. 375–380, 2008.
[3] C. C. Chang and C. J. Lin, LIBSVM: A library for support vector machines. National Taiwan University, No. 1, Roosevelt Rd. Sec. 4, Taipei, Taiwan 106, ROC, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[4] H. Y. Chang, C. B. Yang, and H. Y. Ann, “Refinement on O atom positions for protein backbone prediction,” Proceedings of the 2nd WSEAS International Conference on BIOMEDICAL ELECTRONICS and BIOMEDICAL INFORMATICS, 2007.
[5] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, Jr., D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell, and P. A. Kollman, “A second generation force field for the simulation of proteins, nucleic acids, and organic molecules,” Journal of the American Chemical Society, Vol. 117, pp. 5179–5197, 1995.
[6] H. Q. Ding and I. Dubchak, “Multi-class protein fold recognition using support vector machines and neural networks,” Bioinformatics, Vol. 17, No. 4, pp. 349–358, 2001.
[7] I. Dubchak, I. Muchnik, C. Mayor, I. Dralyuk, and S. H. Kim, “Recognition of a protein fold in the context of the SCOP classification,” PROTEINS: Structure, Function, and Genetics, Vol. 35, pp. 401–407, 1999.
[8] J. Guo, H. Chen, Z. Sun, and Y. Lin, “A novel method for protein secondary structure prediction using dual-layer SVM and profiles,” PROTEINS: Structure, Function, and Bioinformatics, Vol. 54, pp. 738–743, 2004.
[9] R. Gupta, A. Mittal, and K. Singh, “A time-series-based feature extraction approach for prediction of protein structural class,” EURASIP Journal on Bioinformatics and Systems Biology, Vol. 2008, pp. 1–7, 2008.
[10] L. Holm and C. Sander, “Database algorithm for generating protein backbone and side-chain coordinates from a C alpha trace application to model building and detection of coordinate errors,” Journal of Molecular Biology, Vol. 21, No. 1, pp. 183–194, 1991.
[11] J. L. Hsin, C. B. Yang, K. S. Huang, and C. N. Yang, “An ant colony optimization approach for the protein side chain packing problem,” Proceedings of the 6th WSEAS International Conference on Microelectronics, Nanoelectronics, Optoelectronics, Istanbul, Turkey, pp. 44–49, 2007.
[12] R. Kazmierkiewicz, A. Liwo, and H. A. Scheraga, “Energy-based reconstruction of a protein backbone from its α-carbon trace by a Monte-Carlo method,” Journal of Computational Chemistry, Vol. 23, pp. 715–723, 2002.
[13] C. Lee and S. Subbiah, “Prediction of protein side-chain conformation by packing optimization,” Journal of Molecular Biology, Vol. 217, No. 2, pp. 373–388, 1991.
[14] S. Liang and N. V. Grishin, “Side-chain modeling with an optimized scoring function,” Protein Science, Vol. 11, No. 2, pp. 322–331, 2002.
[15] V. N. Maiorov and G. M. Crippen, “Significance of root-mean-square deviation in comparing three-dimensional structures of globular proteins,” Journal of Molecular Biology, Vol. 235, No. 2, pp. 625–634, 1994.
[16] J. Maupetit, R. Gautier, and P. Tuffery, “SABBAC: online structural alphabet-based protein backbone reconstruction from alpha-carbon trace,” Nucleic Acids Research, Vol. 34, pp. W147–W151, 2006.
[17] M. Milik, A. Kolinski, and J. Skolnick, “Algorithm for rapid reconstruction of protein backbone from alpha carbon coordinates,” Journal of Computational Chemistry, Vol. 18, pp. 80–85, 1997.
[18] J. D. Qiu, X. Y. Sun, J. H. Huang, and R. P. Liang, “Prediction of the types of membrane proteins based on discrete wavelet transform and support vector machines,” The Protein Journal, Vol. 29, pp. 114–119, 2010.
[19] I. Ruczinski, C. Kooperberg, R. Bonneau, and D. Baker, “Distributions of beta sheets in proteins with application to structure prediction,” Proteins: Structure, Function, and Genetics, Vol. 48, No. 1, pp. 85–97, 2002.
[20] A. A. Tantar, N. Melab, and E. G. Talbi, “A comparative study of parallel metaheuristics for protein structure prediction on the computational grid,” International Parallel and Distributed Processing Symposium (IPDPS 2007), Long Beach, California, USA, pp. 1–10, 2007.
[21] J. H. Wang, C. B. Yang, and C. T. Tseng, “Reconstruction of protein backbone with the α-carbon coordinates,” Proceedings of 2007 National Computer Symposium, Taichung, Taiwan, pp. 136–143, 2007.
電子全文 Fulltext
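This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.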
論文使用權限 Thesis access permission: available on campus after one year; never available off campus
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus:永不公開 not available

紙本論文 Printed copies
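Availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service counter at the Office of Library and Information Services. We apologize for any inconvenience.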
開放時間 available 已公開 available
