博碩士論文 etd-0906111-171625 詳細資訊
Title page for etd-0906111-171625
論文名稱
Title
利用混合模型之雙硫鍵預測方法
Disulfide Bond Prediction with Hybrid Models
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
42
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2011-08-31
繳交日期
Date of Submission
2011-09-06
關鍵字
Keywords
半胱胺酸、雙硫鍵、混合模型、支持向量機、預測
prediction, SVM, disulfide bond, cysteine, hybrid model
統計
Statistics
本論文已被瀏覽 5631 次,被下載 1132
The thesis/dissertation has been browsed 5631 times and downloaded 1132 times.
中文摘要
雙硫鍵是存在於蛋白質裡兩個半胱胺酸之間的一種特殊共價鍵。這種共價鍵
對蛋白質的折疊跟穩定作用上扮演很重要的角色。在雙硫鍵連結模式預測這一方
面,可能的連接模式會因半胱胺酸數量的增加而急速成長,而成為一個難題。在
這篇論文中,我們針對這個問題提出了一個改進方法。這個方法是以支持向量機
作為基礎的模型方法。經由這個策略,我們可以藉由選取適合的模型來增加預測
的準確度。為了評估我們方法的效能,我們使用SP39 這個擁有446 條蛋白質的
資料集,並採用4 次交叉驗證的方法。我們在配對模式跟連結模式上個別達到
70.8%以及65.9%,且得到比前人較好的成果。
Abstract
Disulfide bonds are special covalent cross-links between two cysteines in a
protein. These bonds play an important role in protein folding and
stabilization. Predicting the connectivity pattern is difficult because the
number of possible patterns grows rapidly with the number of cysteines. In this
thesis, we propose a new approach to this problem, based on hybrid models with
SVM. With this strategy, we improve prediction accuracy by selecting the
appropriate model. To evaluate the performance of our method, we apply 4-fold
cross-validation on the SP39 dataset, which contains 446 proteins. We achieve
accuracies of 70.8% and 65.9% for pair-wise and pattern-wise prediction,
respectively, which are better than those of previous works.
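The rapid growth of possible patterns mentioned in the abstract can be made concrete. The sketch below is illustrative and is not taken from the thesis. It assumes every cysteine under consideration is bonded, so a protein with n disulfide bonds has (2n)! / (n! 2^n) candidate connectivity patterns. It also computes pair-wise and pattern-wise accuracy under definitions commonly used in the connectivity-prediction literature (the fraction of true bonds recovered, and the fraction of proteins whose whole pattern is recovered); the thesis's own definitions belong to its Section 4.2 and are not reproduced in this record. All function names are hypothetical.

```python
from math import factorial


def count_connectivity_patterns(n_bonds):
    """Count the ways to pair 2 * n_bonds bonded cysteines into n_bonds bonds."""
    if n_bonds < 0:
        raise ValueError("n_bonds must be non-negative")
    # (2n)! / (n! * 2^n): order within a pair and order of pairs do not matter.
    return factorial(2 * n_bonds) // (factorial(n_bonds) * 2 ** n_bonds)


def enumerate_patterns(cysteines):
    """Yield every connectivity pattern for an even-length list of positions."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    # Pair the first cysteine with each possible partner, then recurse.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in enumerate_patterns(remaining):
            yield [(first, partner)] + tail


def accuracies(true_patterns, predicted_patterns):
    """Return (pair-wise, pattern-wise) accuracy over a list of proteins.

    Assumed definitions: pair-wise is correct bonds / true bonds,
    pattern-wise is exactly matched proteins / all proteins.
    """
    correct_bonds = total_bonds = correct_proteins = 0
    for true, pred in zip(true_patterns, predicted_patterns):
        # Normalize each bond so (2, 1) and (1, 2) compare equal.
        true_set = {tuple(sorted(bond)) for bond in true}
        pred_set = {tuple(sorted(bond)) for bond in pred}
        correct_bonds += len(true_set & pred_set)
        total_bonds += len(true_set)
        correct_proteins += true_set == pred_set
    return correct_bonds / total_bonds, correct_proteins / len(true_patterns)
```

Under this assumption, two bonds give 3 candidate patterns, three give 15, and five already give 945, which is why exhaustive scoring of patterns becomes costly as the cysteine count rises.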
目次 Table of Contents
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
Chapter 1. Introduction
Chapter 2. Preliminaries
2.1 Position-Specific Score Matrix
2.2 Secondary Structure
2.3 Support Vector Machine
2.4 Previous Works
2.4.1 Chung’s Method with Down-sampling
2.4.2 Song’s Method with Secondary Structure Information
2.4.3 Zhu’s Method Using Feature Selection
Chapter 3. Algorithms for Disulfide Bond Prediction
3.1 Motivation
3.2 Feature Extraction
3.3 Algorithm for Connectivity Prediction
Chapter 4. Experimental Results
4.1 Dataset
4.2 Performance Evaluation
4.3 Connectivity Prediction Experiments
Chapter 5. Conclusion
BIBLIOGRAPHY
參考文獻 References
[1] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic
local alignment search tool,” Journal of Molecular Biology, Vol. 215, No. 3,
pp. 403–410, 1990.
[2] S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller,
and D. J. Lipman, “Gapped BLAST and PSI-BLAST: a new generation of protein
database search programs,” Nucleic Acids Research, Vol. 25, No. 17,
pp. 3389–3402, 1997.
[3] C.-C. Chang and C.-J. Lin, “LIBSVM: a library for support vector machines,”
2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[4] H. Y. Chang, C. B. Yang, and H. Y. Ann, “Refinement on O atom positions
for protein backbone prediction,” Proceedings of the 2nd WSEAS International
Conference on Biomedical Electronics and Biomedical Informatics, 2009.
[5] Y.-C. Chen and J.-K. Hwang, “Prediction of disulfide connectivity from protein
sequences,” PROTEINS: Structure, Function, and Genetics, Vol. 61, pp. 507–
512, 2005.
[6] Y.-C. Chen, Y.-S. Lin, C.-J. Lin, and J.-K. Hwang, “Prediction of the bonding
states of cysteines using the support vector machines based on multiple feature
vectors and cysteine state sequences,” PROTEINS: Structure, Function, and
Genetics, Vol. 55, pp. 1036–1042, 2004.
[7] W.-C. Chung, “A multi-phase approach for disulfide bond prediction,” Master’s
Thesis, Department of Computer Science and Engineering, National Sun
Yat-Sen University, Kaohsiung, Taiwan, 2009.
[8] W.-C. Chung, C.-B. Yang, and C.-Y. Hor, “An effective tuning method for
cysteine state classification,” Proc. of National Computer Symposium, Workshop
on Algorithms and Bioinformatics, Taipei, Taiwan, Nov. 27-28, 2009.
[9] P. Frasconi, A. Passerini, and A. Vullo, “A two-stage SVM architecture for
predicting the disulfide bonding state of cysteines,” Proceedings of the 2002
12th IEEE Workshop on Neural Networks for Signal Processing, pp. 25–34,
2002.
[10] D. T. Jones, “Protein secondary structure prediction based on position-specific
scoring matrices,” Journal of Molecular Biology, Vol. 292, No. 2, pp. 195–202,
1999.
[11] W. Kabsch and C. Sander, “Dictionary of protein secondary structure:
Pattern recognition of hydrogen-bonded and geometrical features,” Biopolymers,
Vol. 22, pp. 2577–2637, 1983.
[12] J. R. G. L., A. P. Shilton, M. M. Parker, and M. Palaniswami, “Prediction
of cystine connectivity using SVM,” Bioinformation, Vol. 1, No. 2, pp. 69–74,
2005.
[13] H.-L. Liu and S.-C. Chen, “Prediction of disulfide connectivity in proteins with
support vector machine,” Journal of the Chinese Institute of Chemical
Engineers, Vol. 38, No. 1, pp. 63–70, 2007.
[14] C.-H. Lu, Y.-C. Chen, C.-S. Yu, and J.-K. Hwang, “Predicting disulfide
connectivity patterns,” PROTEINS: Structure, Function, and Genetics, Vol. 67,
pp. 262–270, 2007.
[15] R. Singh, “A review of algorithmic techniques for disulfide-bond
determination,” Briefings in Functional Genomics and Proteomics, Vol. 7, No. 2,
pp. 157–172, 2008.
[16] J. Song, Z. Yuan, H. Tan, T. Huber, and K. Burrage, “Predicting disulfide
connectivity from protein sequence using multiple sequence feature vectors and
secondary structure,” Bioinformatics, Vol. 23, No. 23, pp. 3147–3154, 2007.
[17] C.-H. Tsai, B.-J. Chen, C.-H. Chan, H.-L. Liu, and C.-Y. Kao, “Improving
disulfide connectivity prediction with sequential distance between oxidized
cysteines,” Bioinformatics, Vol. 21, No. 24, pp. 4416–4419, 2005.
[18] M. Vincent, A. Passerini, M. Labbe, and P. Frasconi, “A simplified approach to
disulfide connectivity prediction from protein sequences,” BMC Bioinformatics,
Vol. 9, No. 1, p. 20, 2008.
[19] L. Zhu, J. Yang, J.-N. Song, K.-C. Chou, and H.-B. Shen, “Cysteine
separations profiles (CSP) on protein sequences infer disulfide connectivity,” Journal
of Computational Chemistry, Vol. 31, No. 7, pp. 1415–1420, 2009.
電子全文 Fulltext
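This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.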
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
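Public-availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up this information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.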
開放時間 available 已公開 available
