Title page for etd-0821103-201717 (thesis/dissertation record details)
Title
Motif Finding in Biological Sequences (生物序列樣本的尋找方法)
Department
Year, semester
Language
Degree
Number of pages
38
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2003-06-20
Date of Submission
2003-08-21
Keywords
Computational Biology, Motif Finding, ACO Algorithm, Local Sequence Alignment
Statistics
This thesis has been viewed 5,634 times and downloaded 2,732 times.
Abstract
A huge amount of genomic information, including protein and DNA sequences, has been generated by the Human Genome Project. Deciphering these sequences is difficult. One way to decipher them is to detect local residue patterns shared among multiple sequences; however, detecting unknown patterns across multiple sequences remains very difficult. In this thesis, we propose an algorithm, based on the Gibbs sampler method, for identifying local consensus patterns (motifs) in monomolecular sequences. We first design an ACO (ant colony optimization) algorithm to find a good initial solution and a set of better candidate positions for revising the motif. The Gibbs sampler method is then applied with these candidate positions as input. Our algorithm drastically reduces the time required to find motifs: it takes only 20% of the time of the Gibbs sampler method while maintaining comparable quality.
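The second stage described in the abstract can be illustrated with a minimal Gibbs site sampler. This is only a sketch under stated assumptions, not the thesis's implementation: it assumes a DNA alphabet, exactly one motif occurrence per sequence, a uniform background, and a fixed motif width. The names `profile`, `motif_score`, `gibbs_motif`, and `init_positions` are hypothetical. The ACO seeding stage is not implemented here; `init_positions` only marks where start positions from such a seeding step could be supplied in place of random ones.

```python
import math
import random

ALPHABET = "ACGT"  # assumption: DNA sequences

def profile(seqs, positions, width, skip=None, pseudo=1.0):
    """Column-wise residue frequencies of the current sites, with pseudocounts."""
    counts = [{a: pseudo for a in ALPHABET} for _ in range(width)]
    for i, (s, p) in enumerate(zip(seqs, positions)):
        if i == skip:  # leave out the sequence being resampled
            continue
        for j in range(width):
            counts[j][s[p + j]] += 1
    return [{a: c[a] / sum(c.values()) for a in ALPHABET} for c in counts]

def motif_score(seqs, positions, width):
    """Relative entropy (in bits) of the site profile against a uniform background."""
    bg = 1.0 / len(ALPHABET)
    prof = profile(seqs, positions, width)
    return sum(p * math.log2(p / bg) for col in prof for p in col.values())

def gibbs_motif(seqs, width, init_positions=None, iters=200, seed=0):
    """Resample one site at a time, weighted by its fit to the other sites."""
    rng = random.Random(seed)
    if init_positions is not None:
        positions = list(init_positions)  # e.g. a seed from a heuristic stage
    else:
        positions = [rng.randrange(len(s) - width + 1) for s in seqs]
    bg = 1.0 / len(ALPHABET)
    for _ in range(iters):
        i = rng.randrange(len(seqs))
        prof = profile(seqs, positions, width, skip=i)
        s = seqs[i]
        weights = []
        for p in range(len(s) - width + 1):
            w = 1.0
            for j in range(width):
                w *= prof[j][s[p + j]] / bg  # likelihood ratio of this window
            weights.append(w)
        positions[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions
```

Calling `gibbs_motif(seqs, width, init_positions=seed_positions)` starts the sampler from the supplied sites, and `motif_score` gives one simple way to compare the quality of two sets of positions.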


Table of Contents
Chapter 1. Introduction
Chapter 2. Some Methods for Motif Finding
2.1 The Motif Finding Problem
2.2 The Gibbs Sampler Method
2.3 Expectation Maximization
Chapter 3. Ant Colony Optimization Algorithm (ACO)
3.1 The Traveling Salesman Problem (TSP)
3.2 The Subset Problem
Chapter 4. Our Method
Chapter 5. Experimental Results
5.1 Experiments on Real Domains
5.2 Experiments on Artificial Domains
Chapter 6. Conclusions
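Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available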


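Printed copies
Public information on printed theses is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available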
