論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available
論文名稱 Title |
建構演化樹之近似演算法
Approximation Algorithms for Constructing Evolutionary Trees |
||
系所名稱 Department |
|||
畢業學年期 Year, semester |
語文別 Language |
||
學位類別 Degree |
頁數 Number of pages |
58 |
|
研究生 Author |
|||
指導教授 Advisor |
|||
召集委員 Convenor |
|||
口試委員 Advisory Committee |
|||
口試日期 Date of Exam |
2001-06-01 |
繳交日期 Date of Submission |
2001-08-10 |
關鍵字 Keywords |
dynamic programming, evolutionary tree, ultrametric, bioinformatics, MUT, phylogeny, circular order |
||
統計 Statistics |
This thesis has been viewed 5680 times and downloaded 1709 times. |
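Chinese abstract (English translation) |
Evolutionary trees are not only essential to biological classification; they can also be used to trace the origin of life. Starting from DNA, the most primitive and most direct source of information, and setting aside the anatomical structure of organisms, we try to build the evolutionary tree that is closest to the true course of evolution. Evolutionary trees can be constructed on three kinds of basis: characters (character base), sequences (sequence base), and distances (distance base). Besides the choice of basis, there are many different scoring functions. Except for a few special scoring functions combined with special input data, the vast majority of evolutionary tree problems are NP-hard. We design algorithms that construct evolutionary trees on the distance basis. We propose two methods. Method 1: an algorithm whose approximation ratio, proved by using the triangle inequality satisfied by the distance function, is at most 1.44 log n + 1. Method 2: a dynamic programming algorithm that constructs the optimal ultrametric evolutionary tree when the order of the leaf nodes is fixed. We also search for a good circular order of the leaf nodes and then use this order to construct the evolutionary tree. |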
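The constant 1.44 in the bound of Method 1 can be checked numerically. The sketch below assumes that the unsubscripted logarithm is base 2 and that the constant arises from converting a golden-ratio logarithm to base 2, as the English abstract's $\alpha = \frac{\sqrt{5}+1}{2}$ suggests. The function names are illustrative and do not come from the thesis.

```python
import math

# Golden ratio, the base alpha used in the improved bound.
PHI = (math.sqrt(5) + 1) / 2

# log_PHI(n) = log2(n) / log2(PHI), so the conversion factor is 1 / log2(PHI).
FACTOR = 1 / math.log2(PHI)


def previous_bound(n: int) -> float:
    """Earlier ratio quoted in the abstract: 1.5 * (ceil(log2 n) + 1)."""
    return 1.5 * (math.ceil(math.log2(n)) + 1)


def improved_bound(n: int) -> int:
    """Improved ratio quoted in the abstract: ceil(log_PHI n) + 1."""
    return math.ceil(math.log(n) / math.log(PHI)) + 1


if __name__ == "__main__":
    print(f"1 / log2(phi) = {FACTOR:.4f}")
    for n in (8, 100, 1024):
        print(n, previous_bound(n), improved_bound(n))
```

Under these assumptions the factor evaluates to roughly 1.44, which matches the constant stated in the abstract.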
Abstract |
In this thesis, we propose heuristic algorithms for constructing evolutionary trees under the distance-based model. For a distance matrix of any type, the problem of constructing a minimum ultrametric tree (MUT), whose scoring function is the minimum tree size, is NP-hard. Furthermore, constructing an approximate ultrametric tree with approximation error ratio within $n^{\epsilon}$, $\epsilon > 0$, is also NP-hard. When the distance matrix is metric, the problem is called the triangle minimum ultrametric tree problem ($\triangle$MUT). For the $\triangle$MUT problem, a previous approximation algorithm achieves an error ratio $\leq 1.5 (\lceil \log n \rceil + 1)$. We propose an improved algorithm with error ratio $\leq \lceil \log_{\alpha} n \rceil + 1 \cong 1.44 \lceil \log n \rceil + 1$, where $\alpha = \frac{\sqrt{5}+1}{2}$. We also propose a heuristic algorithm, based on a clustering scheme, for obtaining a good circular order of the leaf nodes. We then design a dynamic programming algorithm that constructs the optimal ultrametric tree for a fixed leaf node circular order. The dynamic programming algorithm runs in $O(n^3)$ time if the scoring function is the minimum tree size or the $L^1$-min increment. |
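To illustrate the fixed-order dynamic programming idea, here is a minimal sketch. It is not the thesis's own algorithm, which this record does not spell out. It assumes a linear leaf order (one cut of the circular order), that the tree distance between two leaves must be at least the matrix entry, and that tree size is the sum of edge weights. The names `min_ultrametric_tree_size` and `dist` are hypothetical. Each interval of consecutive leaves gets the lowest feasible root height, and an interval DP picks the best split, giving $O(n^3)$ time.

```python
from typing import Sequence


def min_ultrametric_tree_size(dist: Sequence[Sequence[float]]) -> float:
    """Smallest total edge weight of a binary ultrametric tree whose leaves
    appear in index order 0..n-1 and whose leaf-to-leaf tree distances are
    at least the entries of `dist` (assumed symmetric, zero diagonal)."""
    n = len(dist)
    if n == 0:
        return 0.0

    # height[i][j]: lowest feasible height of a subtree root covering leaves
    # i..j, i.e. half of the largest pairwise distance inside the interval.
    height = [[0.0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            height[i][j] = max(height[i][j - 1], height[i + 1][j], dist[i][j] / 2.0)

    # cost[i][j]: minimum subtree size over leaves i..j. Splitting after leaf k
    # adds two edges from the new root down to the roots of the two parts.
    cost = [[0.0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j]
                + (height[i][j] - height[i][k])
                + (height[i][j] - height[k + 1][j])
                for k in range(i, j)
            )
    return cost[0][n - 1]


if __name__ == "__main__":
    example = [
        [0, 2, 4],
        [2, 0, 4],
        [4, 4, 0],
    ]
    # Leaves 0 and 1 join at height 1; the root sits at height 2.
    print(min_ultrametric_tree_size(example))  # 5.0
```

Handling a true circular order would additionally require trying each rotation or a wrap-around interval formulation, which this sketch leaves out.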
目次 Table of Contents |
Chapter 1. Introduction
Chapter 2. Preliminaries
  2.1 Distance Matrices and Evolutionary Trees
  2.2 Optimization Problems
    2.2.1 Scoring functions
    2.2.2 Complexity
  2.3 The Unweighted Pair Group Method with Maximum (UPGMM) for Ultrametric Trees
  2.4 The Neighbor Joining Method for Additive Trees
  2.5 An Approximation for $\triangle$MUT
Chapter 3. An Approximation Algorithm for $\triangle$MUT
  3.1 The Binary Splitting Tree Problem
  3.2 Our Approximation Algorithm
Chapter 4. A Heuristic Algorithm for MUT
  4.1 The Leaf Node Circular Order
  4.2 The Optimal Ultrametric Tree for a Certain Fixed Leaf Node Circular Order
Chapter 5. Experiment Results
Chapter 6. Conclusion |
電子全文 Fulltext |
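This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law. |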
紙本論文 Printed copies |
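Public information on printed theses is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: available |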