Thesis/dissertation etd-0810101-223823: detailed information
Title page for etd-0810101-223823
Title
Approximation Algorithms for Constructing Evolutionary Trees (建構演化樹之近似演算法)
Department
Year, semester
Language
Degree
Number of pages
58
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2001-06-01
Date of Submission
2001-08-10
Keywords
dynamic programming, evolutionary tree, ultrametric, bioinformatics, MUT, phylogeny, circular order
Statistics
The thesis/dissertation has been browsed 5680 times and downloaded 1709 times.
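Abstract (translated from Chinese)
Evolutionary trees are not only essential for biological classification; they can also reveal the origin of life. Using DNA, the most primitive and most direct source of information, and setting aside the physical structure of organisms, we attempt to build the evolutionary tree closest to the true course of evolution.
Evolutionary trees can be constructed on three kinds of basis: character-based, sequence-based, and distance-based. Beyond the choice of basis, there are many different scoring functions. Except for a few special scoring functions combined with special input data, the vast majority of evolutionary tree problems are NP-hard. We design distance-based algorithms for constructing evolutionary trees.
We propose two methods for constructing evolutionary trees:
Method 1: We propose an algorithm and, using the fact that the distance function satisfies the triangle inequality, prove that its approximation ratio is at most 1.44 log n + 1.
Method 2: We propose a dynamic programming algorithm that constructs the optimal ultrametric evolutionary tree when the order of the leaf nodes is fixed. We also search for a good circular order of the leaf nodes and then use this order to construct the evolutionary tree.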


Abstract
In this thesis, we propose heuristic algorithms for constructing evolutionary trees under the distance-based model. For a distance matrix of any type, constructing a minimum ultrametric tree (MUT), whose scoring function is the minimum tree size, is NP-hard. Furthermore, constructing an approximate ultrametric tree with approximation error ratio within $n^{\epsilon}$, $\epsilon > 0$, is also NP-hard. When the distance matrix is metric, the problem is called the triangle minimum ultrametric tree problem ($\triangle$MUT). For $\triangle$MUT, a previous approximation algorithm achieves error ratio $\leq 1.5 ( \lceil \log n \rceil + 1 )$. We propose an improved algorithm for $\triangle$MUT with error ratio $\leq \lceil \log_{\alpha} n \rceil + 1 \cong 1.44 \lceil \log n \rceil + 1$, where $\alpha = \frac{\sqrt{5}+1}{2}$.
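
The constant 1.44 in the improved bound can be checked with a change of base. This short derivation assumes that the unsubscripted $\log$ is the base-2 logarithm, which the abstract does not state explicitly:

\[
\log_{\alpha} n = \frac{\log_2 n}{\log_2 \alpha}, \qquad
\log_2 \alpha = \log_2 \frac{\sqrt{5}+1}{2} \approx 0.6942, \qquad
\frac{1}{\log_2 \alpha} \approx 1.4404 .
\]

Under that assumption $\log_{\alpha} n \approx 1.44 \log_2 n$, so the leading constant drops from 1.5 in the earlier bound to about 1.44.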

We also propose a heuristic algorithm, based on a clustering scheme, to obtain a good leaf node circular order. We then design a dynamic programming algorithm that constructs the optimal ultrametric tree for a fixed leaf node circular order. The dynamic programming runs in $O(n^3)$ time when the scoring function is the minimum tree size or the $L^1$-min increment.
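
The fixed-order dynamic programming can be illustrated with a minimal Python sketch. It is not the thesis's algorithm as written: it assumes a linear rather than circular leaf order, handles only the minimum tree size score, and uses hypothetical names (`min_ultrametric_tree_size`, `dist`, `order`). It relies on two standard facts: the lowest feasible height of the node above a contiguous block of leaves is half the largest distance inside that block, and the size of a binary ultrametric tree equals the root height plus the sum of all internal node heights.

```python
from typing import Sequence


def min_ultrametric_tree_size(dist: Sequence[Sequence[float]],
                              order: Sequence[int]) -> float:
    """Minimum size of an ultrametric tree whose leaves appear in `order`.

    Assumptions (not taken from the thesis): `dist` is a symmetric matrix,
    `order` is a linear left-to-right leaf order, and tree distances must be
    at least the matrix distances.
    """
    n = len(order)
    if n < 2:
        return 0.0

    # height[i][j]: lowest feasible height of a node covering order[i..j],
    # i.e. half of the largest pairwise distance inside the block.
    height = [[0.0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            height[i][j] = max(height[i][j - 1],
                               height[i + 1][j],
                               dist[order[i]][order[j]] / 2.0)

    # best[i][j]: minimum sum of internal node heights over all binary trees
    # on order[i..j]; the split point k separates left and right subtrees.
    best = [[0.0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best[i][j] = height[i][j] + min(
                best[i][k] + best[k + 1][j] for k in range(i, j))

    # Tree size = root height + sum of internal node heights.
    return best[0][n - 1] + height[0][n - 1]
```

There are $O(n^2)$ blocks and each tries $O(n)$ split points, which matches the $O(n^3)$ bound stated above. For the matrix `[[0, 2, 4], [2, 0, 4], [4, 4, 0]]`, the order `[0, 1, 2]` gives size 5.0 while `[0, 2, 1]` gives 6.0, which shows why the choice of leaf order matters.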


Table of Contents
Chapter 1. Introduction
Chapter 2. Preliminaries
2.1 Distance Matrices and Evolutionary Trees
2.2 Optimization Problems
2.2.1 Scoring functions
2.2.2 Complexity
2.3 The Unweighted Pair Group Method with Maximum (UPGMM) for Ultrametric Trees
2.4 The Neighbor Joining Method for Additive Trees
2.5 An Approximation for $\triangle$MUT
Chapter 3. An Approximation Algorithm for $\triangle$MUT
3.1 The Binary Splitting Tree Problem
3.2 Our Approximation Algorithm
Chapter 4. A Heuristic Algorithm for MUT
4.1 The Leaf Node Circular Order
4.2 The Optimal Ultrametric Tree for a Certain Fixed Leaf Node Circular Order
Chapter 5. Experiment Results
Chapter 6. Conclusion

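Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available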


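Printed copies
Public-access information for printed theses is relatively complete only from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available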
