Title page for etd-0706113-113208
Title
結合微陣列時序資料與本體論知識於基因分群及網路重建之研究
Using Microarray Time Series Data and Gene Ontology for Gene Clustering and Network Reconstruction
Department
Year, semester
Language
Degree
Number of pages
69
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2013-06-28
Date of Submission
2013-08-06
Keywords
gene clustering, gene regulatory network, Gene Ontology, Boolean network (BoolNet), microarray time series data
Statistics
This thesis has been viewed 5860 times and downloaded 519 times.
Abstract
In recent years, reconstructing gene regulatory networks from microarray time series data has become a common approach. However, the number of genes involved is usually very large. We therefore want to cluster the genes appropriately before reconstruction, so that genes that interact with one another fall into the same group. Many clustering algorithms have already been applied to gene clustering.
Our approach combines multiple data sources. On the one hand, this avoids over-reliance on a single source: when only one data source is available, its quality strongly affects both clustering and reconstruction. On the other hand, we expect it to improve clustering performance and, in turn, the accuracy of the gene regulatory network reconstructed afterwards. In this study we combine two different types of data: microarray time series data and the Gene Ontology. We quantify the Gene Ontology information and combine it with the time series data by scaling the coordinates up or down, which adjusts the relative weight of the time series data and the Gene Ontology. Each clustering is run under twenty-one weight settings, and each setting is clustered separately with a partitional clustering algorithm. Reconstruction relies mainly on Boolean networks (BoolNet).
Our experiments show that clustering with microarray time series data and the Gene Ontology together performs better than clustering with either source alone. In the subsequent reconstruction, better clustering results also lead to better reconstructed gene regulatory networks. The proposed method is therefore effective and feasible for gene clustering.
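The weighting scheme described in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not say how the twenty-one weights are spaced or how the Gene Ontology is turned into coordinates, so the evenly spaced grid from 0 to 1, the `go` feature vectors, the gene names, and the toy values below are all assumptions made for illustration. Plain k-means stands in for the partitional clustering step.

```python
import random

def weight_grid(steps=20):
    """Return steps + 1 evenly spaced weights in [0, 1] (21 by default; spacing is assumed)."""
    return [i / steps for i in range(steps + 1)]

def combine(expr, go, w):
    """Scale expression coordinates by w and GO coordinates by (1 - w), then concatenate."""
    return [w * x for x in expr] + [(1.0 - w) * g for g in go]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy data: expression profiles pair A with B and C with D,
# while the GO-derived coordinates pair A with C and B with D.
genes = ["A", "B", "C", "D"]
expr = {"A": [1.0, 2.0, 3.0], "B": [1.1, 2.1, 2.9],
        "C": [3.0, 2.0, 1.0], "D": [2.9, 2.1, 1.1]}
go = {"A": [1.0, 0.0], "B": [0.0, 1.0],
      "C": [0.9, 0.1], "D": [0.1, 0.9]}

def cluster_at(w, k=2):
    """Cluster all genes on the combined coordinates for one weight setting."""
    points = [combine(expr[g], go[g], w) for g in genes]
    return kmeans(points, k)

for w in weight_grid():
    print(f"w={w:.2f}", dict(zip(genes, cluster_at(w))))
```

With `w = 1` only the expression coordinates contribute, and with `w = 0` only the GO coordinates do, so sweeping the grid shows how the grouping shifts as the balance between the two sources changes. Each resulting partition would then be scored separately to choose a weight.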
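Table of Contents
Thesis Approval Form i
Chinese Abstract ii
English Abstract iii
1. Introduction 1
1.1 Research Background 1
1.2 Research Motivation and Objectives 2
2. Literature Review 4
2.1 Gene Clustering 4
2.1.1 k-means 4
2.1.2 fuzzy k-means 5
2.2 Gene Ontology 6
2.2.1 GO Term Similarity Measures 7
2.2.2 Gene Similarity Calculation 11
2.3 Modeling Gene Regulatory Networks 12
3. Research Method 14
3.1 Gene Clustering 14
3.2 Introducing GO to Assist Gene Clustering 14
3.3 Cluster Evaluation 16
3.4 Inferring Gene Regulatory Networks with Boolean Networks 17
3.5 Network Evaluation 18
3.6 Experimental Design 19
3.6.1 Data Preprocessing Before Clustering 19
3.6.2 Clustering and Weight Adjustment 19
3.6.3 Reconstructing the Gene Regulatory Network 19
3.7 Experimental Flowchart 20
4. Experimental Results and Discussion 22
4.1 Experimental Data 22
4.1 Clustering Results 23
4.1.1 PPI Evaluation of Fuzzy Clustering 23
4.1.2 Clustering Coefficient of Fuzzy Clustering 26
4.1.3 PPI of fuzzy c-means and k-means Clustering 29
4.1.4 Effect of the Degree of Fuzziness on Clustering 31
4.2.1 Reconstruction Results Using Higher and Lower PPI Values 33
4.2.2 Reconstruction Results After Random Clustering 37
4.2.3 Reconstruction Results with k-means 44
4.3 Discussion and Analysis 46
4.3.1 Clustering 46
4.3.2 Reconstructing the Gene Regulatory Network 47
5. Conclusions and Future Work 54
6. References 56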
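Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.
Thesis access permission: user-defined release date
Available:
Campus: available
Off-campus: available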


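Printed copies
Public availability information for printed theses is relatively complete from academic year 102 onward. To look up this information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available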
