Title page for etd-0113115-161317
Title
Statistical Inference of Bioinformatics, Wireless Communication and Economic Models
Department
Year, semester
Language
Degree
Number of pages
111
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2015-01-27
Date of Submission
2015-02-13
Keywords
gene expression, Fourier coefficients, cytomegalovirus, complementary palindromes, martingale difference process, replication origin, spatial correlation, V statistic, wireless communication, Gaussian intensity
Statistics
The thesis/dissertation has been browsed 5747 times and downloaded 58 times.
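Chinese Abstract
This thesis considers statistical inference in the fields of bioinformatics, economics, and wireless communication. For bioinformatics, we consider the complementary palindrome pattern in the life-threatening cytomegalovirus. To help locate the replication origin of the cytomegalovirus, we provide four statistics based on the number of complementary palindromes and derive their distributions and asymptotic distributions. Besides confirming the theory with a simulation study, we also examine the complementary palindrome behavior of the cytomegalovirus.

For economics, we propose a test statistic for the martingale difference process. Using results from the theory of V statistics, we obtain the approximate distribution of the proposed statistic. A simulation study verifies its approximate normality, as well as its size and power in finite samples.

For the wireless communication application, we consider a statistical model based on the geometric relationships among the mobile transmitter, the base station's receiving antennas, and the channel scatterers. We rigorously derive the fourth-order spatial-correlation coefficient of the uplink received signal across the receiving sensor array. The scatterers' spatial locations follow a Poisson distribution with a two-dimensional Gaussian intensity. The resulting formula depends on only a few parameters of the geometric model. The Monte Carlo method is also used to verify the theoretical results.

Finally, to help scientists find groups of genes with similar expression and change patterns, we compute the Fourier coefficients of gene expression data. We derive the joint distribution of these Fourier coefficients whether or not the gene expression data are correlated. These distributions can be used in the future to study model-based clustering methods.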
Abstract
In this thesis, we consider statistical inference for models in bioinformatics, economics, and wireless communication. For the bioinformatics application, we consider the complementary palindrome (CP) pattern in cytomegalovirus (CMV), a virus that causes life-threatening disease. To help locate the replication origin of CMV, we propose four statistics based on the number of CPs. We derive the distributions of the four statistics, together with the asymptotic distributions of some of them. A simulation study confirms the theoretical findings, and empirical results for a CMV DNA sequence are also presented.
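
As a rough illustration of the CP counts on which such statistics are based, the following Python sketch counts the positions at which a complementary palindrome of length 2k begins. It assumes the common definition of a CP as a DNA segment that equals its own reverse complement (for example, GAATTC). The names `is_complementary_palindrome` and `count_cps` are hypothetical, and the code does not reproduce any of the four statistics proposed in the thesis.

```python
# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_complementary_palindrome(segment: str) -> bool:
    """Return True if the segment equals its reverse complement."""
    if len(segment) % 2 != 0:
        return False
    # Compare each base's complement with the base at the mirrored position.
    return all(COMPLEMENT.get(a) == b for a, b in zip(segment, reversed(segment)))


def count_cps(sequence: str, k: int) -> int:
    """Count the start positions of CPs of length 2k in the sequence."""
    width = 2 * k
    return sum(
        is_complementary_palindrome(sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )
```

Counting within sliding windows along the genome, rather than over the whole sequence, would be one way to look for regions that are unusually rich in CPs; that step is left out here because the thesis's window scheme is not stated in this record.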

For economic models, we propose a novel test statistic Tn for testing the martingale difference hypothesis. The asymptotic distribution of Tn is derived using the theory of V statistics. A simulation study validates the asymptotic normality of Tn, and the simulated sizes and powers of the test are also presented.

For the wireless communication application, we consider a model based on idealized geometric relationships among the mobile transmitter, the base station's receiving antennas, and the scatterers of the channel. We rigorously derive the fourth-order spatial-correlation coefficient of the uplink received signal across the aperture of a receiving sensor array. The scatterers' spatial locations are modeled as Poisson distributed, with a Gaussian intensity over a two-dimensional space. The final formula is expressed explicitly in terms of the few independent parameters of this simple geometric model. Monte Carlo simulations support the theoretical results.
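
The scatterer model can be sketched in Python as follows. This is a minimal sketch that assumes an isotropic Gaussian intensity: a Poisson point process whose intensity is a constant times a Gaussian density can be simulated by drawing a Poisson count and then placing that many points independently from the Gaussian. The function name `sample_scatterers` and the parameters `mean_count`, `center`, and `sigma` are hypothetical, and the correlation-coefficient formula itself is not reproduced here.

```python
import random


def sample_scatterers(mean_count, center, sigma, rng):
    """Draw one realization of a 2-D Poisson point process whose intensity
    is mean_count times an isotropic Gaussian density around `center`."""
    # Poisson(mean_count) count: the number of unit-rate exponential
    # arrivals that fall in the interval [0, mean_count).
    n = 0
    t = rng.expovariate(1.0)
    while t < mean_count:
        n += 1
        t += rng.expovariate(1.0)
    # Given the count, locations are i.i.d. draws from the Gaussian density.
    return [
        (rng.gauss(center[0], sigma), rng.gauss(center[1], sigma))
        for _ in range(n)
    ]


rng = random.Random(2015)  # fixed seed so the run is reproducible
scatterers = sample_scatterers(mean_count=20.0, center=(0.0, 0.0), sigma=1.0, rng=rng)
```

A Monte Carlo check of a derived formula would repeat such draws many times, compute the received signals for each realization, and compare the sample correlations with the closed-form expression.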

Finally, to help identify genes with similar expression and change patterns, we compute the Fourier coefficients (FCs) of gene expression data. The joint distributions of these FCs are derived for both correlated and uncorrelated gene expression data. These distributions can serve as a basis for model-based clustering analysis in future work.
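
To make the idea of Fourier coefficients of an expression profile concrete, the sketch below computes cosine and sine coefficients of a time-course profile in Python. It assumes equally spaced time points and the standard discrete definition with a 2/n scaling; the thesis's exact definition and normalization may differ, and the name `fourier_coefficients` is hypothetical.

```python
import math


def fourier_coefficients(profile, max_order):
    """Return [(a_1, b_1), ..., (a_K, b_K)] for an equally spaced profile,
    where a_k and b_k are the cosine and sine coefficients of order k."""
    n = len(profile)
    coeffs = []
    for k in range(1, max_order + 1):
        a_k = (2.0 / n) * sum(
            x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(profile)
        )
        b_k = (2.0 / n) * sum(
            x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(profile)
        )
        coeffs.append((a_k, b_k))
    return coeffs
```

Genes whose profiles rise and fall together have similar low-order coefficients, which is why such coefficients can be used as features for clustering.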
Table of Contents
Thesis Approval Form

Acknowledgements

Chinese Abstract

Abstract

1 Introduction
Reference

2 Complementary Palindrome in CMV Data
1 Introduction
2 Notations and Definitions
3 Four Test Statistics
3.1 I2k;t Statistics
3.2 Test Statistic I 2k,t
3.3 Quadratic Test Statistic T
3.4 Ratio Statistic Rk
4 Conclusion
5 Appendix
5.1 Proofs
5.2 Figure
5.3 Miscellaneous
Reference

3 Martingale Difference Test
1 Introduction
2 The Proposed Test Statistic vs the V Statistic
3 Simulation Study and Conclusion
3.1 Simulation Study
3.2 Conclusion
4 Appendix
Reference

4 Higher-Order Correlations Across the Uplink Receiver's Spatial Aperture
1 Introduction
2 The Proposed Geometric Model
3 Third-Order Spatial Correlation-Coefficient Functions
3.1 Derivation
3.2 Sufficient Conditions for Zero Third-Order Spatial Correlation-Coefficient Functions
4 Fourth-Order Correlation-Coefficient Function
4.1 Derivation
4.2 The Special Case of a Uniform Linear Array of Identical Isotropic Sensors
5 Monte Carlo Verification
6 Conclusion
7 Appendix
7.1 Proof
7.2 Figure
Reference

5 Fourier Coefficients in Gene Expression Data
1 Introduction
2 Models
3 The Joint Distribution of the Fourier Coefficients
3.1 Independent Noise Sequence
3.2 Dependent Noise Sequence
4 Conclusion
5 Appendix
5.1 Proof of Lemma 5.1
5.2 Proof of Lemma 5.4
5.3 Proof of Lemma 5.5
Reference
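Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
Thesis access permission: user-defined release date
Available:
Campus: available
Off-campus: available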


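Printed copies
Public-availability information for printed theses is relatively complete from academic year 102 onward. To look up public-availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available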
