Human cytomegalovirus (CMV) is one of the most widespread viruses in the world. To grow and reproduce, CMV invades specific host cells and alters their behavior. The origin of replication (also called the replication origin) is a particular sequence in the CMV DNA genome at which replication is initiated. In this study, we develop statistical tests based on complementary palindromes, which can be applied to narrow the search for the replication origin in the CMV DNA sequence.
Let X_(2k) be the number of complementary palindromes of length 2k and Y_(2k) the number of non-covered complementary palindromes of length 2k in a given DNA sequence. Consider the null hypothesis that the marginal probabilities of the four nucleotides are all equal (1/4) over the given sequence, against the alternative hypothesis that the marginal probabilities differ. A likelihood ratio test based on the joint distribution of Y_(18) and Y_(2k) | (Y_(2(k+1)), ..., Y_(18)), where k=1, ..., 8, under the null and the alternative hypotheses is derived. The null distribution of the test statistic is approximated by a scaled chi-squared distribution, with the scale parameter and the degrees of freedom estimated by the method of moments. A Pearson's chi-squared test based on the marginal distributions of X_(2k), where k=1, ..., 9, is also considered; the null distribution of its test statistic is likewise approximated by a scaled chi-squared distribution. We also study the ratio statistics X_(2k)/X_(2(k+1)) and Y_(2k)/Y_(2(k+1)), which approximate a specific value under the null hypothesis. Simulation studies are performed to confirm the theoretical findings.
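
To make the counts concrete, the following is a minimal Python sketch, not the implementation used in the study. It rests on two assumptions that the text above does not spell out: a complementary palindrome is taken to be a segment equal to its own reverse complement, and a "non-covered" palindrome of length 2k is taken to be one that cannot be extended to length 2(k+1) around the same center (palindromes that reach the cap of length 18 are all counted in Y_(18)). The function names `palindrome_counts` and `scaled_chi2_moments` are hypothetical. The second function illustrates one common form of the method of moments for a scaled chi-squared approximation c·χ²_d, matching the mean c·d and the variance 2c²·d of simulated null statistics; whether the study matches exactly these two moments is also an assumption.

```python
import random
import statistics

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def max_half_length(seq, center, cap):
    """Largest k <= cap such that seq[center-k:center+k] equals its reverse complement.

    `center` is the gap between seq[center-1] and seq[center].
    """
    k = 0
    while (k < cap
           and center - k - 1 >= 0
           and center + k < len(seq)
           and COMPLEMENT.get(seq[center - k - 1]) == seq[center + k]):
        k += 1
    return k


def palindrome_counts(seq, max_k=9):
    """Return (x, y), dictionaries keyed by palindrome length 2k, k = 1..max_k.

    x[2k]: number of complementary palindromes of length 2k.
    y[2k]: number of palindromes of length 2k that are not extendable to
           length 2(k+1) around the same center (assumed meaning of
           "non-covered"); y[2*max_k] also absorbs longer palindromes.
    """
    x = {2 * k: 0 for k in range(1, max_k + 1)}
    y = {2 * k: 0 for k in range(1, max_k + 1)}
    for center in range(1, len(seq)):
        m = max_half_length(seq, center, max_k)
        # Every shorter palindrome around the same center is also counted in x.
        for k in range(1, m + 1):
            x[2 * k] += 1
        if m >= 1:
            y[2 * m] += 1
    return x, y


def scaled_chi2_moments(samples):
    """Moment-matching estimates (scale c, degrees of freedom d) for c * chi2_d.

    Solves mean = c * d and variance = 2 * c**2 * d.
    """
    mean = statistics.mean(samples)
    var = statistics.variance(samples)  # sample variance
    return var / (2 * mean), 2 * mean ** 2 / var


if __name__ == "__main__":
    # Simulate a sequence under the null: independent nucleotides, each with probability 1/4.
    rng = random.Random(0)
    seq = "".join(rng.choices("ACGT", k=200_000))
    x, y = palindrome_counts(seq)
    print("X_2 / X_4 =", x[2] / x[4])
```

Under the uniform independent model used in this sketch, each additional flanking pair is complementary with probability 1/4, so for a long sequence the expected counts satisfy E[X_(2k)] / E[X_(2(k+1))] ≈ 4. This is an elementary calculation under the sketch's assumptions and is not necessarily the specific value referred to above.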