Abstract and Applied Analysis

Analysis of Similarity/Dissimilarity of DNA Sequences Based on Chaos Game Representation

Wei Deng and Yihui Luan

Full-text: Open access

Abstract

The Chaos Game is an algorithm that produces pictures of fractal structures. Since the four bases A, G, C, and T of DNA sequences can be divided into three classes according to their chemical structure, we propose different kinds of CGR-walk sequences. Based on the CGR coordinates of random sequences, we introduce some invariants for DNA primary sequences. As an application, we examine the similarity/dissimilarity among the first exons of the β-globin gene of different species. The results indicate that our method is efficient and can extract more biological information.
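
As an illustration of the Chaos Game Representation (CGR) coordinates mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used convention in which the four bases sit at the corners of the unit square and each CGR point is the midpoint between the previous point and the corner of the current base, starting from the centre of the square. The function name `cgr_coordinates` and the corner assignment are assumptions made for this example; the abstract does not specify them, and the CGR-walk sequences and invariants proposed in the paper are not reproduced here.

```python
from typing import Dict, List, Tuple

# Assumed corner assignment (a widely used CGR convention); the abstract
# itself does not state which corner each base occupies.
CORNERS: Dict[str, Tuple[float, float]] = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_coordinates(sequence: str) -> List[Tuple[float, float]]:
    """Return one CGR point per base of a DNA sequence.

    Each point is the midpoint between the previous point and the corner
    assigned to the current base; the walk starts at the centre (0.5, 0.5).
    """
    x, y = 0.5, 0.5  # starting point: centre of the unit square
    points: List[Tuple[float, float]] = []
    for base in sequence.upper():
        if base not in CORNERS:
            raise ValueError(f"unexpected symbol in DNA sequence: {base!r}")
        cx, cy = CORNERS[base]
        # Move halfway towards the corner of the current base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_coordinates("ACGT"):
        print(point)
```

Under these assumptions, each point encodes the whole prefix of the sequence read so far, which is why coordinates of this kind can serve as a starting point for numerical descriptors of a DNA primary sequence.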

Article information

Source
Abstr. Appl. Anal., Volume 2013, Special Issue (2013), Article ID 926519, 6 pages.

Dates
First available in Project Euclid: 26 February 2014

Permanent link to this document
https://projecteuclid.org/euclid.aaa/1393449790

Digital Object Identifier
doi:10.1155/2013/926519

Zentralblatt MATH identifier
1272.92041

Citation

Deng, Wei; Luan, Yihui. Analysis of Similarity/Dissimilarity of DNA Sequences Based on Chaos Game Representation. Abstr. Appl. Anal. 2013, Special Issue (2013), Article ID 926519, 6 pages. doi:10.1155/2013/926519. https://projecteuclid.org/euclid.aaa/1393449790


