A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences

Abstract

The identification and analysis of repetitive patterns are active areas of biological and computational research. Tandem repeats in telomeres play a role in cancer, and hypervariable trinucleotide tandem repeats are linked to over a dozen major neurodegenerative genetic disorders. In this paper, we present an algorithm that identifies exact and inexact repeat patterns in DNA sequences based on the orthogonal exactly periodic subspace decomposition technique. Using a new measure, the algorithm resolves problems such as whether a repeat pattern has period P or one of its multiples (i.e., 2P, 3P, etc.), along with several other problems present in previous signal-processing-based algorithms. We present an efficient algorithm for identifying repeats whose complexity is stated in terms of the length of the DNA sequence and the window length. The algorithm operates in two stages. In the first stage, each nucleotide is analyzed separately for periodicity; in the second stage, the periodicity information of the individual nucleotides is combined to identify the tandem repeats. Datasets containing exact and inexact repeats were used in the experiments. The experimental results show the effectiveness of the approach.
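The two-stage structure described above can be illustrated with a minimal Python sketch. It is not the paper's measure: the abstract does not define that measure, so the sketch substitutes a plain periodic projection (averaging samples that share the same phase), which is related to but simpler than the exactly periodic subspace decomposition of reference 14. All function names, the threshold, and the rule of reporting the smallest period that passes the threshold are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

NUCLEOTIDES = "ACGT"


def indicator_sequences(dna: str) -> Dict[str, List[int]]:
    """Map a DNA string to one binary indicator sequence per nucleotide."""
    dna = dna.upper()
    return {n: [1 if c == n else 0 for c in dna] for n in NUCLEOTIDES}


def projected_energy(x: List[int], period: int) -> float:
    """Stage 1 (assumed stand-in): energy of x after projecting it onto
    signals of the given period, i.e. replacing every sample by the mean
    of the samples that share its phase (index mod period)."""
    energy = 0.0
    for phase in range(period):
        samples = x[phase::period]
        if samples:
            mean = sum(samples) / len(samples)
            energy += len(samples) * mean * mean
    return energy


def combined_score(seqs: Dict[str, List[int]], period: int) -> float:
    """Stage 2 (assumed combination rule): fraction of the total energy of
    all four indicator sequences that the period-`period` projection keeps."""
    captured = sum(projected_energy(x, period) for x in seqs.values())
    total = sum(sum(x) for x in seqs.values())  # binary signals: energy == count
    return captured / total if total else 0.0


def scan(dna: str, window: int, max_period: int,
         threshold: float = 0.8) -> List[Tuple[int, int, float]]:
    """Slide a window over the sequence and report (start, period, score)
    for the smallest period whose combined score reaches the threshold.
    Keep max_period well below the window length: when each phase holds
    only one or two samples the score is trivially high."""
    hits = []
    for start in range(len(dna) - window + 1):
        seqs = indicator_sequences(dna[start:start + window])
        for period in range(1, max_period + 1):
            score = combined_score(seqs, period)
            if score >= threshold:
                hits.append((start, period, score))
                break
    return hits


if __name__ == "__main__":
    for hit in scan("ACG" * 6, window=12, max_period=6):
        print(hit)
```

In this sketch an exact repeat scores 1.0 at its period, while a window with one substitution, such as ACGACGACTACG at period 3, scores 0.875, so inexact repeats can still pass a threshold below 1. Any multiple of the true period scores at least as high as the period itself, which is the ambiguity the abstract says the proposed measure resolves; picking the smallest passing period here is only a naive workaround.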

[1-20]

References

  1. Hahn WC: Telomerase and cancer: where and when? Clinical Cancer Research 2001, 7(10):2953-2954.

  2. Sinden RR, Potaman VN, Oussatcheva EA, Pearson CE, Lyubchenko YL, Shlyakhtenko LS: Triplet repeat DNA structures and human genetic disease: dynamic mutations from dynamic DNA. Journal of Biosciences 2002, 27(1, supplement 1):53-65. 10.1007/BF02703683

  3. Siyanova EY, Mirkin SM: Expansion of trinucleotide repeats. Molecular Biology 2001, 35(2):168-182. 10.1023/A:1010431232481

  4. Tamaki K, Jeffreys AJ: Human tandem repeat sequences in forensic DNA typing. Legal Medicine 2005, 7(4):244-250. 10.1016/j.legalmed.2005.02.002

  5. Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 1999, 27(2):573-580. 10.1093/nar/27.2.573

  6. Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R: REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Research 2001, 29(22):4633-4642. 10.1093/nar/29.22.4633

  7. Kolpakov R, Bana G, Kucherov G: mreps: efficient and flexible detection of tandem repeats in DNA. Nucleic Acids Research 2003, 31(13):3672-3678. 10.1093/nar/gkg617

  8. Landau GM, Schmidt JP, Sokol D: An algorithm for approximate tandem repeats. Journal of Computational Biology 2001, 8(1):1-18. 10.1089/106652701300099038

  9. Adebiyi EF, Jiang T, Kaufmann M: An efficient algorithm for finding short approximate non-tandem repeats. Bioinformatics 2001, 17(1):S5-S12. 10.1093/bioinformatics/17.suppl_1.S5

  10. Hauth AM, Joseph DA: Beyond tandem repeats: complex pattern structures and distant regions of similarity. Bioinformatics 2002, 18(1):S31-S37. 10.1093/bioinformatics/18.suppl_1.S31

  11. Sharma D, Issac B, Raghava GPS, Ramaswamy R: Spectral repeat finders (SRF): identification of repetitive sequences using Fourier transformation. Bioinformatics 2004, 20(9):1405-1412. 10.1093/bioinformatics/bth103

  12. Tran TT, Emanuele VA II, Zhou GT: Techniques for detecting approximate tandem repeats in DNA. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '04), Montreal, Quebec, Canada, May 2004, 5:449-452.

  13. Buchner M, Janjarasjitt S: Detection and visualization of tandem repeats in DNA sequences. IEEE Transactions on Signal Processing 2003, 51(9):2280-2287. 10.1109/TSP.2003.815396

  14. Muresan DD, Parks TW: Orthogonal, exactly periodic subspace decomposition. IEEE Transactions on Signal Processing 2003, 51(9):2270-2279. 10.1109/TSP.2003.815381

  15. Anastassiou D: Genomic signal processing. IEEE Signal Processing Magazine 2001, 18(4):8-20. 10.1109/79.939833

  16. Tiwari S, Ramachandran S, Bhattacharya A, Bhattacharya S, Ramaswamy R: Prediction of probable genes by Fourier analysis of genomic sequences. Computer Applications in the Biosciences 1997, 13(3):263-270.

  17. Otten AD, Tapscott SJ: Triplet repeat expansion in myotonic dystrophy alters the adjacent chromatin structure. Proceedings of the National Academy of Sciences of the United States of America 1995, 92(12):5465-5469. 10.1073/pnas.92.12.5465

  18. Benson G: Tandem Repeat Finder. http://tandem.bu.edu/trf/trf.html

  19. Hauth AM: Identification of tandem repeats simple and complex pattern structures in DNA, Ph.D. dissertation.

  20. Bussey H, Kaback DB, Zhong W, et al.: The nucleotide sequence of chromosome I from Saccharomyces cerevisiae. Proceedings of the National Academy of Sciences of the United States of America 1995, 92(9):3809-3813. 10.1073/pnas.92.9.3809

Author information

Correspondence to Ravi Gupta.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Gupta, R., Sarthi, D., Mittal, A. et al. A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences. J Bioinform Sys Biology 2007, 43596 (2007) doi:10.1155/2007/43596

Keywords

  • Signal Processing
  • Tandem Repeat
  • System Biology
  • Processing Measure
  • Repeat Pattern