  • Research Article
  • Open access

A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences

Abstract

The identification and analysis of repetitive patterns are active areas of biological and computational research. Tandem repeats in telomeres play a role in cancer, and hypervariable trinucleotide tandem repeats are linked to over a dozen major neurodegenerative genetic disorders. In this paper, we present an algorithm that identifies exact and inexact repeat patterns in DNA sequences, based on the orthogonal exactly periodic subspace decomposition technique. Using the new measure, our algorithm resolves ambiguities such as whether a repeat pattern has period P or one of its multiples (i.e., 2P, 3P, etc.), along with several other problems that were present in previous signal-processing-based algorithms. The algorithm is efficient: its running time is bounded in terms of the length of the DNA sequence and the window length. It operates in two stages. In the first stage, each nucleotide is analyzed separately for periodicity; in the second stage, the periodicity information of the four nucleotides is combined to identify the tandem repeats. Datasets containing exact and inexact repeats were used in the experiments, and the experimental results show the effectiveness of the approach.
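The two-stage idea can be illustrated with a minimal sketch. It is not the measure defined in the paper: it assumes binary indicator sequences for the four nucleotides, a single window whose length is trimmed to a multiple of the candidate period, and a plain sum as the combination rule. The function names (`indicator_sequences`, `periodic_energy`, `exact_periodic_energy`, `combined_score`) are hypothetical. What it does show is how an exactly periodic decomposition separates period P from its multiples: energy already explained by a proper divisor of the period is subtracted out.

```python
import numpy as np

def indicator_sequences(dna):
    """Stage 1 input: one binary indicator sequence per nucleotide (assumed encoding)."""
    return {base: np.array([1.0 if c == base else 0.0 for c in dna])
            for base in "ACGT"}

def periodic_energy(x, p):
    """Energy of the orthogonal projection of x onto the p-periodic signals.

    Assumes len(x) is a multiple of p. The projection is obtained by
    averaging the samples that share the same position within a period.
    """
    folded = x.reshape(-1, p)          # one row per repetition of the period
    mean_period = folded.mean(axis=0)  # least-squares p-periodic fit
    return len(folded) * np.sum(mean_period ** 2)

def exact_periodic_energy(x, p):
    """Energy attributable to period p exactly, excluding proper divisors of p."""
    energy = periodic_energy(x, p)
    for d in range(1, p):
        if p % d == 0:
            energy -= exact_periodic_energy(x, d)
    return energy

def combined_score(dna, p):
    """Stage 2 (illustrative): sum the exact-period energies over A, C, G, T."""
    usable = len(dna) - len(dna) % p   # trim so the window is a multiple of p
    return sum(exact_periodic_energy(seq[:usable], p)
               for seq in indicator_sequences(dna).values())

if __name__ == "__main__":
    repeat = "ACG" * 8                 # exact tandem repeat with period 3
    for p in (2, 3, 6):
        print(p, round(combined_score(repeat, p), 6))
```

For the exact repeat `"ACG" * 8`, the score is concentrated at period 3 and vanishes (up to rounding) at period 6, even though the sequence is also 6-periodic. A practical detector would slide the window along the sequence and tolerate mismatches, which this sketch does not attempt.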

Author information

Corresponding author

Correspondence to Ravi Gupta.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Gupta, R., Sarthi, D., Mittal, A. et al. A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences. J Bioinform Sys Biology 2007, 43596 (2007). https://doi.org/10.1155/2007/43596

  • DOI: https://doi.org/10.1155/2007/43596
