A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences
EURASIP Journal on Bioinformatics and Systems Biology volume 2007, Article number: 43596 (2007)
The identification and analysis of repetitive patterns are active areas of biological and computational research. Tandem repeats in telomeres play a role in cancer, and hypervariable trinucleotide tandem repeats are linked to over a dozen major neurodegenerative genetic disorders. In this paper, we present an algorithm that identifies exact and inexact repeat patterns in DNA sequences using the orthogonal exactly periodic subspace decomposition technique. Using the new measure, our algorithm resolves problems such as whether a repeat pattern has period P or a multiple of it (i.e., 2P, 3P, etc.), along with several other problems present in previous signal-processing-based algorithms. The algorithm is efficient, with a complexity that depends on the length of the DNA sequence and the window length. It operates in two stages. In the first stage, each nucleotide is analyzed separately for periodicity; in the second stage, the periodic information of the individual nucleotides is combined to identify the tandem repeats. The algorithm was evaluated on datasets containing exact and inexact repeats, and the experimental results show the effectiveness of the approach.
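The two-stage idea can be illustrated with a minimal sketch. It is not the measure proposed in the paper: as a simplified stand-in, it projects each nucleotide's binary indicator sequence onto the signals of a candidate period p (by averaging samples that share the same phase) and reports the fraction of energy that the projection captures. The function names (`indicator`, `projection_energies`, `combined_score`) and the energy-weighted way of combining the four nucleotides are assumptions made for illustration only.

```python
NUCLEOTIDES = "ACGT"


def indicator(dna, nucleotide):
    """Binary indicator sequence: 1.0 where `nucleotide` occurs, else 0.0."""
    return [1.0 if base == nucleotide else 0.0 for base in dna.upper()]


def projection_energies(x, p):
    """Stage 1 (simplified stand-in, not the paper's measure).

    Returns (periodic_energy, total_energy) for the mean-removed signal x,
    where periodic_energy is the energy of its projection onto signals of
    period p. Samples beyond the last complete cycle are ignored.
    """
    usable = (len(x) // p) * p
    if usable == 0:
        return 0.0, 0.0
    x = x[:usable]
    mean = sum(x) / usable
    x = [v - mean for v in x]
    total = sum(v * v for v in x)
    cycles = usable // p
    # Average all samples that share the same phase within a cycle.
    phase_means = [sum(x[k::p]) / cycles for k in range(p)]
    periodic = cycles * sum(m * m for m in phase_means)
    return periodic, total


def combined_score(window, p):
    """Stage 2 (assumed combination rule): pool the energies of all four
    nucleotides and return the periodic fraction, a value in [0, 1]."""
    periodic_sum = 0.0
    total_sum = 0.0
    for nucleotide in NUCLEOTIDES:
        periodic, total = projection_energies(indicator(window, nucleotide), p)
        periodic_sum += periodic
        total_sum += total
    return periodic_sum / total_sum if total_sum > 0 else 0.0
```

Calling `combined_score` on a sliding window for each candidate period gives a rough periodicity profile along the sequence. Note that this plain projection scores an exact repeat equally high at every multiple of its true period, which is the ambiguity the abstract says the paper's measure is designed to resolve.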
Cite this article
Gupta, R., Sarthi, D., Mittal, A. et al. A Novel Signal Processing Measure to Identify Exact and Inexact Tandem Repeat Patterns in DNA Sequences. J Bioinform Sys Biology 2007, 43596 (2007) doi:10.1155/2007/43596
- Signal Processing
- Tandem Repeat
- System Biology
- Processing Measure
- Repeat Pattern