 Research Article
 Open Access
A Bayesian Analysis for Identifying DNA Copy Number Variations Using a Compound Poisson Process
 Jie Chen^{1},
 Ayten Yiğiter^{2},
 YuPing Wang^{3} and
 HongWen Deng^{4}
https://doi.org/10.1155/2010/268513
© Jie Chen et al. 2010
 Received: 3 May 2010
 Accepted: 6 August 2010
 Published: 17 August 2010
Abstract
To study chromosomal aberrations that may lead to cancer formation or genetic diseases, the array-based Comparative Genomic Hybridization (aCGH) technique is often used for detecting DNA copy number variants (CNVs). Various methods have been developed for extracting CNV information from aCGH data. However, most of these methods use only the log-intensity ratios in aCGH data and do not take advantage of other information in the data, such as the DNA probe (e.g., biomarker) positions and distances. Motivated by the specific features of aCGH data, we developed a novel method that estimates a change point, or locus of the CNV, in aCGH data together with its associated biomarker position on the chromosome, using a compound Poisson process. We used a Bayesian approach to derive the posterior probability for the estimation of the CNV locus. To detect loci of multiple CNVs in the data, we propose a sliding window process combined with the derived Bayesian posterior probability. To evaluate the performance of the method in estimating the CNV locus, we first performed simulation studies. Finally, we applied our approach to real data from aCGH experiments, demonstrating its applicability.
Keywords
 Fibroblast Cell Line
 Compound Poisson Process
 Homogeneous Poisson Process
 Slide Window Approach
 Circular Binary Segmentation
1. Introduction
Cancer progression, tumor formation, and many genetic diseases are related to aberrations in some chromosomal regions. Chromosomal aberrations are often reflected in DNA copy number changes, also known as copy number variations (CNVs) [1]. To study such chromosomal aberrations, experiments are often conducted on tumor samples from a cell line using technologies such as aCGH or SNP arrays. For instance, in aCGH experiments, a DNA test sample and a diploid reference sample are first fluorescently labeled by Cy3 and Cy5. Then, the samples are mixed and hybridized to the microarray. Finally, the image intensities from the test and reference samples can be obtained for all DNA probes (biomarkers) along the chromosome [2, 3]. The log base 2 ratios of the test and reference intensities, usually denoted as $\log_2 T/R$, are used to generate an aCGH profile [4]. To reduce noise, the Gaussian-smoothed profile is often used. With an appropriate normalization process, $\log_2 T/R$ is viewed as following a Gaussian distribution with mean 0 and variance $\sigma^2$ [4, 5]. A deviation from mean 0 and variance $\sigma^2$ in the data may indicate a copy number change. Therefore, detecting DNA copy number changes becomes the problem of identifying significant parameter changes that occur in the sequence of observations.
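As a small illustration of the quantity described above, the following sketch computes log base 2 ratios from paired test and reference intensities. The function name and the intensity values are hypothetical and chosen only for illustration; no normalization or smoothing is applied.

```python
import math

def log2_ratios(test, reference):
    """Return log2(test / reference) for paired, strictly positive intensities."""
    if len(test) != len(reference):
        raise ValueError("intensity lists must have equal length")
    return [math.log2(t / r) for t, r in zip(test, reference)]

# Hypothetical intensities: the last two probes have a doubled test signal,
# which shows up as a shift of the log ratio away from 0.
test_intensity = [100.0, 100.0, 200.0, 200.0]
reference_intensity = [100.0, 100.0, 100.0, 100.0]
profile = log2_ratios(test_intensity, reference_intensity)
```

A profile that stays near 0 suggests no copy number change, while a sustained shift in the mean, as in the last two probes here, is the kind of parameter change the detection problem targets.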
A number of computational and statistical methods have been developed for the detection of CNVs based on aCGH data and SNP data. Examples include a finite Gaussian mixture model [6], pairwise t-tests [7], adaptive weights smoothing [8], circular binary segmentation (CBS) [4], hidden Markov modeling (HMM) [9], maximum likelihood estimation [10], and many others. A comparison of several of these methods for the analysis of aCGH data was given by Lai et al. [11]. Efforts to develop methods for accurate detection of CNVs continue. Nannya et al. [12] developed a robust algorithm for copy number analysis of the human genome using high-density oligonucleotide microarrays. Price et al. [13] adapted the Smith-Waterman dynamic programming algorithm to provide a sensitive and robust approach (SWARRAY). More recently, Shah et al. [14] proposed a simple modification to the hidden Markov model (HMM) to make it robust to outliers in aCGH data. Yu et al. [15] developed an edge detection algorithm for copy number analysis in SNP data. An algorithm called reversible jump aCGH (RJaCGH) for identifying copy number alterations was introduced by Rueda and Díaz-Uriarte [16]. The RJaCGH algorithm is based on a nonhomogeneous HMM fitted by reversible jump MCMC using a Bayesian approach. Pique-Regi et al. [17] proposed using piecewise constant (PWC) vectors to represent genome copy number and used sparse Bayesian learning (SBL) to detect the breakpoints of copy number alterations. Rancoita et al. [18] provided an improved Bayesian regression method for data that are noisy observations of a piecewise constant function and used this method for CNV analysis. We have formulated the problem as one of statistical change-point detection [19] and proposed a mean and variance change-point model (MVCM), which brought significant improvement over many existing methods such as the CBS proposed by Olshen et al. [4].
The above-mentioned algorithms, however, have not taken advantage of other information such as the positions of the DNA probes or biomarkers along the chromosome. Recently, many researchers have begun to consider variations in the distance between biomarkers, gene density, and genomic features when identifying chromosomal regions of increased or decreased gene expression [5]. Several notable methods have emerged along this line, and we list a few of them here. Levin et al. [5] developed a scan statistic for detecting spatial clusters of genes on a chromosome based on gene positions and gene expression data, modeled by a compound Poisson process built on two independent simple Poisson processes. Daruwala et al. [20] developed a statistical algorithm for the detection of genomic aberrations in human cancer cell lines, in which the location of aberrations in the copy numbers was modeled by a Poisson process. They distinguished genes as "regular" and "deviated", where the regular genes are those that have not been affected by chromosomal aberrations, while the deviated genes are those whose log-transformed expression follows a Gaussian distribution with unknown mean and variance [20]. Sun et al. [21] developed a SNP association scan statistic similar to that of Levin et al. [5] using a compound Poisson process, which considers the complex distribution of genome variations in chromosomal regions with significant clusters of SNP associations.
The more sophisticated modeling described above, which uses both the log-intensity ratios and the biomarker positions, has improved the analysis of aCGH data. However, the computation involved in this type of modeling is usually demanding, and further improvement is needed. Motivated by these existing works, we propose a compound Poisson process approach to model the genomic features in identifying chromosomal aberrations. We use a Bayesian approach to determine an aberration (or a change point) in the aCGH profile modeled by a compound Poisson process. In our model, the occurrences of the biomarkers are modeled by a homogeneous Poisson process and the aCGH log ratios are modeled by a Gaussian distribution. This novel method is able to identify the aberration corresponding to the CNVs together with the associated distance between biomarkers on the chromosome. The proposed method is inspired by the scan statistic [5, 21], which is widely used for identifying chromosomal aberrations. However, our method differs from the work of Levin et al. [5] in that it uses a statistical change-point model with a compound Poisson process for the identification of CNVs.
2. Methods
2.1. Modeling aCGH Data Using a Compound Poisson Change Point Model
where $\Gamma(\cdot)$ is the gamma function, and $\Gamma(n) = (n-1)!$ for a positive integer $n$.
Note that the fact that the distances between adjacent biomarkers are i.i.d. exponential random variables can be used to verify the assumption that the biomarker occurrences follow a simple (homogeneous) Poisson process.
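One informal way to look at this assumption is sketched below, under the assumption that marker positions are available as a list of numbers. For exponentially distributed gaps the standard deviation equals the mean, so the coefficient of variation of the gaps should be close to 1. This is only a rough diagnostic and not a formal goodness-of-fit test; the function name, the seed, and the simulated positions are hypothetical.

```python
import math
import random

def gap_summary(positions):
    """Return (mean gap, coefficient of variation) of the inter-marker gaps."""
    ordered = sorted(positions)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)
    variance = sum((g - mean_gap) ** 2 for g in gaps) / (len(gaps) - 1)
    return mean_gap, math.sqrt(variance) / mean_gap

# Simulate marker positions from a homogeneous Poisson process by
# accumulating exponential gaps (rate value chosen for illustration).
rng = random.Random(0)
rate = 0.0001
positions, current = [], 0.0
for _ in range(5000):
    current += rng.expovariate(rate)
    positions.append(current)

mean_gap, cv = gap_summary(positions)
```

For real marker positions, a coefficient of variation far from 1 would be a warning sign that the homogeneous Poisson assumption is questionable for that region.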
Given that is a homogeneous Poisson process and follow independent Gaussian (normal) distributions [5] with mean and variance , , is then defined by a compound Poisson process, where the , are independently and normally distributed with mean and variance , respectively. The number, , of biomarkers in each subinterval of length is distributed as a Poisson distribution with parameter (where represents the occurrence rate of biomarkers or SNPs corresponding to subinterval ) for .
where , and are unknown means, is unknown variance of the normal distribution, and , , and are unknown mean rates of biomarker occurrences in each subinterval. The goal of the study becomes to estimate the value of .
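To make the data-generating model more concrete, here is a minimal simulation sketch. It assumes that both the marker rate and the Gaussian mean switch at the same subinterval, and that the standard deviation stays fixed; the function names, the post-change mean of 0.5, the subinterval length, and the seed are all hypothetical choices, not values taken from the paper.

```python
import random

def simulate_subinterval(rng, length, rate, mean, sd):
    """Log ratios of the markers that fall in one subinterval of given length."""
    values = []
    position = rng.expovariate(rate)          # position of the first marker
    while position < length:
        values.append(rng.gauss(mean, sd))    # Gaussian mark attached to the marker
        position += rng.expovariate(rate)     # exponential gap to the next marker
    return values

def simulate_profile(rng, n_sub, change_at, length, rate1, rate2, mean1, mean2, sd):
    """Subintervals 1..change_at use (rate1, mean1); later ones use (rate2, mean2)."""
    profile = []
    for k in range(1, n_sub + 1):
        before = k <= change_at
        profile.append(simulate_subinterval(
            rng, length,
            rate1 if before else rate2,
            mean1 if before else mean2,
            sd))
    return profile

rng = random.Random(1)
profile = simulate_profile(rng, n_sub=20, change_at=10, length=100_000.0,
                           rate1=0.0001, rate2=0.0005,
                           mean1=0.0, mean2=0.5, sd=0.05)
counts = [len(values) for values in profile]  # number of markers per subinterval
```

The list `counts` plays the role of the per-subinterval biomarker counts, and the nested lists hold the corresponding log ratios.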
2.2. A Bayesian Analysis for Locating the Change Point
Based on the above theoretical results, we provide the computational implementation of our approach in the next subsection.
2.3. Computational Implementation of the Bayesian Approach
To apply the above Bayesian approach to real data, the number of subintervals must be defined first. Our numerical experiments show that the number of subintervals can be chosen such that each subinterval includes at least one observation (log ratio) and at most 300 observations. The subintervals can be chosen to have equal lengths (in this case, the numbers of biomarkers contained in each subinterval are not equal). An easier option is to choose the length of each subinterval so that every subinterval contains the same number of observations. From a practical point of view, the number of subintervals and the size of each subinterval can also be defined by users according to their prior knowledge about their data.
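The two ways of choosing subintervals can be sketched as follows. This is an illustrative sketch only: the function names, the `origin` parameter, and the example positions (in kb) are hypothetical, and the authors' Matlab implementation may handle boundaries differently.

```python
def equal_length_counts(positions, n_sub):
    """Count markers in n_sub equal-length subintervals spanning the positions."""
    start, end = min(positions), max(positions)
    width = (end - start) / n_sub
    counts = [0] * n_sub
    for p in positions:
        # The right-most marker sits on the upper boundary, so clamp it
        # into the final subinterval; a zero width puts everything in bin 0.
        k = 0 if width == 0 else min(int((p - start) / width), n_sub - 1)
        counts[k] += 1
    return counts

def equal_count_lengths(positions, per_sub, origin=0.0):
    """Lengths of consecutive subintervals that each hold per_sub markers.

    Each subinterval ends at its last marker; the final one may hold fewer.
    """
    ordered = sorted(positions)
    lengths, previous = [], origin
    for i in range(0, len(ordered), per_sub):
        boundary = ordered[i:i + per_sub][-1]
        lengths.append(boundary - previous)
        previous = boundary
    return lengths

positions_kb = [10, 20, 30, 40, 50, 100, 150, 200]
counts = equal_length_counts(positions_kb, n_sub=2)
lengths = equal_count_lengths(positions_kb, per_sub=4)
```

With equal lengths the counts vary across subintervals, whereas with equal counts the lengths vary, which is the trade-off described in the paragraph above.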
Although our approach was derived for the single change-point model in a compound Poisson process, it can be easily extended to multiple change points (or aberrations) by using a sliding window approach [21, 22]. Sun et al. [21] took sliding window sizes of 3 to 10 consecutive markers in their application. Our numerical experiments suggest that sliding windows of sizes ranging from 12 to 35 subintervals should be effective in searching for multiple changes in aCGH data with the proposed Bayesian approach. To avoid edge problems at the boundaries between windows, adjacent windows have to overlap. Many such issues were also discussed in [22]. When searching for multiple change points with the sliding window approach, a practical question is how to set the threshold for the maximum posterior probabilities associated with the windows. In our application, we used the heuristic threshold of 0.5 (a common choice in a probabilistic sense) for the maximum posterior probabilities.
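A minimal sketch of how overlapping windows over the subintervals might be generated is given below. The fixed window size and the amount of overlap are assumptions made for illustration; the text only requires that adjacent windows overlap and that sizes lie roughly between 12 and 35 subintervals.

```python
def sliding_windows(n_sub, size, overlap):
    """Return (start, end) index pairs, end exclusive, covering 0..n_sub.

    Consecutive windows share `overlap` subintervals; the last window is
    truncated at n_sub if necessary.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    windows, start = [], 0
    while True:
        end = min(start + size, n_sub)
        windows.append((start, end))
        if end == n_sub:
            break
        start += step
    return windows

# 50 subintervals, windows of 20 subintervals, 5 shared with the neighbor.
windows = sliding_windows(n_sub=50, size=20, overlap=5)
```

Each window would then be analyzed with the single change-point posterior, and the maximum posterior probability in that window compared against the chosen threshold.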
(1) If it is known that a chromosome has potentially one aberration region, calculate the posterior probability (19) and identify the locus according to (20).
(2) If there are multiple aberration regions on a chromosome or genome, choose a total of J sliding windows with sizes ranging from 12 to 35 such that each window contains exactly one potential aberration. Denote these windows by , ,..., , where equals the total number of observations on the chromosome.
(3) For each window, determine the number of subintervals and their lengths.
(4) Count the number of biomarkers in each subinterval.
(5) Compute the posterior probabilities using (19) and find the maximum of the posterior probability distribution. If the maximum posterior probability is larger than 0.5 (or larger than a threshold selected according to practice), identify the change locus according to (20).
(6) Convert the identified change position into the actual biomarker position, and declare that position as the position on the chromosome at which the copy number has changed.
(7) Repeat steps 3–6 above for the remaining windows, where J is determined by the final window size, and the final window size is the value at which the posterior probabilities stabilize.
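The thresholding and position-conversion logic of steps 5 and 6 can be sketched as below. Because the posterior (19) is not reproduced in this text, the sketch substitutes a simple stand-in: a flat prior on the change location and a plug-in Gaussian mean-shift likelihood with known standard deviation. This stand-in is an assumption for illustration only and is not the Bayesian-CPCM posterior; all function names and data values are hypothetical.

```python
import math

def stand_in_posterior(values, sd):
    """Normalized weights for each split v = 1..n-1 (needs at least two values).

    NOT equation (19): a flat prior on v is assumed and the segment means
    are replaced by sample means, purely for illustration.
    """
    n = len(values)
    log_weights = []
    for v in range(1, n):
        left, right = values[:v], values[v:]
        m1, m2 = sum(left) / v, sum(right) / (n - v)
        sse = sum((x - m1) ** 2 for x in left) + sum((x - m2) ** 2 for x in right)
        log_weights.append(-sse / (2.0 * sd ** 2))
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]  # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]

def detect_change(values, positions, sd, threshold=0.5):
    """Return (position, probability) of the most probable change, or None."""
    posterior = stand_in_posterior(values, sd)
    best = max(range(len(posterior)), key=posterior.__getitem__)
    if posterior[best] <= threshold:
        return None
    # Index `best` is the split after values[best]; report that marker's position.
    return positions[best], posterior[best]

values = [0.0, 0.02, -0.01, 0.01, 0.5, 0.52, 0.49, 0.51]
positions = [10, 20, 30, 40, 50, 60, 70, 80]
found = detect_change(values, positions, sd=0.05)
flat = detect_change([0.0] * 8, positions, sd=0.05)
```

In a multi-window search, `detect_change` would be called once per window, and windows whose maximum probability does not exceed the threshold would report no change.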
The Matlab code for the Bayesian-CPCM approach has been written and is available from the authors upon request.
3. Results
3.1. Simulation Results
Table 1: Simulation results. In this table, μ_{1} = 0, λ_{1} = .0001, λ_{2} = .0005, δ = μ_{1}, λ = λ_{1}, and σ = .05. Here n is the sample size, v is the true change location, "mean est." is the average of the estimated locations, f is the reported frequency, and MSE is the mean squared error.
       When (left block)                     When (right block)
n      v    mean est.  f       MSE           v    mean est.  f       MSE
12     3    2.8870     0.8210  0.4034        3    2.8960     0.8630  0.2903
12     6    5.9710     0.9040  0.3774        6    5.9510     0.9070  0.4635
12     9    8.7930     0.8560  1.6906        9    8.9130     0.8940  0.8038
20     5    5.0010     0.9800  0.0230        5    5.0050     0.9910  0.0150
20     10   10.0180    0.9800  0.0200        10   10.0110    0.9850  0.0150
20     15   15.0090    0.9800  0.0310        15   15.0130    0.9810  0.0190
32     8    8.0070     0.9930  0.0070        8    8.0040     0.9960  0.0040
32     16   16.0020    0.9900  0.0100        16   16.0000    0.9980  0.0020
32     24   24.0020    0.9960  0.0040        24   23.9980    0.9980  0.0020
40     10   10.0020    0.9980  0.0020        10   10.0030    0.9970  0.0000
40     20   20.0040    0.9960  0.0040        20   20.010     0.9990  0.0010
40     30   30.0000    1.0000  0.0040        30   30.0010    0.9990  0.0010
80     20   20.000     1.0000  0.0000        20   20.0000    1.0000  0.0000
80     40   40.0000    1.0000  0.0000        40   40.0000    1.0000  0.0000
80     60   60.0000    1.0000  0.0000        60   60.0000    1.0000  0.0000
120    30   30.0030    0.9970  0.0030        30   30.0000    1.0000  0.0000
120    60   60.0000    1.0000  0.0000        60   60.0000    1.0000  0.0000
120    90   90.0000    1.0000  0.0000        90   90.0000    1.0000  0.0000
The simulation results given in Table 1 indicate that the derived posterior probability (19) can identify changes at the front, the center, and the end of the sequence with very high certainty (at least 97% for sample sizes of 20 or larger). The average of the estimated locations is remarkably close to the true change locus, with very small MSE. These results suggest that the proposed method can be confidently applied to the identification of DNA copy number changes.
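The summary quantities reported for the simulations can be computed from a set of replicate estimates as sketched below. The sketch assumes that f is the proportion of replicates whose estimate equals the true location exactly; that reading is an assumption, and the replicate values are invented for illustration.

```python
def summarize(estimates, true_location):
    """Return (mean estimate, exact-hit proportion, mean squared error)."""
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    hit_rate = sum(1 for e in estimates if e == true_location) / n
    mse = sum((e - true_location) ** 2 for e in estimates) / n
    return mean_estimate, hit_rate, mse

# Ten hypothetical replicate estimates of a change located at subinterval 5.
replicates = [5, 5, 5, 6, 5, 4, 5, 5, 5, 5]
mean_estimate, hit_rate, mse = summarize(replicates, true_location=5)
```

Here two of the ten replicates miss by one subinterval, so the hit proportion is 0.8 and the MSE is 0.2 while the mean estimate is still exactly 5.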
3.2. Applications to aCGH Datasets on 9 Fibroblast Cell Lines
Several aCGH experiments were performed on 15 fibroblast cell lines, and the normalized averages of the $\log_2 T/R$ values (based on triplicates) along positions on each chromosome were available at the following website [23]: http://www.nature.com/ng/journal/v29/n3/full/ng754.html. Missing log ratio values in the original data were imputed with 0. The DNA copy number alterations in each of the 15 fibroblast cell lines were verified by karyotyping [23]. Therefore, the aCGH datasets of these 15 fibroblast cell lines can be used as benchmark datasets to test our methods.
Table 2: Results of the Bayesian approach on chromosomes with one change identified. The posterior probability shown is the maximum posterior probability for the chromosome.
Cell line  Chromosome     Locus (kb)  Maximum posterior probability
GM01535    chromosome 5   176824      .5237
GM01750    chromosome 9   26000       .9666
GM01750    chromosome 14  11545       .7867
GM03563    chromosome 3   10524       .8808
GM03563    chromosome 9   2646        1.000
GM07081    chromosome 7   57971       .6390
GM13330    chromosome 1   156276      .9994
GM13330    chromosome 4   173943      .9999
Table 3: Results of the Bayesian approach on chromosomes with two changes identified. The posterior probability shown is the maximum posterior probability for the chromosome at the respective loci.
Cell line  Chromosome     Loci (kb)      Maximum posterior probabilities  Window size
GM01524    chromosome 6   74205, 145965  .9501, .7411                     17
GM03134    chromosome 8   99764, 146000  .9397, .9602                     20
GM05296    chromosome 10  64187, 110412  .7229, .8955                     30
GM05296    chromosome 11  34420, 43357   .8496, .9852                     18
GM13031    chromosome 17  50231, 58122   .9434, .7701                     20
3.3. Comparison of the Performance of the Proposed Bayesian-CPCM with CBS on the Fibroblast Cell-Line Datasets
Many computational and statistical approaches for analyzing aCGH data are now available in the relevant literature. However, many of those approaches, especially CBS [4], model only the log ratio intensity in aCGH data. In this paper, we model both the gene position and the log ratio intensity in aCGH data. That is, the most distinctive feature of the proposed Bayesian-CPCM approach, compared with existing methods in the literature, is its use of both the gene positions (hence gene distances) and the log ratio intensities in the model.
Table 4: Comparison of the changes found using CBS and the proposed Bayesian-CPCM on the nine fibroblast cell lines.
Cell line/chromosome       CBS (α = 0.01)  CBS (α = 0.001)  Bayesian-CPCM approach
GM01524/6                  Yes             Yes              Yes
Number of false positives  6               2                0
Specificity                72.7%           90.9%            100%
Sensitivity                100%            100%             100%
GM01535/5                  Yes             Yes              Yes
GM01535/12                 No              No               No
Number of false positives  2               0                0
Specificity                90.5%           100%             100%
Sensitivity                50%             50%              100%
GM01750/9                  Yes             Yes              Yes
GM01750/14                 Yes             Yes              Yes
Number of false positives  1               0                0
Specificity                95.2%           100%             100%
Sensitivity                100%            100%             100%
GM03134/8                  Yes             Yes              Yes
Number of false positives  3               1                3
Specificity                86.4%           95.5%            97.9%
Sensitivity                100%            100%             100%
GM03563/3                  Yes             Yes              Yes
GM03563/9                  No              No               Yes
Number of false positives  8               5                0
Specificity                61.9%           76.2%            100%
Sensitivity                50%             50%              100%
GM05296/10                 Yes             Yes              Yes
GM05296/11                 Yes             Yes              Yes
Number of false positives  3               0                2
Specificity                88%             100%             99.3%
Sensitivity                100%            100%             100%
GM07081/7                  Yes             Yes              Yes
GM07081/15                 No              No               No
Number of false positives  1               0                0
Specificity                95.2%           100%             100%
Sensitivity                50%             50%              100%
GM13031/17                 Yes             Yes              Yes
Number of false positives  5               3                1
Specificity                79.2%           87.5%            98.8%
Sensitivity                100%            100%             100%
GM13330/1                  Yes             Yes              Yes
GM13330/4                  Yes             Yes              Yes
Number of false positives  8               5                0
Specificity                61.9%           76.2%            100%
Sensitivity                100%            100%             100%
From Table 4, it is evident that the new Bayesian-CPCM approach can detect the CNV regions with the highest specificities and sensitivities. The false positives of the Bayesian-CPCM approach on two of the chromosomes are due to outliers and noise in the original data.
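For readers who want to recompute such figures, the standard definitions of specificity and sensitivity are sketched below. The count of 22 unaltered chromosomes used for GM01524 is an assumption inferred from the reported CBS percentages rather than a number stated in the text, and the function names are hypothetical.

```python
def specificity(false_positives, negatives):
    """Fraction of truly unaltered units that are correctly left unflagged."""
    return (negatives - false_positives) / negatives

def sensitivity(true_positives, positives):
    """Fraction of truly altered units that are detected."""
    return true_positives / positives

# GM01524 with CBS: 6 and 2 false positives at the two significance levels,
# assuming 22 unaltered chromosomes and 1 altered chromosome (detected).
spec_alpha_01 = round(100 * specificity(6, 22), 1)
spec_alpha_001 = round(100 * specificity(2, 22), 1)
sens = round(100 * sensitivity(1, 1), 1)
```

Under that assumption the two specificities evaluate to 72.7 and 90.9 percent, matching the CBS entries reported for GM01524.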
It is worth noting that the CNV or aberration regions in these 9 fibroblast cell lines found using the proposed Bayesian-CPCM approach are also consistent with those identified in Olshen et al. [4], Chen and Wang [19], and Venkatraman and Olshen [24]. However, the Bayesian-CPCM approach involves neither heavy computation, as the CBS algorithm of Olshen et al. [4] does, nor an asymptotic distribution, as required in our earlier work [19].
4. Conclusion
A Bayesian approach for identifying CNVs in an aCGH profile modeled by a compound Poisson process is proposed in this paper. Theoretical results of the Bayesian analysis are obtained, and the algorithm has been implemented in Matlab. Applications of the proposed method to several aCGH data sets have demonstrated its effectiveness. Extensive simulation results indicate that the proposed method can work effectively in various cases. The most distinctive feature of the proposed Bayesian-CPCM approach, compared with existing methods in the literature, is its use of both biomarker positions (hence distances) and the log-intensity ratio information in the model. Another important aspect of the proposed approach is that it characterizes the posterior probability of a locus being a CNV. With a basic understanding of probability, users can easily judge whether there is a CNV at a locus by using the posterior probability together with their biological knowledge.
Many computational and statistical approaches for analyzing aCGH data are now available in the literature. However, those approaches, especially the CBS of Olshen et al. [4] and the MVCM of Chen and Wang [19], all model only the log ratio in aCGH data. In this paper, we have used a new approach that models both the biomarker position and the log ratio intensity in aCGH data. The size of the sliding window is very important in searching for multiple change points in a whole sequence. A criterion for choosing the optimal window size remains a topic for future work.
Declarations
Acknowledgments
Part of this work was done while A. Yiğiter was on leave from Hacettepe University and was a visiting scholar at the University of Missouri-Kansas City, with financial support provided by the Scientific and Technological Research Council of Turkey (TUBITAK). J. Chen was supported in part by a 2009 University of Missouri Research Board (UMRB) research grant. H.W. Deng was partially supported by grants from NIH (nos. P50 AR055081, R01AR050496, R01AR45349, and R01AG026564) and by the Dickson/Missouri endowment.
References
[1] Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W, Cho EK, Dallaire S, Freeman JL, González JR, Gratacòs M, Huang J, Kalaitzopoulos D, Komura D, MacDonald JR, Marshall CR, Mei R, Montgomery L, Nishimura K, Okamura K, Shen F, Somerville MJ, Tchinda J, Valsesia A, Woodwark C, Yang F, Zhang J, Zerjal T, Zhang J, Armengol L, Conrad DF, Estivill X, Tyler-Smith C, Carter NP, Aburatani H, Lee C, Jones KW, Scherer SW, Hurles ME: Global variation in copy number in the human genome. Nature 2006, 444(7118):444-454. 10.1038/nature05329
[2] Pinkel D, Seagraves R, Sudar D, et al.: High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nature Genetics 1998, 20: 207-211. 10.1038/2524
[3] Pollack JR, Perou CM, Alizadeh AA, Eisen MB, Pergamenschikov A, Williams CF, Jeffrey SS, Botstein D, Brown PO: Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nature Genetics 1999, 23(1):41-46. 10.1038/12640
[4] Olshen AB, Venkatraman ES, Lucito R, Wigler M: Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 2004, 5(4):557-572. 10.1093/biostatistics/kxh008
[5] Levin AM, Ghosh D, Cho KR, Kardia SLR: A model-based scan statistic for identifying extreme chromosomal regions of gene expression in human tumors. Bioinformatics 2005, 21(12):2867-2874. 10.1093/bioinformatics/bti417
[6] Hodgson G, Hager JH, Volik S, Hariono S, Wernick M, Moore D, Nowak N, Albertson DG, Pinkel D, Collins C, Hanahan D, Gray JW: Genome scanning with array CGH delineates regional alterations in mouse islet carcinomas. Nature Genetics 2001, 29: 459-464. 10.1038/ng771
[7] Pollack JR, Sørlie T, Perou CM, Rees CA, Jeffrey SS, Lonning PE, Tibshirani R, Botstein D, Børresen-Dale AL, Brown PO: Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proceedings of the National Academy of Sciences of the United States of America 2002, 99(20):12963-12968. 10.1073/pnas.162471999
[8] Hupé P, Stransky N, Thiery JP, Radvanyi F, Barillot E: Analysis of array CGH data: from signal ratio to gain and loss of DNA regions. Bioinformatics 2004, 20(18):3413-3422. 10.1093/bioinformatics/bth418
[9] Zhao X, Weir BA, LaFramboise T, Lin M, Beroukhim R, Garraway L, Beheshti J, Lee JC, Naoki K, Richards WG, Sugarbaker D, Chen F, Rubin MA, Jänne PA, Girard L, Minna J, Christiani D, Li C, Sellers WR, Meyerson M: Homozygous deletions and chromosome amplifications in human lung carcinomas revealed by single nucleotide polymorphism array analysis. Cancer Research 2005, 65(13):5561-5570. 10.1158/0008-5472.CAN-04-4603
[10] Picard F, Robin S, Lavielle M, Vaisse C, Daudin JJ: A statistical approach for array CGH data analysis. BMC Bioinformatics 2005, 6, article 27.
[11] Lai WR, Johnson MD, Kucherlapati R, Park PJ: Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. Bioinformatics 2005, 21(19):3763-3770. 10.1093/bioinformatics/bti611
[12] Nannya Y, Sanada M, Nakazaki K, et al.: A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays. Cancer Research 2005, 65: 6071-6079. 10.1158/0008-5472.CAN-05-0465
[13] Price TS, Regan R, Mott R, Hedman Å, Honey B, Daniels RJ, Smith L, Greenfield A, Tiganescu A, Buckle V, Ventress N, Ayyub H, Salhan A, Pedraza-Diaz S, Broxholme J, Ragoussis J, Higgs DR, Flint J, Knight SJL: SW-ARRAY: a dynamic programming solution for the identification of copy-number changes in genomic DNA using array comparative genome hybridization data. Nucleic Acids Research 2005, 33(11):3455-3464. 10.1093/nar/gki643
[14] Shah SP, Xuan X, DeLeeuw RJ, Khojasteh M, Lam WL, Ng R, Murphy KP: Integrating copy number polymorphisms into array CGH analysis using a robust HMM. Bioinformatics 2006, 22(14):e431-e439. 10.1093/bioinformatics/btl238
[15] Yu T, Ye H, Sun W, Li KC, Chen Z, Jacobs S, Bailey DK, Wong DT, Zhou X: A forward-backward fragment assembling algorithm for the identification of genomic amplification and deletion breakpoints using high-density single nucleotide polymorphism (SNP) array. BMC Bioinformatics 2007, 8, article 145.
[16] Rueda OM, Díaz-Uriarte R: Flexible and accurate detection of genomic copy-number changes from aCGH. PLoS Computational Biology 2007, 3(6):1115-1122.
[17] Pique-Regi R, Monso-Varona J, Ortega A, Seeger RC, Triche TJ, Asgharzadeh S: Sparse representation and Bayesian detection of genome copy number alterations from microarray data. Bioinformatics 2008, 24(3):309-318. 10.1093/bioinformatics/btm601
[18] Rancoita PMV, Hutter M, Bertoni F, Kwee I: Bayesian DNA copy number analysis. BMC Bioinformatics 2009, 10, article 10.
[19] Chen J, Wang YP: A statistical change point model approach for the detection of DNA copy number variations in array CGH data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2009, 6: 529-541.
[20] Daruwala RS, Rudra A, Ostrer H, Lucito R, Wigler M, Mishra B: A versatile statistical analysis algorithm to detect genome copy number variation. Proceedings of the National Academy of Sciences of the United States of America 2004, 101(46):16292-16297. 10.1073/pnas.0407247101
[21] Sun YV, Levin AM, Boerwinkle E, Robertson H, Kardia SLR: A scan statistic for identifying chromosomal patterns of SNP association. Genetic Epidemiology 2006, 30(7):627-635. 10.1002/gepi.20173
[22] Ramensky VE, Makeev VJu, Roytberg MA, Tumanyan VG: DNA segmentation through the Bayesian approach. Journal of Computational Biology 2000, 7(1-2):215-231. 10.1089/10665270050081487
[23] Snijders AM, Nowak N, Segraves R, Blackwood S, Brown N, Conroy J, Hamilton G, Hindle AK, Huey B, Kimura K, Law S, Myambo K, Palmer J, Ylstra B, Yue JP, Gray JW, Jain AN, Pinkel D, Albertson DG: Assembly of microarrays for genome-wide measurement of DNA copy number. Nature Genetics 2001, 29(3):263-264. 10.1038/ng754
[24] Venkatraman ES, Olshen AB: A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 2007, 23(6):657-663. 10.1093/bioinformatics/btl646
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.