Normalization Benefits Microarray-Based Classification

Hua, Jianping; Balagurunathan, Yoganand; Chen, Yidong; Lowey, James; Bittner, Michael L; Xiong, Zixiang; Suh, Edward; Dougherty, Edward R

doi:10.1155/BSB/2006/43056

Research Article
Open access
Published: 24 August 2006


Jianping Hua¹, Yoganand Balagurunathan¹, Yidong Chen², James Lowey¹, Michael L Bittner¹, Zixiang Xiong³, Edward Suh¹ & Edward R Dougherty¹,³

EURASIP Journal on Bioinformatics and Systems Biology volume 2006, Article number: 43056 (2006)


Abstract

When using cDNA microarrays, normalization to correct labeling bias is a common preliminary step before further data analysis; its objective is to reduce the variation between arrays. To date, assessment of the effectiveness of normalization has mainly been confined to the ability to detect differentially expressed genes. Since a major use of microarrays is expression-based phenotype classification, it is important to evaluate microarray normalization procedures relative to classification. Using a model-based approach, we model the systemic-error process to generate synthetic gene-expression values with known ground truth. These synthetic expression values are subjected to typical normalization methods and passed through a set of classification rules, the objective being to carry out a systematic study of the effect of normalization on classification. Three normalization methods are considered: offset, linear regression, and Lowess regression. Seven classification rules are considered: 3-nearest neighbor, linear support vector machine, linear discriminant analysis, regular histogram, Gaussian kernel, perceptron, and multiple perceptron with majority voting. The results for the first three classification rules are presented in the paper, with the full results given on a complementary website. The conclusion from the different experiment models considered in the study is that normalization can have a significant benefit for classification under difficult experimental conditions, with linear and Lowess regression slightly outperforming the offset method.
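As a rough illustration of two of the normalization methods named in the abstract, the sketch below applies offset and linear-regression normalization to the log-ratios of a simulated two-channel array. It is not the authors' error model or code: the function names, the bias model in `simulate_array`, and all numeric parameters are assumptions chosen only to make the example run, and using the median as the offset is one common choice rather than something stated in this text. Lowess regression is omitted because it needs a local-regression routine.

```python
import math
import random
import statistics


def ma_values(red, green):
    """Return (M, A): per-spot log2 ratio and mean log2 intensity."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


def offset_normalize(m):
    """Offset normalization: subtract one constant (here the median M)."""
    shift = statistics.median(m)
    return [x - shift for x in m]


def linear_normalize(m, a):
    """Linear normalization: fit M = b0 + b1 * A by least squares, subtract the fit."""
    mean_a, mean_m = statistics.fmean(a), statistics.fmean(m)
    sxx = sum((x - mean_a) ** 2 for x in a)
    sxy = sum((x - mean_a) * (y - mean_m) for x, y in zip(a, m))
    b1 = sxy / sxx
    b0 = mean_m - b1 * mean_a
    return [y - (b0 + b1 * x) for x, y in zip(a, m)]


def simulate_array(n_spots=2000, seed=0):
    """Hypothetical array: red channel carries an intensity-dependent labeling bias."""
    rng = random.Random(seed)
    green = [2 ** rng.uniform(6, 14) for _ in range(n_spots)]
    # Assumed bias model (not from the paper): scale factor 1.3, exponent 1.05,
    # plus multiplicative noise with log2 standard deviation 0.2.
    red = [1.3 * g ** 1.05 * 2 ** rng.gauss(0, 0.2) for g in green]
    return red, green


if __name__ == "__main__":
    red, green = simulate_array()
    m, a = ma_values(red, green)
    for name, values in [("raw", m),
                         ("offset", offset_normalize(m)),
                         ("linear", linear_normalize(m, a))]:
        print(name, round(statistics.fmean(values), 4),
              round(statistics.pstdev(values), 4))
```

Under this assumed bias, the offset method only recenters the log-ratios, while the linear fit also removes the intensity-dependent trend, so its residual spread cannot exceed that of the offset-normalized values. How that difference carries over to classification error is the question the paper studies.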

[1–20]

References

Quackenbush J: Microarray data normalization and transformation. Nature Genetics 2002,32(5 supplement):496-501.
Bilban M, Buehler LK, Head S, Desoye G, Quaranta V: Normalizing DNA microarray data. Current Issues in Molecular Biology 2002,4(2):57-64.
Attoor S, Dougherty ER, Chen Y, Bittner ML, Trent JM: Which is better for cDNA-microarray-based classification: ratios or direct intensities. Bioinformatics 2004,20(16):2513-2520. 10.1093/bioinformatics/bth272
Chen Y, Kamat V, Dougherty ER, Bittner ML, Meltzer PS, Trent JM: Ratio statistics of gene expression levels and applications to microarray data analysis. Bioinformatics 2002,18(9):1207-1215. 10.1093/bioinformatics/18.9.1207
Yang YH, Dudoit S, Luu P, et al.: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research 2002,30(4):e15. 10.1093/nar/30.4.e15
Tseng GC, Oh M-K, Rohlin L, Liao JC, Wong WH: Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucleic Acids Research 2001,29(12):2549-2557. 10.1093/nar/29.12.2549
Devroye L, Györfi L, Lugosi G: A Probabilistic Theory of Pattern Recognition. Springer, New York, NY, USA; 1996.
Vapnik VN: Statistical Learning Theory. John Wiley & Sons, New York, NY, USA; 1998.
Rosenblatt F: Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, Washington, DC, USA; 1962.
Duda R, Hart P: Pattern Classification. 2nd edition. John Wiley & Sons, New York, NY, USA; 2001.
Chang C-C, Lin C-J: LIBSVM: introduction and benchmarks. Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan; 2000.
Braga-Neto UM, Dougherty ER: Is cross-validation valid for small-sample microarray classification? Bioinformatics 2004,20(3):374-380. 10.1093/bioinformatics/btg419
Pudil P, Novovičová J, Kittler J: Floating search methods in feature selection. Pattern Recognition Letters 1994,15(11):1119-1125. 10.1016/0167-8655(94)90127-9
Jain AK, Zongker D: Feature selection: evaluation, application, and small sample performance. IEEE Transactions on Pattern Analysis and Machine Intelligence 1997,19(2):153-158. 10.1109/34.574797
Kudo M, Sklansky J: Comparison of algorithms that select features for pattern classifiers. Pattern Recognition 2000,33(1):25-41. 10.1016/S0031-3203(99)00041-2
Braga-Neto U, Dougherty ER: Bolstered error estimation. Pattern Recognition 2004,37(6):1267-1281. 10.1016/j.patcog.2003.08.017
Sima C, Attoor S, Braga-Neto U, Lowey J, Suh E, Dougherty ER: Impact of error estimation on feature selection algorithms. Pattern Recognition 2005,38(12):2472-2482. 10.1016/j.patcog.2005.03.026
Hua J, Xiong Z, Lowey J, Suh E, Dougherty ER: Optimal number of features as a function of sample size for various classification rules. Bioinformatics 2005,21(8):1509-1515. 10.1093/bioinformatics/bti171
Jain AK, Waller WG: On the optimal number of features in the classification of multivariate Gaussian data. Pattern Recognition 1978,10(5-6):365-374. 10.1016/0031-3203(78)90008-0
Chen Y, Dougherty ER, Bittner ML: Ratio-based decisions and the quantitative analysis of cDNA microarray images. Journal of Biomedical Optics 1997,2(4):364-374. 10.1117/12.281504


Author information

Authors and Affiliations

Computational Biology Division, Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
Jianping Hua, Yoganand Balagurunathan, James Lowey, Michael L Bittner, Edward Suh & Edward R Dougherty
Genetics Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892-2152, USA
Yidong Chen
Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
Zixiang Xiong & Edward R Dougherty


Corresponding author

Correspondence to Jianping Hua.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License ( https://creativecommons.org/licenses/by-nc/2.0 ), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


About this article

Cite this article

Hua, J., Balagurunathan, Y., Chen, Y. et al. Normalization Benefits Microarray-Based Classification. J Bioinform Sys Biology 2006, 43056 (2006). https://doi.org/10.1155/BSB/2006/43056


Received: 11 December 2005
Revised: 19 April 2006
Accepted: 18 May 2006
Published: 24 August 2006
DOI: https://doi.org/10.1155/BSB/2006/43056
