
Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives


Microarray data acquired during time-course experiments allow temporal variations in gene expression to be monitored. An original postprandial fasting experiment was conducted in the mouse, and the expression of 200 genes was monitored with a dedicated macroarray at 11 time points between 0 and 72 hours of fasting. The aim of this study was to provide a relevant clustering of the temporal gene expression profiles. This was achieved by focusing on the shapes of the curves rather than on the absolute level of expression: we combined spline smoothing and first-derivative computation with hierarchical and partitioning clustering. A heuristic approach was proposed to tune the spline smoothing parameter using both statistical and biological considerations. Clusters are illustrated a posteriori through principal component analysis and heatmap visualization. Most results were found to agree with the literature on the effects of fasting on the mouse liver, and they provide promising directions for future biological investigations.
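The idea of clustering on curve shape rather than expression level can be illustrated with a minimal sketch. It is not the authors' implementation: the time grid, the synthetic profiles, the smoothing value, and the helper name `cluster_by_shape` are all assumptions made for illustration. SciPy's `UnivariateSpline` controls smoothness through a residual bound `s`, which may differ from the penalty parameterization used in the study, and the heuristic tuning of that parameter is not reproduced here. Only the hierarchical step is shown; the partitioning step is omitted.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.interpolate import UnivariateSpline


def cluster_by_shape(times, profiles, smoothing, n_clusters, grid_size=50):
    """Cluster profiles on the first derivative of their smoothing splines."""
    # Evaluate every derivative on a common, regular grid so that the
    # resulting vectors are directly comparable.
    grid = np.linspace(times[0], times[-1], grid_size)
    derivatives = np.array([
        UnivariateSpline(times, y, k=3, s=smoothing).derivative(n=1)(grid)
        for y in profiles
    ])
    # Average-linkage hierarchical clustering on Euclidean distances
    # between derivative curves (an illustrative choice of linkage).
    tree = linkage(derivatives, method="average", metric="euclidean")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical design: 11 time points between 0 and 72 hours.
times = np.array([0, 3, 6, 9, 12, 18, 24, 36, 48, 60, 72], dtype=float)

# Synthetic data: two rising and two falling profiles at different levels.
rng = np.random.default_rng(0)
noise_sd = 0.01
rise = 1.0 - np.exp(-times / 20.0)
fall = np.exp(-times / 20.0)
profiles = np.array([rise, rise + 5.0, fall, fall + 5.0])
profiles = profiles + rng.normal(0.0, noise_sd, size=profiles.shape)

# Smoothing set to roughly n * sigma^2, a common rule of thumb
# (an assumption, not the paper's heuristic).
labels = cluster_by_shape(
    times, profiles, smoothing=len(times) * noise_sd**2, n_clusters=2
)
print(labels)
```

Because a constant offset vanishes under differentiation, the two rising profiles should be grouped together even though their expression levels differ by 5 units, and likewise for the two falling profiles. Clustering the raw values instead would tend to group the profiles by level.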




Author information

Corresponding author

Correspondence to S Déjean.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Déjean, S., Martin, P., Baccini, A. et al. Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives. J Bioinform Sys Biology 2007, 70561 (2007).



  • Gene Expression Data
  • Mouse Liver
  • Heuristic Approach
  • Absolute Level
  • Smoothing Parameter