Open Access

Information-Theoretic Inference of Large Transcriptional Regulatory Networks

  • Patrick E Meyer (1, corresponding author),
  • Kevin Kontos (1),
  • Frederic Lafitte (1) and
  • Gianluca Bontempi (1)
EURASIP Journal on Bioinformatics and Systems Biology 2007, 2007:79879

https://doi.org/10.1155/2007/79879

Received: 26 January 2007

Accepted: 12 May 2007

Published: 24 June 2007

Abstract

The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting, among the least redundant variables, those that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousand genes) network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
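As a rough illustration of the MRMR principle described above, the following Python sketch ranks variables by relevance minus mean redundancy and then applies that ranking with each variable in turn as the target. Several choices here are assumptions not stated in the abstract: the difference form of the score, the plug-in mutual information estimate on already-discretized data, the symmetrization by maximum, and the function names `mutual_information`, `mrmr_scores`, and `mrnet_weights`, which are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical (plug-in) mutual information, in nats, of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def mrmr_scores(data, target):
    """Greedy MRMR ranking of every other variable against data[target].

    Returns {variable index: relevance minus mean redundancy, evaluated
    at the step where that variable was selected}.
    Assumption: the difference form of the MRMR score.
    """
    remaining = [j for j in range(len(data)) if j != target]
    relevance = {j: mutual_information(data[j], data[target]) for j in remaining}
    selected, scores = [], {}

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(mutual_information(data[j], data[k]) for k in selected)
        return relevance[j] - redundancy / len(selected)

    while remaining:
        best = max(remaining, key=score)
        scores[best] = score(best)  # scored against the variables chosen so far
        selected.append(best)
        remaining.remove(best)
    return scores


def mrnet_weights(data):
    """Run MRMR once per variable as target and build a symmetric weight matrix.

    Assumption: an edge keeps the larger of its two directed scores.
    """
    n = len(data)
    per_target = [mrmr_scores(data, i) for i in range(n)]
    return [[0.0 if i == j else max(per_target[i][j], per_target[j][i])
             for j in range(n)] for i in range(n)]


# Toy data: y depends on a and b; the second variable is an exact copy of a.
a = [0, 0, 1, 1] * 2
b = [0, 1, 0, 1] * 2
y = [u + v for u, v in zip(a, b)]
data = [a, list(a), b, y]
scores = mrmr_scores(data, target=3)
weights = mrnet_weights(data)
```

In this toy example, `a` and its copy are equally relevant to `y`, but whichever of the two is selected later is penalized for redundancy and ends with a score near zero, while `b` keeps its full relevance. Thresholding `weights` would then give an inferred network; the threshold itself is left open here.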

[1–28]

Authors’ Affiliations

(1) ULB Machine Learning Group, Computer Science Department, Université Libre de Bruxelles

References

  1. van Someren EP, Wessels LFA, Backer E, Reinders MJT: Genetic network modeling. Pharmacogenomics 2002, 3(4):507-525. doi:10.1517/14622416.3.4.507
  2. Gardner TS, Faith JJ: Reverse-engineering transcription control networks. Physics of Life Reviews 2005, 2(1):65-88. doi:10.1016/j.plrev.2005.01.001
  3. Chow C, Liu C: Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory 1968, 14(3):462-467. doi:10.1109/TIT.1968.1054142
  4. Butte AJ, Kohane IS: Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. Pacific Symposium on Biocomputing 2000, 418-429.
  5. Margolin AA, Nemenman I, Basso K, et al.: ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 2006, 7(1):S7. doi:10.1186/1471-2105-7-S1-S7
  6. Faith JJ, Hayete B, Thaden JT, et al.: Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biology 2007, 5(1):e8. doi:10.1371/journal.pbio.0050008
  7. Pearl J: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco, Calif, USA; 1988.
  8. Cheng J, Greiner R, Kelly J, Bell D, Liu W: Learning Bayesian networks from data: an information-theory based approach. Artificial Intelligence 2002, 137(1-2):43-90. doi:10.1016/S0004-3702(02)00191-1
  9. Schneidman E, Still S, Berry MJ II, Bialek W: Network information and connected correlations. Physical Review Letters 2003, 91(23):4 pages.
  10. Nemenman I: Multivariate dependence, and genetic network inference. In Tech. Rep. NSF-KITP-04-54. KITP, UCSB, Santa Barbara, Calif, USA; 2004.
  11. Tourassi GD, Frederick ED, Markey MK, Floyd CE Jr.: Application of the mutual information criterion for feature selection in computer-aided diagnosis. Medical Physics 2001, 28(12):2394-2402. doi:10.1118/1.1418724
  12. Ding C, Peng H: Minimum redundancy feature selection from microarray gene expression data. Journal of Bioinformatics and Computational Biology 2005, 3(2):185-205. doi:10.1142/S0219720005001004
  13. Meyer PE, Bontempi G: On the use of variable complementarity for feature selection in cancer classification. In Applications of Evolutionary Computing: EvoWorkshops, Lecture Notes in Computer Science. Volume 3907. Edited by: Rothlauf F, Branke J, Cagnoni S, et al. Springer, Berlin, Germany; 2006:91-102. doi:10.1007/11732242_9
  14. Meyer PE, Kontos K, Bontempi G: Biological network inference using redundancy analysis. Proceedings of the 1st International Conference on Bioinformatics Research and Development (BIRD '07), Berlin, Germany, March 2007, 916-927.
  15. Butte AJ, Tamayo P, Slonim D, Golub TR, Kohane IS: Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. Proceedings of the National Academy of Sciences of the United States of America 2000, 97(22):12182-12186. doi:10.1073/pnas.220392197
  16. Cover TM, Thomas JA: Elements of Information Theory. John Wiley & Sons, New York, NY, USA; 1990.
  17. Merz P, Freisleben B: Greedy and local search heuristics for unconstrained binary quadratic programming. Journal of Heuristics 2002, 8(2):197-213. doi:10.1023/A:1017912624016
  18. Rogers S, Girolami M: A Bayesian regression approach to the inference of regulatory networks from gene expression data. Bioinformatics 2005, 21(14):3131-3137. doi:10.1093/bioinformatics/bti487
  19. van den Bulcke T, van Leemput K, Naudts B, et al.: SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinformatics 2006, 7:43. doi:10.1186/1471-2105-7-43
  20. Paninski L: Estimation of entropy and mutual information. Neural Computation 2003, 15(6):1191-1253. doi:10.1162/089976603321780272
  21. Beirlant J, Dudewicz EJ, Györfi L, van der Meulen E: Nonparametric entropy estimation: an overview. Journal of Statistics 1997, 6(1):17-39.
  22. Dougherty J, Kohavi R, Sahami M: Supervised and unsupervised discretization of continuous features. Proceedings of the 12th International Conference on Machine Learning (ML '95), Lake Tahoe, Calif, USA, July 1995, 194-202.
  23. Provost FJ, Fawcett T, Kohavi R: The case against accuracy estimation for comparing induction algorithms. In Proceedings of the 15th International Conference on Machine Learning (ICML '98), Madison, Wis, USA, July 1998. Morgan Kaufmann; 445-453.
  24. Bockhorst J, Craven M: Markov networks for detecting overlapping elements in sequence data. In Advances in Neural Information Processing Systems 17. Edited by: Saul LK, Weiss Y, Bottou L. MIT Press, Cambridge, Mass, USA; 2005:193-200.
  25. Dietterich TG: Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation 1998, 10(7):1895-1923. doi:10.1162/089976698300017197
  26. Hwang KB, Lee JW, Chung S-W, Zhang B-T: Construction of large-scale Bayesian networks by local to global search. Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence (PRICAI '02), Tokyo, Japan, August 2002, 375-384.
  27. Tsamardinos I, Aliferis C, Statnikov A: Algorithms for large scale Markov blanket discovery. Proceedings of the 16th International Florida Artificial Intelligence Research Society Conference (FLAIRS '03), St. Augustine, Fla, USA, May 2003, 376-381.
  28. Tsamardinos I, Aliferis C: Towards principled feature selection: relevancy, filters and wrappers. Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics (AI&Stats '03), Key West, Fla, USA, January 2003.

Copyright

© Patrick E. Meyer et al. 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.