Information-Theoretic Inference of Large Transcriptional Regulatory Networks

Abstract

This paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle selects, among the least redundant variables, those that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for inferring large networks (up to several thousand genes). Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
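
The selection principle summarized above can be illustrated with a small sketch. The abstract does not give the scoring formula, so the code below assumes the commonly used MRMR score (mutual information with the target minus the mean mutual information with the variables already selected) and assumes that the network weight of a gene pair is the larger of the two scores obtained when each gene in turn is the target. The function names `mrmr_scores` and `mrnet_weights`, the precomputed matrix `mi`, and the toy values are all hypothetical; this is not the authors' implementation.

```python
import numpy as np


def mrmr_scores(mi, target):
    """Greedy MRMR ranking of every other variable against `target`.

    `mi` is a symmetric matrix of pairwise mutual information estimates
    (assumed to be computed beforehand). Returns {variable: score}, where
    the score is the one the variable had at the step it was selected.
    """
    n = mi.shape[0]
    candidates = [j for j in range(n) if j != target]
    selected, scores = [], {}
    while candidates:
        best, best_score = None, -np.inf
        for j in candidates:
            relevance = mi[j, target]
            # Redundancy: mean MI with the variables selected so far.
            redundancy = np.mean([mi[j, k] for k in selected]) if selected else 0.0
            score = relevance - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        candidates.remove(best)
        scores[best] = best_score
    return scores


def mrnet_weights(mi):
    """Run the MRMR ranking once per gene and symmetrize with a maximum."""
    n = mi.shape[0]
    w = np.zeros((n, n))
    for target in range(n):
        for j, s in mrmr_scores(mi, target).items():
            w[target, j] = s
    return np.maximum(w, w.T)


# Toy chain 0 - 1 - 2: genes 0 and 2 are related only through gene 1.
mi = np.array([
    [0.0, 0.8, 0.5],
    [0.8, 0.0, 0.7],
    [0.5, 0.7, 0.0],
])
weights = mrnet_weights(mi)
```

In this toy example the two direct pairs keep positive weights, while the indirect pair (0, 2) receives a negative weight because gene 1 already accounts for the information the two genes share. Thresholding the weight matrix would then yield a network.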

Author information

Correspondence to Patrick E Meyer.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Meyer, P.E., Kontos, K., Lafitte, F. et al. Information-Theoretic Inference of Large Transcriptional Regulatory Networks. J Bioinform Sys Biology 2007, 79879 (2007) doi:10.1155/2007/79879

Keywords

  • Feature Selection
  • Transcriptional Regulatory
  • Microarray Data
  • Mutual Information
  • Regulatory Network