Open Access

NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks

EURASIP Journal on Bioinformatics and Systems Biology 2008, 2007:90947

https://doi.org/10.1155/2007/90947

Received: 1 March 2007

Accepted: 30 July 2007

Published: 20 January 2008

Abstract

Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically well-founded, general framework for performing statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. In the case of discrete data, straightforward computation of the NML distribution requires exponential time with respect to the sample size, since the definition involves a sum over all the possible data samples of a fixed size. In this paper, we first review some existing algorithms for efficient NML computation in the case of multinomial and naive Bayes model families. Then we proceed by extending these algorithms to more complex, tree-structured Bayesian networks.
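To make the cost of the exhaustive sum concrete, the following is a minimal sketch for a single multinomial variable with K values and sample size n. It is an illustration under stated assumptions, not the paper's own implementation: the function names are hypothetical, and the fast version uses the recurrence C(k+2, n) = C(k+1, n) + (n/k) C(k, n) known from the literature on multinomial NML, which the abstract itself does not spell out.

```python
from collections import Counter
from itertools import product
from math import comb, prod


def nml_constant_brute_force(K, n):
    """Sum the maximized likelihood over all K**n sequences (exponential in n)."""
    total = 0.0
    for seq in product(range(K), repeat=n):
        counts = Counter(seq)
        # Maximum-likelihood probability of the sequence: prod (h_k / n) ** h_k
        total += prod((h / n) ** h for h in counts.values())
    return total


def nml_constant_recurrence(K, n):
    """Same constant via C(k+2, n) = C(k+1, n) + (n / k) * C(k, n) (assumed recurrence)."""
    if n == 0 or K == 1:
        return 1.0
    c_prev = 1.0  # C(1, n): a single-valued variable has one possible sequence
    # C(2, n) as an explicit sum over the count h of the first value;
    # Python evaluates 0.0 ** 0 as 1.0, matching the usual 0^0 = 1 convention.
    c_curr = sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )
    for k in range(1, K - 1):
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


if __name__ == "__main__":
    for K, n in [(2, 2), (3, 2), (3, 5), (4, 6)]:
        print(K, n, nml_constant_brute_force(K, n), nml_constant_recurrence(K, n))
```

The brute-force function enumerates K**n sequences and becomes unusable for even modest n, whereas the recurrence needs O(n + K) arithmetic operations. The NML probability of an observed sample is its maximized likelihood divided by this constant.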

[1–33]

Authors’ Affiliations

(1) Complex Systems Computation Group (CoSCo), Helsinki Institute for Information Technology (HIIT), (Department of Computer Science), FIN-00014 University of Helsinki

References

  1. Korodi G, Tabus I: An efficient normalized maximum likelihood algorithm for DNA sequence compression. ACM Transactions on Information Systems 2005, 23(1):3-34. 10.1145/1055709.1055711
  2. Tibshirani R, Hastie T, Eisen M, Ross D, Botstein D, Brown B: Clustering methods for the analysis of DNA microarray data. Department of Health Research and Policy, Stanford University, Stanford, Calif, USA; 1999.
  3. Pan W, Lin J, Le CT: Model-based cluster analysis of microarray gene-expression data. Genome Biology 2002, 3(2):1-8.
  4. McLachlan GJ, Bean RW, Peel D: A mixture model-based approach to the clustering of microarray expression data. Bioinformatics 2002, 18(3):413-422. 10.1093/bioinformatics/18.3.413
  5. Hartemink AJ, Gifford DK, Jaakkola TS, Young RA: Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks. Proceedings of the 6th Pacific Symposium on Biocomputing (PSB '01), The Big Island of Hawaii, Hawaii, USA, January 2001, 422-433.
  6. Rissanen J: Modeling by shortest data description. Automatica 1978, 14(5):465-471. 10.1016/0005-1098(78)90005-5
  7. Rissanen J: Stochastic complexity. Journal of the Royal Statistical Society, Series B 1987, 49(3):223-239 (with discussions, 223-265).
  8. Rissanen J: Fisher information and stochastic complexity. IEEE Transactions on Information Theory 1996, 42(1):40-47. 10.1109/18.481776
  9. Shtarkov YuM: Universal sequential coding of single messages. Problems of Information Transmission 1987, 23(3):175-186.
  10. Barron A, Rissanen J, Yu B: The minimum description length principle in coding and modeling. IEEE Transactions on Information Theory 1998, 44(6):2743-2760. 10.1109/18.720554
  11. Rissanen J: Strong optimality of the normalized ML models as universal codes and information in data. IEEE Transactions on Information Theory 2001, 47(5):1712-1717. 10.1109/18.930912
  12. Grünwald P: The Minimum Description Length Principle. The MIT Press, Cambridge, Mass, USA; 2007.
  13. Rissanen J: Information and Complexity in Statistical Modeling. Springer, New York, NY, USA; 2007.
  14. Heckerman D: A tutorial on learning with Bayesian networks. In Tech. Rep. MSR-TR-95-06. Microsoft Research, Advanced Technology Division, One Microsoft Way, Redmond, Wash, USA, 98052; 1996.
  15. Kontkanen P, Myllymäki P: A linear-time algorithm for computing the multinomial stochastic complexity. Information Processing Letters 2007, 103(6):227-233. 10.1016/j.ipl.2007.04.003
  16. Kontkanen P, Myllymäki P, Buntine W, Rissanen J, Tirri H: An MDL framework for data clustering. In Advances in Minimum Description Length: Theory and Applications. Edited by: Grünwald P, Myung IJ, Pitt M. The MIT Press, Cambridge, Mass, USA; 2006.
  17. Xie Q, Barron AR: Asymptotic minimax regret for data compression, gambling, and prediction. IEEE Transactions on Information Theory 2000, 46(2):431-445. 10.1109/18.825803
  18. Balasubramanian V: MDL, Bayesian inference, and the geometry of the space of probability distributions. In Advances in Minimum Description Length: Theory and Applications. Edited by: Grünwald P, Myung IJ, Pitt M. The MIT Press, Cambridge, Mass, USA; 2006:81-98.
  19. Kontkanen P, Myllymäki P: MDL histogram density estimation. Proceedings of the 11th International Conference on Artificial Intelligence and Statistics (AISTATS '07), San Juan, Puerto Rico, USA, March 2007.
  20. Kontkanen P, Buntine W, Myllymäki P, Rissanen J, Tirri H: Efficient computation of stochastic complexity. In Proceedings of the 9th International Conference on Artificial Intelligence and Statistics, Key West, Fla, USA, January 2003. Edited by: Bishop C, Frey B. Society for Artificial Intelligence and Statistics; 233-238.
  21. Koivisto M: Sum-Product Algorithms for the Analysis of Genetic Risks. In Tech. Rep. A-2004-1. Department of Computer Science, University of Helsinki, Helsinki, Finland; 2004.
  22. Kontkanen P, Myllymäki P: A fast normalized maximum likelihood algorithm for multinomial data. Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI '05), Edinburgh, Scotland, August 2005.
  23. Knuth DE, Pittel B: A recurrence related to trees. Proceedings of the American Mathematical Society 1989, 105(2):335-349. 10.1090/S0002-9939-1989-0949878-9
  24. Corless RM, Gonnet GH, Hare DEG, Jeffrey DJ, Knuth DE: On the Lambert W function. Advances in Computational Mathematics 1996, 5(1):329-359. 10.1007/BF02124750
  25. Szpankowski W: Average Case Analysis of Algorithms on Sequences. John Wiley & Sons, New York, NY, USA; 2001.
  26. Flajolet P, Odlyzko AM: Singularity analysis of generating functions. SIAM Journal on Discrete Mathematics 1990, 3(2):216-240. 10.1137/0403019
  27. Schwarz G: Estimating the dimension of a model. Annals of Statistics 1978, 6(2):461-464. 10.1214/aos/1176344136
  28. Kontkanen P, Myllymäki P, Tirri H: Constructing Bayesian finite mixture models by the EM algorithm. In Tech. Rep. NC-TR-97-003. ESPRIT Working Group on Neural and Computational Learning (NeuroCOLT), Helsinki, Finland; 1997.
  29. Kontkanen P, Myllymäki P, Silander T, Tirri H: On Bayesian case matching. In Proceedings of the 4th European Workshop Advances in Case-Based Reasoning (EWCBR '98), Lecture Notes In Computer Science, Springer, Dublin, Ireland, September 1998. Edited by: Smyth B, Cunningham P. 1488: 13-24.
  30. Grünwald P, Kontkanen P, Myllymäki P, Silander T, Tirri H: Minimum encoding approaches for predictive modeling. In Proceedings of the 14th International Conference on Uncertainty in Artificial Intelligence (UAI '98), Madison, Wis, USA, July 1998. Edited by: Cooper G, Moral S. Morgan Kaufmann; 183-192.
  31. Kontkanen P, Myllymäki P, Silander T, Tirri H, Grünwald P: On predictive distributions and Bayesian networks. Statistics and Computing 2000, 10(1):39-54. 10.1023/A:1008984400380
  32. Kontkanen P, Lahtinen J, Myllymäki P, Silander T, Tirri H: Supervised model-based visualization of high-dimensional data. Intelligent Data Analysis 2000, 4(3-4):213-227.
  33. Dyer M, Kannan R, Mount J: Sampling contingency tables. Random Structures and Algorithms 1997, 10(4):487-506. 10.1002/(SICI)1098-2418(199707)10:4<487::AID-RSA4>3.0.CO;2-Q

Copyright

© Petri Kontkanen et al. 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.