
NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks


Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically well-founded, general framework for performing statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. In the case of discrete data, straightforward computation of the NML distribution requires exponential time with respect to the sample size, since the definition involves a sum over all the possible data samples of a fixed size. In this paper, we first review some existing algorithms for efficient NML computation in the case of multinomial and naive Bayes model families. Then we proceed by extending these algorithms to more complex, tree-structured Bayesian networks.
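To make the cost argument concrete, the following minimal Python sketch computes the NML normalizing sum for a single multinomial variable with K values and sample size n in two ways. The first function enumerates all K^n samples, as the definition requires. The second uses the known recurrence C(k+2, n) = C(k+1, n) + (n/k) C(k, n) for the multinomial case, which is assumed here to be representative of the existing efficient algorithms the paper reviews. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product
from math import comb


def max_likelihood(counts, n):
    """Maximized multinomial likelihood of a sample with the given value counts."""
    p = 1.0
    for h in counts:
        if h > 0:
            p *= (h / n) ** h
    return p


def nml_normalizer_brute_force(K, n):
    """Sum the maximized likelihood over all K**n samples (exponential in n)."""
    total = 0.0
    for sample in product(range(K), repeat=n):
        counts = [sample.count(v) for v in range(K)]
        total += max_likelihood(counts, n)
    return total


def nml_normalizer_recurrence(K, n):
    """Same sum via the recurrence C(k+2, n) = C(k+1, n) + (n / k) * C(k, n)."""
    if n == 0 or K == 1:
        return 1.0  # a single possible sample, or a single possible value
    c_prev = 1.0  # C(1, n)
    # C(2, n) as a sum over the count h of the first value; 0.0 ** 0 is 1.0 in Python
    c_curr = sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )
    for k in range(1, K - 1):
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


if __name__ == "__main__":
    K, n = 3, 4
    print(nml_normalizer_brute_force(K, n))
    print(nml_normalizer_recurrence(K, n))
```

Both functions return the same value for small inputs, but the brute-force version visits K^n samples while the recurrence needs O(n + K) arithmetic operations. The NML code length of an observed sample is then its negative maximized log-likelihood plus the logarithm of this sum.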




Author information



Corresponding author

Correspondence to Petri Kontkanen.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Kontkanen, P., Wettig, H. & Myllymäki, P. NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks. J Bioinform Sys Biology 2007, 90947 (2008).



  • Statistical Method
  • Bayesian Network
  • System Biology
  • General Framework
  • Mathematical Formalization