Annals of Statistics

Convergence rates for density estimation with Bernstein polynomials

Subhashis Ghosal



Mixture models for density estimation provide a very useful setup for the Bayesian or the maximum likelihood approach. For a density on the unit interval, mixtures of beta densities form a flexible model. The class of Bernstein densities is a much smaller subclass of the beta mixtures defined by Bernstein polynomials, which can approximate any continuous density. A Bernstein polynomial prior is obtained by putting a prior distribution on the class of Bernstein densities. The posterior distribution of a Bernstein polynomial prior is consistent under very general conditions. In this article, we present some results on the rate of convergence of the posterior distribution. If the underlying distribution generating the data is itself a Bernstein density, then we show that the posterior distribution converges at the "nearly parametric rate" $(\log n)/\sqrt{n}$ for the Hellinger distance. If the true density is not of the Bernstein type, we show that the posterior converges at the rate $n^{-1/3}(\log n)^{5/6}$, provided that the true density is twice differentiable and bounded away from 0. Similar rates are also obtained for sieve maximum likelihood estimates. These rates are inferior to the pointwise convergence rate of a kernel-type estimator. We show that the Bayesian bootstrap method gives a proxy for the posterior distribution and has a convergence rate on par with that of the kernel estimator.
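To make the notion of a Bernstein density concrete, the sketch below evaluates a mixture of beta densities of order $k$ with weights $w_1, \dots, w_k$, using the usual parametrization $\sum_{j=1}^{k} w_j \,\mathrm{beta}(x; j, k-j+1)$. The abstract does not spell out this formula, so the parametrization, the function names (`beta_pdf`, `bernstein_density`, `bernstein_weights`) and the choice of weights $w_j = F(j/k) - F((j-1)/k)$ from a distribution function $F$ are assumptions made for illustration, not the article's own notation.

```python
import math


def beta_pdf(x, a, b):
    """Beta(a, b) density at x in [0, 1], for a, b >= 1."""
    # Normalizing constant Gamma(a + b) / (Gamma(a) * Gamma(b)), via log-gamma.
    norm = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def bernstein_density(x, weights):
    """Mixture sum_j w_j * beta(x; j, k - j + 1), where k = len(weights).

    Assumed parametrization; the weights should be nonnegative and sum to 1.
    """
    k = len(weights)
    return sum(w * beta_pdf(x, j, k - j + 1)
               for j, w in enumerate(weights, start=1))


def bernstein_weights(cdf, k):
    """Weights w_j = F(j/k) - F((j-1)/k) for a distribution function F on [0, 1]."""
    return [cdf(j / k) - cdf((j - 1) / k) for j in range(1, k + 1)]


if __name__ == "__main__":
    # Illustrative target: density f(x) = 2x on [0, 1], so F(x) = x^2.
    for k in (3, 10, 100):
        w = bernstein_weights(lambda t: t * t, k)
        print(k, bernstein_density(0.25, w))  # approaches f(0.25) = 0.5 as k grows
```

With equal weights $w_j = 1/k$ the mixture reduces to the uniform density, which is a quick sanity check on the parametrization. For the illustrative target $f(x) = 2x$, the approximation error at a fixed point shrinks roughly like $1/k$.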

Article information

Ann. Statist., Volume 29, Number 5 (2001), 1264-1280.

First available in Project Euclid: 8 February 2002


Primary: 62G07: Density estimation 62G20: Asymptotic properties

Keywords: Bayesian bootstrap; Bernstein polynomial; entropy; maximum likelihood estimate; mixture of beta; posterior distribution; rate of convergence; sieve; strong approximation


Ghosal, Subhashis. Convergence rates for density estimation with Bernstein polynomials. Ann. Statist. 29 (2001), no. 5, 1264--1280. doi:10.1214/aos/1013203453.


