Statistical Science

The Bayesian Analysis of Complex, High-Dimensional Models: Can It Be CODA?

Y. Ritov, P. J. Bickel, A. C. Gamst, and B. J. K. Kleijn

Abstract

We consider the Bayesian analysis of a few complex, high-dimensional models and show that intuitive priors, which are not tailored to the fine details of the model and the estimated parameters, produce estimators which perform poorly in situations in which good, simple frequentist estimators exist. The models we consider are stratified sampling, the partial linear model, linear and quadratic functionals of white noise, and estimation with stopping times. We present a strong version of Doob’s consistency theorem which demonstrates that the existence of a uniformly $\sqrt{n}$-consistent estimator ensures that the Bayes posterior is $\sqrt{n}$-consistent for values of the parameter in subsets of prior probability 1. We also demonstrate that it is, at least in principle, possible to construct Bayes priors giving both global and local minimax rates, using a suitable combination of loss functions. We argue that there is no contradiction in these apparently conflicting findings.

Article information

Source
Statist. Sci., Volume 29, Number 4 (2014), 619–639.

Dates
First available in Project Euclid: 15 January 2015

Permanent link to this document
https://projecteuclid.org/euclid.ss/1421330550

Digital Object Identifier
doi:10.1214/14-STS483

Mathematical Reviews number (MathSciNet)
MR3300362

Zentralblatt MATH identifier
1331.62162

Keywords
Foundations; CODA; Bayesian inference; white noise models; partial linear model; stopping time; functional estimation; semiparametrics

Citation

Ritov, Y.; Bickel, P. J.; Gamst, A. C.; Kleijn, B. J. K. The Bayesian Analysis of Complex, High-Dimensional Models: Can It Be CODA? Statist. Sci. 29 (2014), no. 4, 619–639. doi:10.1214/14-STS483. https://projecteuclid.org/euclid.ss/1421330550

