Bernoulli
Volume 22, Number 1 (2016), 615–651.

Asymptotic optimality of myopic information-based strategies for Bayesian adaptive estimation

Janne V. Kujala

Full-text: Open access


This paper presents a general asymptotic theory of sequential Bayesian estimation, giving results for the strongest mode of convergence, almost sure convergence. We show that, under certain smoothness conditions on the probability model, the greedy information gain maximization algorithm for adaptive Bayesian estimation is asymptotically optimal in the sense that the determinant of the posterior covariance in a certain neighborhood of the true parameter value is asymptotically minimal. Using this result, we also obtain an asymptotic expression for the posterior entropy, based on a novel definition of almost sure convergence on “most trials” (meaning that the convergence holds on a fraction of trials that converges to one). We then extend the results to a recently published framework that generalizes the usual adaptive estimation setting by allowing different trial placements to be associated with different, random costs of observation. For this setting, the author has proposed the heuristic of maximizing the expected information gain of a placement divided by its expected cost. We show that this myopic strategy satisfies an analogous asymptotic optimality result when the convergence of the posterior distribution is considered as a function of the total cost (as opposed to the number of observations).
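The two selection rules discussed in the abstract can be illustrated concretely. The sketch below, which is not taken from the paper, assumes a discrete parameter grid, a finite set of candidate placements, and a finite observation alphabet; the function names (`greedy_placement`, `expected_information_gain`) and the logistic example model are illustrative choices, not the paper's notation. The expected information gain of a placement is the mutual information between the next observation and the parameter under the current posterior; the cost-sensitive heuristic divides this gain by the placement's expected cost.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_information_gain(prior, likelihoods):
    """Mutual information I(Y; Theta) between the next observation and the parameter.

    prior       : (K,) current posterior over a discrete parameter grid
    likelihoods : (K, M) array, row k giving P(y = m | theta_k) for one placement
    """
    marginal = prior @ likelihoods  # (M,) predictive distribution P(y)
    cond_entropies = np.array([entropy(row) for row in likelihoods])
    # I(Y; Theta) = H(Y) - E_theta[ H(Y | theta) ]
    return entropy(marginal) - prior @ cond_entropies

def greedy_placement(prior, likelihoods_by_placement, expected_costs=None):
    """Index of the placement maximizing expected information gain.

    If expected_costs is given, maximizes gain per unit expected cost instead
    (the myopic cost-sensitive heuristic described in the abstract).
    """
    gains = np.array([expected_information_gain(prior, L)
                      for L in likelihoods_by_placement])
    if expected_costs is not None:
        gains = gains / np.asarray(expected_costs, dtype=float)
    return int(np.argmax(gains))

# Illustrative model: binary responses with P(success | theta, x) = sigmoid(x - theta).
thetas = np.linspace(-1.0, 1.0, 5)
prior = np.full(5, 0.2)  # uniform prior over the grid
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
placements = [-2.0, 0.0, 2.0]
liks = [np.stack([1.0 - sigmoid(x - thetas), sigmoid(x - thetas)], axis=1)
        for x in placements]
best = greedy_placement(prior, liks)  # picks the centrally placed stimulus
```

In the cost-sensitive variant, a large expected cost can make a less informative but cheaper placement preferable, e.g. `greedy_placement(prior, liks, expected_costs=[1.0, 10.0, 1.0])` selects a flanking placement instead of the central one.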

Article information

Bernoulli, Volume 22, Number 1 (2016), 615-651.

Received: March 2012
Revised: May 2014
First available in Project Euclid: 30 September 2015

Keywords: active data selection; active learning; asymptotic optimality; Bayesian adaptive estimation; cost of observation; D-optimality; decision theory; differential entropy; sequential estimation


Citation

Kujala, Janne V. Asymptotic optimality of myopic information-based strategies for Bayesian adaptive estimation. Bernoulli 22 (2016), no. 1, 615–651. doi:10.3150/14-BEJ670.



References

  • [1] Azuma, K. (1967). Weighted sums of certain dependent random variables. Tôhoku Math. J. (2) 19 357–367.
  • [2] Chow, Y.S. (1967). On a strong law of large numbers for martingales. Ann. Math. Statist. 38 610.
  • [3] Cover, T.M. and Thomas, J.A. (2006). Elements of Information Theory, 2nd ed. Hoboken, NJ: Wiley.
  • [4] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58 13–30.
  • [5] Kujala, J.V. (2010). Obtaining the best value for money in adaptive sequential estimation. J. Math. Psych. 54 475–480.
  • [6] Kujala, J.V. (2012). Bayesian adaptive estimation: A theoretical review. In Descriptive and Normative Approaches to Human Behavior (E.N. Dzhafarov and L. Perry, eds.). Adv. Ser. Math. Psychol. 3 123–159. Hackensack, NJ: World Sci. Publ.
  • [7] Kujala, J.V. and Lukka, T.J. (2006). Bayesian adaptive estimation: The next dimension. J. Math. Psych. 50 369–389.
  • [8] Lindley, D.V. (1956). On a measure of the information provided by an experiment. Ann. Math. Statist. 27 986–1005.
  • [9] MacKay, D.J.C. (1992). Information-based objective functions for active data selection. Neural Comput. 4 590–604.
  • [10] Paninski, L. (2005). Asymptotic theory of information-theoretic experimental design. Neural Comput. 17 1480–1507.
  • [11] Schervish, M.J. (1995). Theory of Statistics. Springer Series in Statistics. New York: Springer.
  • [12] Shiryaev, A.N. (1996). Probability, 2nd ed. Graduate Texts in Mathematics 95. New York: Springer.
  • [13] van der Vaart, A.W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge: Cambridge Univ. Press.