The Annals of Applied Probability

The sample size required in importance sampling

Sourav Chatterjee and Persi Diaconis



The goal of importance sampling is to estimate the expected value of a given function with respect to a probability measure $\nu$ using a random sample of size $n$ drawn from a different probability measure $\mu$. If the two measures $\mu$ and $\nu$ are nearly singular with respect to each other, which is often the case in practice, the sample size required for accurate estimation is large. In this article, it is shown that in a fairly general setting, a sample of size approximately $\exp(D(\nu\parallel\mu))$ is necessary and sufficient for accurate estimation by importance sampling, where $D(\nu\parallel\mu)$ is the Kullback–Leibler divergence of $\mu$ from $\nu$. In particular, the required sample size exhibits a kind of cut-off in the logarithmic scale. The theory is applied to obtain a general formula for the sample size required in importance sampling for one-parameter exponential families (Gibbs measures).
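As a rough illustration of the threshold described in the abstract, the sketch below uses a toy pair of measures that is not taken from the article: the sampling measure $\mu = N(0,1)$ and the target $\nu = N(m,1)$. For this pair the density ratio is $d\nu/d\mu(x) = \exp(mx - m^2/2)$ and $D(\nu\parallel\mu) = m^2/2$. The function names (`kl_divergence`, `importance_estimate`), the test function $f \equiv 1$ and the chosen sample sizes are assumptions made for the example only.

```python
import math
import numpy as np


def kl_divergence(m: float) -> float:
    """D(nu || mu) for the assumed toy pair nu = N(m, 1), mu = N(0, 1)."""
    return 0.5 * m * m


def importance_estimate(f, m: float, n: int, rng: np.random.Generator) -> float:
    """Estimate E_nu[f] from n draws of mu = N(0, 1), weighted by d(nu)/d(mu)."""
    x = rng.standard_normal(n)              # sample from mu
    weights = np.exp(m * x - 0.5 * m * m)   # density ratio d(nu)/d(mu) at x
    return float(np.mean(f(x) * weights))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = 3.0
    d = kl_divergence(m)                    # 4.5, so exp(D) is about 90
    # Sample sizes below, near, and above exp(D) on the logarithmic scale.
    for shift in (-3.0, 0.0, 6.0):
        n = max(1, round(math.exp(d + shift)))
        est = importance_estimate(np.ones_like, m, n, rng)
        # The true value of E_nu[1] is exactly 1.
        print(f"n = {n:6d}   estimate of E_nu[1] = {est:.3f}")
```

With $f \equiv 1$ the true value is 1, so the printed estimates show directly how far the weighted average is from its target at each sample size. In this toy setting one would expect estimates well below $\exp(D(\nu\parallel\mu))$ samples to be unreliable and estimates well above it to settle near 1, although any single run is random.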

Article information

Ann. Appl. Probab., Volume 28, Number 2 (2018), 1099-1135.

Received: November 2015
Revised: June 2017
First available in Project Euclid: 11 April 2018

Primary: 65C05: Monte Carlo methods; 65C60: Computational problems in statistics; 60F05: Central limit and other weak theorems; 82B80: Numerical methods (Monte Carlo, series resummation, etc.) [See also 65-XX, 81T80]

Keywords: importance sampling; Monte Carlo methods; Gibbs measure; phase transition


Chatterjee, Sourav; Diaconis, Persi. The sample size required in importance sampling. Ann. Appl. Probab. 28 (2018), no. 2, 1099–1135. doi:10.1214/17-AAP1326.


