Electronic Journal of Statistics

Importance sampling and its optimality for stochastic simulation models

Yen-Chi Chen and Youngjun Choe



We consider the problem of estimating an expected outcome from a stochastic simulation model. Our goal is to develop a theoretical framework on importance sampling for such estimation. By investigating the variance of an importance sampling estimator, we propose a two-stage procedure that involves a regression stage and a sampling stage to construct the final estimator. We introduce a parametric and a nonparametric regression estimator in the first stage and study how the allocation between the two stages affects the performance of the final estimator. We analyze the variance reduction rates and derive oracle properties of both methods. We evaluate the empirical performances of the methods using two numerical examples and a case study on wind turbine reliability evaluation.
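The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes a proposal of the form q(x) ∝ p(x)·sqrt(E[V² | X = x]), which is a commonly stated variance-minimizing choice for stochastic simulators and is taken here as an assumption rather than quoted from the paper. The simulator `simulate`, the N(0, 1) input density, the uniform pilot design, the Nadaraya-Watson smoother, the grid-based proposal, and every parameter name (`frac_pilot`, `bandwidth`, `n_cells`, and so on) are hypothetical choices made only for illustration.

```python
import numpy as np
from math import erf, sqrt


def simulate(x, rng):
    """Hypothetical stochastic simulator: the output is random even for a fixed input x."""
    return (x + 0.5 * rng.standard_normal(x.shape) > 3.0).astype(float)


def normal_pdf(x):
    """Assumed input density p(x): standard normal."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


def two_stage_is(n_total, frac_pilot=0.3, lo=-6.0, hi=6.0,
                 n_cells=240, bandwidth=0.3, seed=0):
    """Estimate E[V] with a regression stage followed by a sampling stage."""
    rng = np.random.default_rng(seed)
    m = int(frac_pilot * n_total)   # simulator runs spent on the regression stage
    n = n_total - m                 # simulator runs spent on the sampling stage

    # Stage 1 (regression): pilot runs at uniformly drawn inputs, then a
    # Nadaraya-Watson estimate of r(x) = E[V^2 | X = x] at the grid cell centers.
    x_pilot = rng.uniform(lo, hi, size=m)
    v_pilot = simulate(x_pilot, rng)
    edges = np.linspace(lo, hi, n_cells + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    k = np.exp(-0.5 * ((centers[:, None] - x_pilot[None, :]) / bandwidth) ** 2)
    r_hat = (k @ v_pilot**2) / np.maximum(k.sum(axis=1), 1e-300)

    # Piecewise-constant proposal on the grid, q proportional to p * sqrt(r_hat).
    # The small floor keeps q > 0 in every cell; the mass of p outside [lo, hi]
    # is ignored, which is a negligible approximation for these defaults.
    q_cell = normal_pdf(centers) * np.sqrt(r_hat + 1e-6)
    q_cell /= q_cell.sum()

    # Stage 2 (sampling): draw inputs from q, run the simulator, reweight by p/q.
    idx = rng.choice(n_cells, size=n, p=q_cell)
    x = edges[idx] + width * rng.uniform(size=n)
    q_density = q_cell[idx] / width
    terms = normal_pdf(x) / q_density * simulate(x, rng)
    return terms.mean(), terms.std(ddof=1) / np.sqrt(n)


estimate, std_err = two_stage_is(20_000)
# For this toy simulator X + noise ~ N(0, 1.25), so E[V] has a closed form.
exact = 0.5 * (1.0 - erf(3.0 / sqrt(1.25) / sqrt(2.0)))
print(f"estimate={estimate:.5f}  std_err={std_err:.5f}  exact={exact:.5f}")
```

In this sketch `frac_pilot` plays the role of the allocation between the two stages: a larger pilot share improves the fitted regression but leaves fewer runs for the final estimator. Only the stage-two runs enter the estimate here, which is a simplification of this sketch.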

Article information

Electron. J. Statist., Volume 13, Number 2 (2019), 3386-3423.

Received: October 2018
First available in Project Euclid: 25 September 2019

Primary: 62G20: Asymptotic properties
Secondary: 62G86: Nonparametric inference and fuzziness
62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Keywords: Nonparametric estimation; stochastic simulation model; oracle property; variance reduction; Monte Carlo

Creative Commons Attribution 4.0 International License.


Chen, Yen-Chi; Choe, Youngjun. Importance sampling and its optimality for stochastic simulation models. Electron. J. Statist. 13 (2019), no. 2, 3386--3423. doi:10.1214/19-EJS1604. https://projecteuclid.org/euclid.ejs/1569377057


