The Annals of Applied Statistics

Efficient emulators of computer experiments using compactly supported correlation functions, with an application to cosmology

Cari G. Kaufman, Derek Bingham, Salman Habib, Katrin Heitmann, and Joshua A. Frieman

Full-text: Open access


Statistical emulators of computer simulators have proven to be useful in a variety of applications. The widely adopted model for emulator building, using a Gaussian process model with strictly positive correlation function, is computationally intractable when the number of simulator evaluations is large. We propose a new model that uses a combination of low-order regression terms and compactly supported correlation functions to recreate the desired predictive behavior of the emulator at a fraction of the computational cost. Following the usual approach of taking the correlation to be a product of correlations in each input dimension, we show how to impose restrictions on the ranges of the correlations, giving sparsity, while also allowing the ranges to trade off against one another, thereby giving good predictive performance. We illustrate the method using data from a computer simulator of photometric redshift with 20,000 simulator evaluations and 80,000 predictions.
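The product construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of the Bohman function as the one-dimensional compactly supported correlation, the sum-of-ranges budget used as the restriction, the random design points, and all function names are assumptions made for illustration only. The sketch shows how a correlation that is exactly zero beyond its range in each dimension leaves most entries of the correlation matrix at zero, so that only a small fraction needs to be stored.

```python
import math
import random


def bohman(t, tau):
    """Bohman correlation in one dimension (assumed choice for this sketch).

    Equals 1 at distance t = 0 and is exactly 0 once t reaches the range tau.
    """
    if t >= tau:
        return 0.0
    r = t / tau
    return (1.0 - r) * math.cos(math.pi * r) + math.sin(math.pi * r) / math.pi


def product_correlation(x, y, taus):
    """Product over input dimensions of one-dimensional compact correlations."""
    value = 1.0
    for xk, yk, tau in zip(x, y, taus):
        value *= bohman(abs(xk - yk), tau)
        if value == 0.0:
            # One dimension out of range makes the whole product zero.
            return 0.0
    return value


def sparse_correlation(points, taus):
    """Return only the nonzero upper-triangular entries as {(i, j): value}."""
    entries = {}
    for i, x in enumerate(points):
        for j in range(i, len(points)):
            r = product_correlation(x, points[j], taus)
            if r != 0.0:
                entries[(i, j)] = r
    return entries


random.seed(0)
dim, n = 4, 200
# Hypothetical design: uniform random points in the unit hypercube.
points = [[random.random() for _ in range(dim)] for _ in range(n)]

# Assumed restriction: the ranges share a fixed budget, so a long range in one
# input must be paid for with shorter ranges in the others.
budget = 1.2
taus = [0.6, 0.3, 0.2, 0.1]
assert sum(taus) <= budget + 1e-12

entries = sparse_correlation(points, taus)
# Fraction of nonzero entries in the full symmetric n-by-n matrix.
density = (2 * len(entries) - n) / n ** 2
print(f"nonzero fraction of the correlation matrix: {density:.4f}")
```

In practice the nonzero entries would be handed to a sparse Cholesky routine rather than kept in a dictionary; the dictionary is used here only to keep the sketch free of dependencies.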

Article information

Ann. Appl. Stat., Volume 5, Number 4 (2011), 2470-2492.

First available in Project Euclid: 20 December 2011


Keywords: Emulators; Gaussian processes; computer experiments; photometric redshift


Kaufman, Cari G.; Bingham, Derek; Habib, Salman; Heitmann, Katrin; Frieman, Joshua A. Efficient emulators of computer experiments using compactly supported correlation functions, with an application to cosmology. Ann. Appl. Stat. 5 (2011), no. 4, 2470–2492. doi:10.1214/11-AOAS489.


