The Annals of Statistics
- Ann. Statist.
- Volume 46, Number 6A (2018), 2562-2592.
This paper introduces a new way to compact a continuous probability distribution $F$ into a set of representative points called support points. These points are obtained by minimizing the energy distance, a statistical potential measure initially proposed by Székely and Rizzo [InterStat 5 (2004) 1–6] for testing goodness-of-fit. The energy distance has two appealing features. First, its distance-based structure allows us to exploit the duality between powers of the Euclidean distance and its Fourier transform for theoretical analysis. Using this duality, we show that support points converge in distribution to $F$ and enjoy a better error rate than Monte Carlo for integrating a large class of functions. Second, the minimization of the energy distance can be formulated as a difference-of-convex program, which we manipulate using two algorithms to efficiently generate representative point sets. In simulation studies, support points provide better integration performance than both Monte Carlo and a specific quasi-Monte Carlo method. Two important applications of support points are then highlighted: (a) quantifying the propagation of uncertainty in expensive simulations and (b) optimally compacting Markov chain Monte Carlo (MCMC) samples in Bayesian computation.
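To make the objective concrete, the following is a minimal sketch of how the energy distance between a candidate point set and $F$ could be estimated when $F$ is only available through a large sample. It uses the standard sample form of the energy distance, $\frac{2}{nN}\sum_{i,m}\|x_i - y_m\| - \frac{1}{n^2}\sum_{i,j}\|x_i - x_j\| - \frac{1}{N^2}\sum_{m,k}\|y_m - y_k\|$. The function names `energy_criterion` and `energy_distance` are hypothetical, and the sketch only evaluates the criterion; it is not the paper's difference-of-convex algorithms.

```python
import math
import random


def energy_criterion(points, sample):
    """Part of the energy distance that depends on the candidate points.

    points: list of n coordinate tuples (the candidate representative points).
    sample: list of N coordinate tuples standing in for draws from F (assumption).
    """
    n, big_n = len(points), len(sample)
    # Cross term: average distance between candidate points and the sample.
    cross = sum(math.dist(x, y) for x in points for y in sample)
    # Within term: average pairwise distance among the candidate points.
    within = sum(math.dist(a, b) for a in points for b in points)
    return 2.0 * cross / (n * big_n) - within / (n * n)


def energy_distance(points, sample):
    """Energy distance between the empirical measures of points and sample."""
    big_n = len(sample)
    # This term does not depend on the points, so a minimizer can ignore it.
    between = sum(math.dist(a, b) for a in sample for b in sample) / (big_n * big_n)
    return energy_criterion(points, sample) - between


# Illustrative use: score a naive 20-point subsample of a 2-D Gaussian sample.
random.seed(0)
sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
candidate = sample[:20]
print(energy_distance(candidate, sample))
```

A point set with a smaller value of `energy_criterion` is closer to the sample in energy distance; support points are the minimizers of this quantity with respect to $F$ itself.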
Received: August 2016
Revised: August 2017
First available in Project Euclid: 7 September 2018
Primary: 62E17: Approximations to distributions (nonasymptotic)
Mak, Simon; Joseph, V. Roshan. Support points. Ann. Statist. 46 (2018), no. 6A, 2562-2592. doi:10.1214/17-AOS1629. https://projecteuclid.org/euclid.aos/1536307226
- Supplement A: Additional proofs and results. We provide in this supplement further details on technical results and simulation studies.