The Annals of Statistics
- Ann. Statist.
- Volume 23, Number 6 (1995), 2241-2263.
On bandwidth choice for density estimation with dependent data
We address the empirical bandwidth choice problem in cases where the range of dependence may be virtually arbitrarily long. Assuming that the observed data derive from an unknown function of a Gaussian process, it is argued that, unlike more traditional contexts of statistical inference, in density estimation there is no clear role for the classical distinction between short- and long-range dependence. Indeed, the "boundaries" that separate different modes of behaviour for optimal bandwidths and mean squared errors are determined more by kernel order than by traditional notions of strength of dependence, for example, by whether or not the sum of the covariances converges. We provide surprising evidence that, even for some strongly dependent data sequences, the asymptotically optimal bandwidth for independent data is a good choice. A plug-in empirical bandwidth selector based on this observation is suggested. We determine the properties of this choice for a wide range of different strengths of dependence. Properties of cross-validation are also addressed.
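The observation that the independent-data optimal bandwidth can remain a good choice under dependence can be illustrated with a minimal sketch. It is not the authors' plug-in selector. It assumes a Gaussian kernel and substitutes the familiar normal-reference rule, h = (4/3)^(1/5) * sigma * n^(-1/5), for the asymptotically optimal independent-data bandwidth. The test sequence is a Gaussian AR(1) process, which is only short-range dependent and is used purely for convenience. The function names and the parameters `phi` and `seed` are hypothetical.

```python
import math
import random


def normal_reference_bandwidth(data):
    """Normal-reference bandwidth for a Gaussian kernel.

    Assumption: this rule stands in for the asymptotically optimal
    bandwidth for independent data; it is not the paper's selector.
    """
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)  # sample variance
    return (4.0 / 3.0) ** 0.2 * math.sqrt(var) * n ** (-0.2)


def gaussian_kde(data, h):
    """Return the kernel density estimate x -> f_hat(x) with bandwidth h."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return f_hat


def simulate_ar1(n, phi, seed=0):
    """Simulate a stationary Gaussian AR(1) sequence with N(0, 1) marginals.

    Assumption: AR(1) is an illustrative dependent sequence only; it does
    not exhibit the long-range dependence studied in the paper.
    """
    rng = random.Random(seed)
    innovation_sd = math.sqrt(1.0 - phi ** 2)
    x = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    out = []
    for _ in range(n):
        out.append(x)
        x = phi * x + innovation_sd * rng.gauss(0.0, 1.0)
    return out


if __name__ == "__main__":
    sample = simulate_ar1(n=500, phi=0.8, seed=1)
    h = normal_reference_bandwidth(sample)
    f_hat = gaussian_kde(sample, h)
    true_density_at_zero = 1.0 / math.sqrt(2.0 * math.pi)
    print("bandwidth:", h)
    print("estimate at 0:", f_hat(0.0), "true value:", true_density_at_zero)
```

The bandwidth is computed exactly as it would be for independent observations; the dependence enters only through the simulated data. Comparing the estimate with the N(0, 1) density for several values of `phi` gives a rough feel for how much the dependence matters in this simple setting.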
First available in Project Euclid: 15 October 2002
Primary: 62G07: Density estimation
Secondary: 62M10: Time series, auto-correlation, regression, etc. [See also 91B84]
Bandwidth choice; cross-validation; density estimation; Gaussian process; integrated squared error; kernel methods; long-range dependence; mean integrated squared error; plug-in rule; short-range dependence; window width
Hall, Peter; Lahiri, Soumendra Nath; Truong, Young K. On bandwidth choice for density estimation with dependent data. Ann. Statist. 23 (1995), no. 6, 2241--2263. doi:10.1214/aos/1034713655. https://projecteuclid.org/euclid.aos/1034713655