Electronic Journal of Statistics

Quantifying the cost of simultaneous non-parametric approximation of several samples

P.L. Davies and A. Kovac

Full-text: Open access


We consider the standard non-parametric regression model with Gaussian errors, but where the data consist of different samples. The question to be answered is whether the samples can be adequately represented by the same regression function. To answer it, we define for each sample a universal, honest and non-asymptotic confidence region for the regression function. Any subset of the samples can be represented by the same function if and only if the intersection of the corresponding confidence regions is non-empty. If the empirical supports of the samples are disjoint, then the intersection of the confidence regions is always non-empty, and a negative answer can only be obtained by placing shape or quantitative smoothness conditions on the joint approximation, or by making additional assumptions about the support points. Alternatively, a simplest joint approximation function can be calculated, which gives a measure of the cost of the joint approximation, for example the number of extra peaks required.
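The intersection criterion can be illustrated with a minimal sketch. It does not use the confidence regions of the paper, which the abstract does not specify. Instead it assumes a much cruder stand-in: a pointwise band that constrains the function at each design point to lie within a fixed half-width of the observation. The names `pointwise_band`, `intersect` and `jointly_representable`, and the `half_width` parameter, are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Iterable, Sequence, Tuple

# A band maps each design point t to an interval (lower, upper) for f(t).
Band = Dict[float, Tuple[float, float]]


def pointwise_band(t: Sequence[float], y: Sequence[float], half_width: float) -> Band:
    """Hypothetical stand-in for a confidence region (NOT the paper's region).

    Requires y_i - half_width <= f(t_i) <= y_i + half_width at every
    design point. Assumes the design points within one sample are distinct.
    """
    return {ti: (yi - half_width, yi + half_width) for ti, yi in zip(t, y)}


def intersect(bands: Iterable[Band]) -> Band:
    """Intersect the interval constraints of several bands point by point."""
    merged: Band = {}
    for band in bands:
        for t, (lo, hi) in band.items():
            cur_lo, cur_hi = merged.get(t, (float("-inf"), float("inf")))
            merged[t] = (max(cur_lo, lo), min(cur_hi, hi))
    return merged


def jointly_representable(bands: Iterable[Band]) -> bool:
    """True if and only if the intersection of the bands is non-empty."""
    return all(lo <= hi for lo, hi in intersect(bands).values())


# Two samples on the same design points with similar values,
# one with very different values, and one with disjoint design points.
a = pointwise_band([0.0, 1.0], [0.0, 1.0], half_width=0.5)
b = pointwise_band([0.0, 1.0], [0.4, 1.2], half_width=0.5)
c = pointwise_band([0.0, 1.0], [2.0, 3.0], half_width=0.5)
d = pointwise_band([0.5, 1.5], [2.0, 3.0], half_width=0.5)
```

In this simplified setting, `a` and `b` overlap at every shared design point, whereas `a` and `c` do not. The samples `a` and `d` share no design points, so no constraint ever conflicts and the intersection is non-empty whatever the observed values are. This mirrors the remark on disjoint empirical supports: without shape or smoothness conditions on the joint function, the criterion cannot return a negative answer.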

Article information

Electron. J. Statist., Volume 3 (2009), 747-780.

First available in Project Euclid: 11 August 2009

Primary: 62G08: Nonparametric regression
Secondary: 62G15: Tolerance and confidence regions; 62P35: Applications to physics; 82D25: Crystals {For crystallographic group theory, see 20H15}

Keywords: Modality, non-parametric regression, penalization, regularization, total variation


Davies, P.L.; Kovac, A. Quantifying the cost of simultaneous non-parametric approximation of several samples. Electron. J. Statist. 3 (2009), 747-780. doi:10.1214/08-EJS298. https://projecteuclid.org/euclid.ejs/1249996007


