Electronic Journal of Statistics

Adaptive variable selection in nonparametric sparse additive models

Cristina Butucea and Natalia Stepanova

Abstract

We consider the problem of recovery of an unknown multivariate signal $f$ observed in a $d$-dimensional Gaussian white noise model of intensity $\varepsilon$. We assume that $f$ belongs to a class of smooth functions in $L_{2}([0,1]^{d})$ and has an additive sparse structure determined by the parameter $s$, the number of non-zero univariate components contributing to $f$. We are interested in the case when $d=d_{\varepsilon}\to \infty$ as $\varepsilon \to 0$ and the parameter $s$ stays “small” relative to $d$. Under these assumptions, the recovery problem at hand becomes that of determining which sparse additive components are non-zero.
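To fix ideas, here is a minimal Python sketch (ours, not the authors') of the sequence-space form of this model: in an orthonormal basis of $L_{2}[0,1]$, observing a sparse additive $f$ in white noise of intensity $\varepsilon$ amounts to observing noisy coefficients $y_{j,k}=\theta_{j,k}+\varepsilon \xi_{j,k}$, with $\xi_{j,k}$ i.i.d. standard Gaussian, for each coordinate $j$ and frequency $k$. All numerical choices below ($d$, $s$, $\varepsilon$, the smoothness, the coefficient decay) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, s, eps = 500, 10, 0.05   # dimension, sparsity, noise intensity (illustrative)
K = 200                     # number of basis coefficients retained per component
sigma = 2.0                 # assumed smoothness of each univariate component

# Sparse additive structure: only s of the d univariate components are non-zero.
support = rng.choice(d, size=s, replace=False)
k = np.arange(1, K + 1)
theta = np.zeros((d, K))
theta[support] = rng.standard_normal((s, K)) * k[None, :] ** (-sigma - 0.5)

# Sequence-space observations: y[j, k] = theta[j, k] + eps * N(0, 1) noise.
y = theta + eps * rng.standard_normal((d, K))
```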

Attempting to reconstruct most, but not all, non-zero components of $f$, we arrive at the problem of almost full variable selection in high-dimensional regression. For two different choices of a class of smooth functions, we establish conditions under which almost full variable selection is possible, and we provide a procedure that achieves this goal. Our procedure is the best possible (in the asymptotically minimax sense) for selecting most non-zero components of $f$; moreover, it is adaptive in the parameter $s$. In addition, we complement the findings of [17] by obtaining an adaptive exact selector for the class of infinitely smooth functions. Our theoretical results are illustrated with numerical experiments.
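For intuition only, the sketch below (continuing the simulation above) selects components by thresholding weighted chi-square-type statistics computed coordinate-wise, which is the general shape of such selectors; the weights and the Bonferroni-type cutoff used here are simplified stand-ins, not the sharp-optimal, $s$-adaptive choices derived in the paper.

```python
# Continues the simulation sketch above (uses y, eps, k, d, support).
z = y / eps                        # standardized coefficients: z[j] ~ N(theta[j]/eps, I)
w = 1.0 / k                        # illustrative weights down-weighting high frequencies
w /= np.sqrt(2.0 * np.sum(w**2))   # normalize so t_j below has variance 1 when theta[j] = 0

# Centered chi-square-type statistic per component; large when f_j is non-zero.
t = (w[None, :] * (z**2 - 1.0)).sum(axis=1)
selected = np.flatnonzero(t > np.sqrt(2.0 * np.log(d)))  # Bonferroni-type cutoff

print("true support:    ", np.sort(support))
print("selected support:", selected)
```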

Article information

Source
Electron. J. Statist., Volume 11, Number 1 (2017), 2321-2357.

Dates
Received: April 2016
First available in Project Euclid: 27 May 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1495850627

Digital Object Identifier
doi:10.1214/17-EJS1275

Mathematical Reviews number (MathSciNet)
MR3656494

Zentralblatt MATH identifier
1365.62133

Subjects
Primary: 62G08: Nonparametric regression
Secondary: 62G20: Asymptotic properties

Keywords
High-dimensional nonparametric regression; sparse additive signals; adaptive variable selection; exact and almost full selectors

Rights
Creative Commons Attribution 4.0 International License.

Citation

Butucea, Cristina; Stepanova, Natalia. Adaptive variable selection in nonparametric sparse additive models. Electron. J. Statist. 11 (2017), no. 1, 2321–2357. doi:10.1214/17-EJS1275. https://projecteuclid.org/euclid.ejs/1495850627


References

  • [1] Abramovich, F., De Feis, I. and Sapatinas, T. (2009). Optimal testing for additivity in multiple nonparametric regression. Annals of the Institute of Statistical Mathematics 61 (3) 691–714.
  • [2] Bernstein, S. N. (1946). Probability Theory. OGIZ, Moscow–Leningrad. In Russian.
  • [3] Butucea, C., Stepanova, N. A. and Tsybakov, A. B. (2017). Variable selection with Hamming loss. Annals of Statistics, to appear.
  • [4] Chouldechova, A. and Hastie, T. (2015). Generalized additive model selection. http://arxiv.org/abs/1506.03850.
  • [5] Comminges, L. and Dalalyan, A. S. (2012). Tight conditions for consistency of variable selection in the context of high dimensionality. Annals of Statistics 40 (5) 2667–2696.
  • [6] DeGroot, M. (1970). Optimal Statistical Decisions. McGraw-Hill Book Company, New York.
  • [7] Donoho, D. (2006). For most large underdetermined systems of linear equations the minimal $l^1$-norm solution is also the sparsest solution. Communications on Pure and Applied Mathematics 59 (7) 907–934.
  • [8] Ermakov, M. S. (1990). Minimax detection of a signal in a Gaussian white noise. Theory of Probability and Its Applications 35 (4) 667–679.
  • [9] Fan, J. and Lv, J. (2008). Sure independence screening for ultra-high dimensional feature space (with discussion). Journal of the Royal Statistical Society, Series B 70 849–911.
  • [10] Gayraud, G. and Ingster, Yu. I. (2012). Detection of sparse additive functions. Electronic Journal of Statistics 6 1409–1448.
  • [11] Genovese, C. R., Jin, J., Wasserman, L. and Yao, Z. (2012). A comparison of the lasso and marginal regression. Journal of Machine Learning Research 13 2107–2143.
  • [12] Golubev, Y. K. and Levit, B. Y. (1996). Asymptotically efficient estimation for analytic distributions. Mathematical Methods of Statistics 3 357–368.
  • [13] Huang, J., Horowitz, J. L. and Wei, F. (2010). Variable selection in nonparametric additive models. Annals of Statistics 38 2282–2313.
  • [14] Ingster, Yu. I. (1993). Asymptotically minimax hypothesis testing for nonparametric alternatives. I. Mathematical Methods of Statistics 2 (2) 85–114.
  • [15] Ingster, Yu. I. (1993). Asymptotically minimax hypothesis testing for nonparametric alternatives. II. Mathematical Methods of Statistics 2 (3) 171–189.
  • [16] Ingster, Yu. I. (1993). Asymptotically minimax hypothesis testing for nonparametric alternatives. III. Mathematical Methods of Statistics 2 (4) 249–268.
  • [17] Ingster, Yu. I. and Stepanova, N. A. (2014). Adaptive variable selection in nonparametric sparse regression. Journal of Mathematical Sciences 199 (2) 184–201.
  • [18] Ingster, Yu. I. and Suslina, I. A. (2003). Nonparametric Goodness-of-Fit Testing Under Gaussian Models. Lecture Notes in Statistics, Vol. 169, Springer-Verlag, New York.
  • [19] Ingster, Yu. I. and Suslina, I. A. (2005). On estimation and detection of smooth function of many variables. Mathematical Methods of Statistics 14 299–331.
  • [20] Ingster, Yu. I. and Suslina, I. A. (2015). Detection of a sparse variable function. Journal of Mathematical Sciences 206 (2) 181–196.
  • [21] Ji, P. and Jin, J. (2012). UPS delivers optimal phase diagram in high-dimensional variable selection. Annals of Statistics 40 (1) 73–103.
  • [22] Lepski, O. V. (1991). One problem of adaptive estimation in Gaussian white noise. Theory of Probability and Its Applications 35 (3) 454–466.
  • [23] Meier, L., van de Geer, S. and Bühlmann, P. (2009). High-dimensional additive modeling. Annals of Statistics 37 3779–3821.
  • [24] Raskutti, G., Wainwright, M. J. and Yu, B. (2012). Minimax-optimal rates for high-dimensional sparse additive models over kernel classes. Journal of Machine Learning Research 13 281–319.
  • [25] Ravikumar, P., Liu, H., Lafferty, J. and Wasserman, L. (2007). SpAM: sparse additive models. In: Advances in Neural Information Processing Systems, Vol. 20 (eds. J. C. Platt, D. Koller, Y. Singer and S. Roweis), pp. 1202–1208. Cambridge, MA: MIT Press.