Annals of Statistics

Minimax estimation of linear and quadratic functionals on sparsity classes

Olivier Collier, Laëtitia Comminges, and Alexandre B. Tsybakov


For the Gaussian sequence model, we obtain nonasymptotic minimax rates of estimation of the linear, quadratic and the $\ell_{2}$-norm functionals on classes of sparse vectors and construct optimal estimators that attain these rates. The main object of interest is the class $B_{0}(s)$ of $s$-sparse vectors $\theta=(\theta_{1},\dots,\theta_{d})$, for which we also provide completely adaptive estimators (independent of $s$ and of the noise variance $\sigma$) having logarithmically slower rates than the minimax ones. Furthermore, we obtain the minimax rates on the $\ell_{q}$-balls $B_{q}(r)=\{\theta\in\mathbb{R}^{d}:\|\theta\|_{q}\le r\}$ where $0<q\le2$, and $\|\theta\|_{q}=(\sum_{i=1}^{d}|\theta_{i}|^{q})^{1/q}$. This analysis shows that there are, in general, three zones in the rates of convergence that we call the sparse zone, the dense zone and the degenerate zone, while a fourth zone appears for estimation of the quadratic functional. We show that, as opposed to estimation of $\theta$, the correct logarithmic terms in the optimal rates for the sparse zone scale as $\log(d/s^{2})$ and not as $\log(d/s)$. For the class $B_{0}(s)$, the rates of estimation of the linear functional and of the $\ell_{2}$-norm have a simple elbow at $s=\sqrt{d}$ (boundary between the sparse and the dense zones) and exhibit similar performances, whereas the estimation of the quadratic functional $Q(\theta)$ reveals more complex effects: the minimax risk on $B_{0}(s)$ is infinite and the sparseness assumption needs to be combined with a bound on the $\ell_{2}$-norm. Finally, we apply our results on estimation of the $\ell_{2}$-norm to the problem of testing against sparse alternatives. In particular, we obtain a nonasymptotic analog of the Ingster–Donoho–Jin theory revealing some effects that were not captured by the previous asymptotic analysis.
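The elbow at $s=\sqrt{d}$ and the two competing logarithmic factors can be made concrete with a short numerical sketch. The Python code below is illustrative only and is not taken from the abstract: the function names are hypothetical, the linear functional is assumed to be the sum of the coordinates of $\theta$, and the hard-threshold level $\sigma\sqrt{2\log(1+d/s^{2})}$ is an assumed form chosen to match the $\log(d/s^{2})$ scaling described above.

```python
import math
import random


def in_sparse_zone(s, d):
    """True when s lies strictly below the elbow s = sqrt(d)."""
    return s * s < d


def log_factors(s, d):
    """Return the pair (log(d/s^2), log(d/s)) contrasted in the abstract."""
    return math.log(d / s**2), math.log(d / s)


def simulate(theta, sigma, seed=0):
    """Draw y_i = theta_i + sigma * xi_i with xi_i i.i.d. standard normal."""
    rng = random.Random(seed)
    return [t + sigma * rng.gauss(0.0, 1.0) for t in theta]


def linear_functional_estimate(y, s, sigma):
    """Estimate sum(theta) from observations y (assumed functional).

    Dense zone (s >= sqrt(d)): plain sum of the observations.
    Sparse zone (s < sqrt(d)): sum only the observations whose absolute
    value exceeds an assumed threshold sigma * sqrt(2 * log(1 + d / s^2)).
    """
    d = len(y)
    if not in_sparse_zone(s, d):
        return sum(y)
    threshold = sigma * math.sqrt(2.0 * math.log(1.0 + d / s**2))
    return sum(v for v in y if abs(v) > threshold)


if __name__ == "__main__":
    d, s, sigma = 10_000, 10, 1.0
    theta = [5.0] * s + [0.0] * (d - s)  # an s-sparse vector in B_0(s)
    y = simulate(theta, sigma, seed=1)
    print("sparse zone:", in_sparse_zone(s, d))
    print("log(d/s^2), log(d/s):", log_factors(s, d))
    print("thresholded estimate:", linear_functional_estimate(y, s, sigma))
    print("plain sum:", sum(y))
```

In the sparse zone the plain sum accumulates noise from all $d$ coordinates, whereas the thresholded sum discards most of the pure-noise coordinates; comparing the two printed estimates against `sum(theta)` gives a rough feel for why the zones behave differently.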

Article information

Ann. Statist., Volume 45, Number 3 (2017), 923-958.

Received: February 2015
Revised: October 2015
First available in Project Euclid: 13 June 2017

Primary: 62J05: Linear regression; 62G05: Estimation

Keywords: Nonasymptotic minimax estimation; linear functional; quadratic functional; sparsity; unknown noise variance; thresholding


Collier, Olivier; Comminges, Laëtitia; Tsybakov, Alexandre B. Minimax estimation of linear and quadratic functionals on sparsity classes. Ann. Statist. 45 (2017), no. 3, 923–958. doi:10.1214/15-AOS1432.

