## Bernoulli


### Minimax optimal estimation in partially linear additive models under high dimension

#### Abstract

In this paper, we derive minimax rates for estimating both parametric and nonparametric components in partially linear additive models with high-dimensional sparse vectors and smooth functional components. The minimax lower bound for the Euclidean components is the typical sparse estimation rate, which is independent of the nonparametric smoothness indices. However, the minimax lower bound for each component function exhibits an interplay between the dimensionality and sparsity of the parametric component and the smoothness of the relevant nonparametric component. Indeed, the minimax risk for smooth nonparametric estimation can be slowed down to the sparse estimation rate whenever the smoothness of the nonparametric component or the dimensionality of the parametric component is sufficiently large. In this setting, we demonstrate that penalized least squares estimators nearly achieve the minimax lower bounds.
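For orientation, the two benchmark rates the abstract alludes to can be sketched as follows. These are the standard rates from the sparse-regression and nonparametric-estimation literatures, not quoted from the paper (whose exact bounds may differ, e.g. in logarithmic factors); here $s$ denotes the sparsity of the $p$-dimensional parametric part and $\alpha$ the smoothness of a component function:

```latex
% Sparse parametric part: s-sparse \beta \in \mathbb{R}^p, squared risk
\mathbb{E}\,\|\hat{\beta}-\beta\|_2^2 \;\asymp\; \frac{s\log(p/s)}{n},
% \alpha-smooth nonparametric component f_j, squared L_2 risk
\mathbb{E}\,\|\hat{f}_j-f_j\|_2^2 \;\asymp\; n^{-2\alpha/(2\alpha+1)}.
% The interplay described in the abstract: the component-wise rate is
% essentially the larger of the two, so the sparse term dominates when
% \alpha or p is large:
\max\Bigl\{\, n^{-2\alpha/(2\alpha+1)},\; \tfrac{s\log(p/s)}{n} \Bigr\}.
```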

#### Article information

**Source**
Bernoulli, Volume 25, Number 2 (2019), 1289–1325.

**Dates**
Revised: January 2018
First available in Project Euclid: 6 March 2019

**Permanent link to this document**
https://projecteuclid.org/euclid.bj/1551862851

**Digital Object Identifier**
doi:10.3150/18-BEJ1021

**Mathematical Reviews number (MathSciNet)**
MR3920373

**Zentralblatt MATH identifier**
07049407

#### Citation

Yu, Zhuqing; Levine, Michael; Cheng, Guang. Minimax optimal estimation in partially linear additive models under high dimension. Bernoulli 25 (2019), no. 2, 1289–1325. doi:10.3150/18-BEJ1021. https://projecteuclid.org/euclid.bj/1551862851

#### References

• [1] Bickel, P.J., Klaassen, C.A.J., Ritov, Y. and Wellner, J.A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Series in the Mathematical Sciences. Baltimore, MD: Johns Hopkins Univ. Press.
• [2] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Series in Statistics. Heidelberg: Springer.
• [3] Cheng, G., Zhang, H.H. and Shang, Z. (2015). Sparse and efficient estimation for partial spline models with increasing dimension. Ann. Inst. Statist. Math. 67 93–127.
• [4] Gilbert, E.N. (1952). A comparison of signalling alphabets. Bell Syst. Tech. J. 31 504–522.
• [5] Härdle, W., Liang, H. and Gao, J. (2000). Partially Linear Models. Contributions to Statistics. Heidelberg: Physica-Verlag.
• [6] Horowitz, J., Klemelä, J. and Mammen, E. (2006). Optimal estimation in additive regression models. Bernoulli 12 271–298.
• [7] Koltchinskii, V. and Yuan, M. (2010). Sparsity in multiple kernel learning. Ann. Statist. 38 3660–3695.
• [8] Ma, C. and Huang, J. (2016). Asymptotic properties of lasso in high-dimensional partially linear models. Sci. China Math. 59 769–788.
• [9] Massart, P. (2007). Concentration Inequalities and Model Selection. Lecture Notes in Math. 1896. Berlin: Springer. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003, With a foreword by Jean Picard.
• [10] Müller, P. and van de Geer, S. (2015). The partial linear model in high dimensions. Scand. J. Stat. 42 580–608.
• [11] Nickl, R. and van de Geer, S. (2013). Confidence sets in sparse regression. Ann. Statist. 41 2852–2876.
• [12] Nussbaum, M. (1985). Spline smoothing in regression models and asymptotic efficiency in $L_{2}$. Ann. Statist. 13 984–997.
• [13] Pinsker, M.S. (1980). Optimal filtration of square-integrable signals in Gaussian noise. Probl. Inf. Transm. 16 120–133.
• [14] Raskutti, G., Wainwright, M.J. and Yu, B. (2011). Minimax rates of estimation for high-dimensional linear regression over $\ell_{q}$-balls. IEEE Trans. Inform. Theory 57 6976–6994.
• [15] Raskutti, G., Wainwright, M.J. and Yu, B. (2012). Minimax-optimal rates for sparse additive models over kernel classes via convex programming. J. Mach. Learn. Res. 13 389–427.
• [16] Stone, C.J. (1985). Additive regression and other nonparametric models. Ann. Statist. 13 689–705.
• [17] Suzuki, T. and Sugiyama, M. (2013). Fast learning rate of multiple kernel learning: Trade-off between sparsity and smoothness. Ann. Statist. 41 1381–1405.
• [18] Tsybakov, A.B. (2009). Introduction to Nonparametric Estimation. Springer Series in Statistics. New York: Springer.
• [19] van de Geer, S. (2014). On the uniform convergence of empirical norms and inner products, with application to causal inference. Electron. J. Stat. 8 543–574.
• [20] van de Geer, S. and Muro, A. (2015). Penalized least squares estimation in the additive model with different smoothness for the components. J. Statist. Plann. Inference 162 43–61.
• [21] Vershynin, R. (2012). Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing 210–268. Cambridge: Cambridge Univ. Press.
• [22] Verzelen, N. (2012). Minimax risks for sparse regressions: Ultra-high dimensional phenomenons. Electron. J. Stat. 6 38–90.
• [23] Xie, H. and Huang, J. (2009). SCAD-penalized regression in high-dimensional partially linear models. Ann. Statist. 37 673–696.
• [24] Ye, F. and Zhang, C.-H. (2010). Rate minimaxity of the Lasso and Dantzig selector for the $\ell_{q}$ loss in $\ell_{r}$ balls. J. Mach. Learn. Res. 11 3519–3540.
• [25] Yu, K., Mammen, E. and Park, B.U. (2011). Semi-parametric regression: Efficiency gains from modeling the nonparametric part. Bernoulli 17 736–748.
• [26] Yuan, M. and Zhou, D.-X. (2016). Minimax optimal rates of estimation in high dimensional additive models. Ann. Statist. 44 2564–2593.
• [27] Zhang, H.H., Cheng, G. and Liu, Y. (2011). Linear or nonlinear? Automatic structure discovery for partially linear models. J. Amer. Statist. Assoc. 106 1099–1112.
• [28] Zhu, Y. (2017). Nonasymptotic analysis of semiparametric regression models with high-dimensional parametric coefficients. Ann. Statist. 45 2274–2298.