## Electronic Journal of Statistics

### Screening-based Bregman divergence estimation with NP-dimensionality

#### Abstract

Feature screening via marginal screening ([5]; [7]) has attracted considerable attention in high-dimensional regression problems. However, the existing results are confined to the generalized linear model ($\mathrm{GLM}$) with the exponential family of distributions. This motivates us to explore whether screening procedures can be applied to more general models, for example without assuming either an explicit form of the distribution or a parametric relationship between the response and the covariates. In this paper, we extend the marginal screening procedure, using Bregman divergence (${\mathrm{BD}}$) as the loss function, to include not only the $\mathrm{GLM}$ but also the quasi-likelihood model. A sure screening property for the resulting procedure is established under this very general framework, assuming only certain moment conditions and tail properties, where the dimensionality $p_{n}$ is allowed to grow with the sample size $n$ as fast as $\log(p_{n})=O(n^{a})$ for some $a\in(0,1)$. Simulation and real data studies illustrate that a two-step procedure, which combines feature screening in the first step with penalized-${\mathrm{BD}}$ estimation in the second step, is practically applicable to identifying the set of relevant variables and achieving good estimation of model parameters, at a computational cost much lower than that of procedures without the screening step.
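To make the screening step concrete, the following is a minimal sketch of marginal screening, not the authors' implementation. It assumes the quadratic loss, which is the simplest member of the Bregman divergence class, so each marginal fit has a closed-form slope. The function name `marginal_screen`, the simulated data, and the retained model size $d=\lfloor n/\log n\rfloor$ (a common convention in the sure independence screening literature) are illustrative assumptions.

```python
import numpy as np


def marginal_screen(X, y, d):
    """Keep the d features with the largest absolute marginal slope.

    Assumption: quadratic loss stands in for a general Bregman divergence,
    so the marginal minimizer is the ordinary least-squares slope.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize columns so that slopes are comparable across features.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    # Closed-form marginal slope of y on each standardized feature.
    slopes = Xs.T @ yc / len(y)
    # Indices of the d largest slopes in absolute value, returned sorted.
    keep = np.argsort(-np.abs(slopes))[:d]
    return np.sort(keep), slopes


# Hypothetical simulated data: p much larger than n, three active features.
rng = np.random.default_rng(0)
n, p = 200, 2000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -3.0, 3.0]
y = X @ beta + rng.standard_normal(n)

d = int(n / np.log(n))
selected, slopes = marginal_screen(X, y, d)
```

In a two-step procedure of the kind described above, a penalized estimator would then be fitted using only the columns in `selected`, which is where the computational saving comes from.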

#### Article information

Source
Electron. J. Statist., Volume 10, Number 2 (2016), 2039-2065.

Dates
First available in Project Euclid: 18 July 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1468849970

Digital Object Identifier
doi:10.1214/16-EJS1157

Mathematical Reviews number (MathSciNet)
MR3522668

Zentralblatt MATH identifier
1345.62054

#### Citation

Zhang, Chunming; Guo, Xiao; Chai, Yi. Screening-based Bregman divergence estimation with NP-dimensionality. Electron. J. Statist. 10 (2016), no. 2, 2039–2065. doi:10.1214/16-EJS1157. https://projecteuclid.org/euclid.ejs/1468849970

#### References

• [1] Alon, U., Barkai, N., Notterman, D.A., Gish, K., Ybarra, S., Mack, D., and Levine, A.J. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA, 96, 6745–6750.
• [2] Brègman, L. M. (1967). A relaxation method of finding a common point of convex sets and its application to the solution of problems in convex programming. U.S.S.R. Comput. Math. and Math. Phys., 7, 620–631.
• [3] Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Statist., 35, 2313–2351.
• [4] Chang, J., Tang, C. Y. and Wu, Y. (2013). Marginal empirical likelihood and sure independence feature screening. Ann. Statist., 41, 2123–2148.
• [5] Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B, 70, 849–911.
• [6] Fan, J. (1997). Comment on “Wavelets in statistics: A review” by A. Antoniadis. J. Italian Statist. Soc., 6, 131–138.
• [7] Fan, J. and Song, R. (2010). Sure independence screening in generalized linear models with NP-dimensionality. Ann. Statist., 38, 3567–3604.
• [8] Frank, I. E. and Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics, 35, 109–148.
• [9] Friedman, J., Hastie, T., Höfling, H. and Tibshirani, R. (2007). Pathwise coordinate optimization. Ann. Appl. Statist., 1, 302–332.
• [10] Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33, 1–22.
• [11] Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Bloomfield, C.D. and Lander, E.S. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286, 531–536.
• [12] Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning. Springer.
• [13] Li, G., Peng, H., Zhang, J. and Zhu, L. (2012). Robust rank correlation based screening. Ann. Statist., 40, 1846–1877.
• [14] Li, R., Zhong, W. and Zhu, L. (2012). Feature screening via distance correlation learning. J. Amer. Statist. Assoc., 107, 1129–1139.
• [15] McCullagh, P. and Nelder, J. (1989). Generalized Linear Models. Chapman & Hall/CRC, Boca Raton.
• [16] McCullagh, P. (1983). Quasi-likelihood functions. Ann. Statist., 11, 59–67.
• [17] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B, 58, 267–288.
• [18] van de Geer, S. (2008). High-dimensional generalized linear models and the lasso. Ann. Statist., 36, 614–645.
• [19] Vapnik, V. (1996). The Nature of Statistical Learning Theory. Springer-Verlag, New York.
• [20] Wasserman, L. and Roeder, K. (2009). High-dimensional variable selection. Ann. Statist., 37, 2178–2201.
• [21] Zhang, C. H. (2010). Nearly unbiased variable selection under minimax concave penalty. Ann. Statist., 38, 894–942.
• [22] Zhang, C. M., Jiang, Y. and Shang, Z. (2009). New aspects of Bregman divergence in regression and classification with parametric and nonparametric estimation. Canad. J. Statist., 37, 119–139.
• [23] Zhang, C. M., Jiang, Y. and Chai, Y. (2010). Penalized Bregman divergence for large-dimensional regression and classification. Biometrika, 97, 551–566.
• [24] Zhu, L., Li, L., Li, R. and Zhu, L. (2011). Model-free feature screening for ultrahigh-dimensional data. J. Amer. Statist. Assoc., 106, 1464–1475.