Electronic Journal of Statistics

Maximum likelihood estimation in logistic regression models with a diverging number of covariates

Hua Liang and Pang Du

Full-text: Open access


Binary data with high-dimensional covariates have become increasingly common in many disciplines. In this paper we consider maximum likelihood estimation for logistic regression models with a diverging number of covariates. Under mild conditions we establish the asymptotic normality of the maximum likelihood estimate when the number of covariates $p$ goes to infinity with the sample size $n$ at the rate $p=o(n)$. This substantially improves on existing results, which only allow $p$ to grow at the order $o(n^{\alpha})$ with $\alpha\in[1/5,1/2]$ [12, 14]. A major innovation in our proof is the use of the injective function.
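
For orientation, the standard logistic regression setup that the abstract refers to can be sketched as follows. The notation ($y_i$, $x_i$, $\beta$, $\ell_n$, $\hat\beta_n$, $p_n$) is assumed here for illustration and may differ from the notation used in the paper itself.

$$
P(y_i = 1 \mid x_i) = \frac{\exp(x_i^{\top}\beta)}{1+\exp(x_i^{\top}\beta)}, \qquad i = 1,\dots,n, \quad \beta \in \mathbb{R}^{p},
$$

$$
\ell_n(\beta) = \sum_{i=1}^{n} \left\{ y_i\, x_i^{\top}\beta - \log\bigl(1+\exp(x_i^{\top}\beta)\bigr) \right\}, \qquad \hat\beta_n = \arg\max_{\beta \in \mathbb{R}^{p}} \ell_n(\beta).
$$

In the diverging-dimension regime the number of covariates $p = p_n$ grows with $n$; the condition $p = o(n)$ stated in the abstract means $p_n/n \to 0$ as $n \to \infty$.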

Article information

Electron. J. Statist., Volume 6 (2012), 1838-1846.

First available in Project Euclid: 4 October 2012

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1349355604

Digital Object Identifier
doi:10.1214/12-EJS731

Primary: 62F12: Asymptotic properties of estimators
Secondary: 62J12: Generalized linear models

Keywords: High dimensional; asymptotic normality; injective function; “large $n$, diverging $p$”; logistic regression


Liang, Hua; Du, Pang. Maximum likelihood estimation in logistic regression models with a diverging number of covariates. Electron. J. Statist. 6 (2012), 1838–1846. doi:10.1214/12-EJS731. https://projecteuclid.org/euclid.ejs/1349355604


  • [1] K. Chen, I. Hu, and Z. Ying. Strong consistency of maximum quasi-likelihood estimators in generalized linear models with fixed and adaptive designs. Ann. Statist., 27:1155–1163, 1999.
  • [2] L. Fahrmeir and H. Kaufmann. Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. Ann. Statist., 13:342–368, 1985.
  • [3] J. Fan and J. Lv. Non-Concave Penalized Likelihood with NP-Dimensionality. IEEE Trans. on Inform. Theory, 57:5467–5484, 2011.
  • [4] J. Fan and R. Song. Sure independence screening in generalized linear models with NP-dimensionality. Ann. Statist., 38:3567–3604, 2010.
  • [5] H. Heuser. Lehrbuch der Analysis. Teil 2. B. G. Teubner, Stuttgart, 1981.
  • [6] D. W. Hosmer and S. Lemeshow. Applied Logistic Regression. John Wiley & Sons, New York, 1989.
  • [7] J. Huang, S. Ma, and C.-H. Zhang. The iterated LASSO for high-dimensional logistic regression. Technical report, 2009.
  • [8] T. L. Lai and C. Z. Wei. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. Ann. Statist., 10:154–166, 1982.
  • [9] J. K. Lindsey. Applying Generalized Linear Models. Springer Texts in Statistics. Springer, 1997.
  • [10] P. McCullagh and J. A. Nelder. Generalized Linear Models, volume 37 of Monographs on Statistics and Applied Probability. Chapman and Hall, London, 2nd edition, 1989.
  • [11] J. A. Nelder and R. W. M. Wedderburn. Generalized linear models. J. R. Stat. Soc. Ser. A Statist. Soc., 135:370–384, 1972.
  • [12] S. Portnoy. Asymptotic behavior of likelihood methods for exponential families when the number of parameters tends to infinity. Ann. Statist., 16:356–366, 1988.
  • [13] S. A. van de Geer. High-dimensional generalized linear models and the LASSO. Ann. Statist., 36:614–645, 2008.
  • [14] L. Wang. GEE analysis of clustered binary data with diverging number of covariates. Ann. Statist., 39:389–417, 2011.
  • [15] C. Yin, L. Zhao, and C. Wei. Asymptotic normality and strong consistency of maximum quasi-likelihood estimates in generalized linear models. Sci. China Ser. A, 49:145–157, 2006.