The Annals of Statistics

Nonconcave penalized composite conditional likelihood estimation of sparse Ising models

Lingzhou Xue, Hui Zou, and Tianxi Cai


The Ising model is a useful tool for studying complex interactions within a system. The estimation of such a model, however, is rather challenging, especially in the presence of high-dimensional parameters. In this work, we propose efficient procedures for learning a sparse Ising model based on a penalized composite conditional likelihood with nonconcave penalties. Nonconcave penalized likelihood estimation has received a lot of attention in recent years. However, such an approach is computationally prohibitive under high-dimensional Ising models. To overcome such difficulties, we extend the methodology and theory of nonconcave penalized likelihood to penalized composite conditional likelihood estimation. The proposed method can be efficiently implemented by taking advantage of coordinate-ascent and minorization–maximization principles. Asymptotic oracle properties of the proposed method are established with NP-dimensionality. Optimality of the computed local solution is discussed. We demonstrate its finite sample performance via simulation studies and further illustrate our proposal by studying the Human Immunodeficiency Virus type 1 protease structure based on data from the Stanford HIV drug resistance database. Our statistical learning results match the known biological findings very well, although no prior biological information is used in the data analysis procedure.
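The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: spins are coded as ±1, the interaction matrix `beta` is symmetric with main effects on its diagonal, the nonconcave penalty is taken to be SCAD (Fan and Li, 2001) with the conventional `a = 3.7`, and only off-diagonal interactions are penalized. The function names are hypothetical. Under this coding, each conditional probability is a logistic function, P(x_j | x_{-j}) = 1 / (1 + exp(-2 x_j η_j)) with η_j = β_jj + Σ_{k≠j} β_jk x_k, and the composite conditional likelihood is the product of these conditionals over all nodes.

```python
import math


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty value at |t| (assumed penalty; a = 3.7 is the usual default)."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def composite_cond_loglik(beta, data):
    """Average over samples of sum_j log P(x_j | x_{-j}) for +/-1 spins.

    beta: K x K symmetric list of lists; beta[j][j] is the main effect of node j.
    data: list of samples, each a length-K list with entries in {-1, +1}.
    """
    K = len(beta)
    total = 0.0
    for x in data:
        for j in range(K):
            eta = beta[j][j] + sum(beta[j][k] * x[k] for k in range(K) if k != j)
            z = 2 * x[j] * eta
            # log P = -log(1 + exp(-z)), evaluated in a numerically stable form
            total -= max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total / len(data)


def penalized_objective(beta, data, lam, a=3.7):
    """Composite conditional log-likelihood minus SCAD on the interactions."""
    K = len(beta)
    penalty = sum(
        scad_penalty(beta[j][k], lam, a)
        for j in range(K)
        for k in range(j + 1, K)
    )
    return composite_cond_loglik(beta, data) - penalty
```

Because the likelihood part is a sum of logistic-regression log-likelihoods, each coordinate update only needs to recompute the terms that involve the affected pair of nodes, which is what makes coordinate-ascent and minorization–maximization schemes attractive for this kind of objective.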

Article information

Ann. Statist., Volume 40, Number 3 (2012), 1403-1429.

First available in Project Euclid: 10 August 2012

Primary: 62G20: Asymptotic properties; 62P10: Applications to biology and medical sciences
Secondary: 90-08: Computational methods

Keywords: Composite likelihood; coordinatewise optimization; Ising model; minorization–maximization principle; NP-dimension asymptotic theory; HIV drug resistance database


Xue, Lingzhou; Zou, Hui; Cai, Tianxi. Nonconcave penalized composite conditional likelihood estimation of sparse Ising models. Ann. Statist. 40 (2012), no. 3, 1403–1429. doi:10.1214/12-AOS1017.


Supplemental materials

  • Supplementary material: Supplementary materials for “Non-concave penalized composite likelihood estimation of sparse Ising models”. In this supplementary file, we provide a complete theoretical analysis of the LASSO-penalized composite likelihood estimator for sparse Ising models.