The Annals of Applied Statistics

Linking lung airway structure to pulmonary function via composite bridge regression

Kun Chen, Eric A. Hoffman, Indu Seetharaman, Feiran Jiao, Ching-Long Lin, and Kung-Sik Chan

The human lung airway is a complex inverted tree-like structure. Detailed airway measurements, such as segmental wall thickness, airway diameter and parent-child branch angles, can be extracted from lung images acquired by multidetector-row computed tomography (MDCT). This wealth of lung airway data provides a unique opportunity to advance our understanding of the fundamental structure-function relationships within the lung. An important problem is to construct and identify important lung airway features in normal subjects and connect them to standardized pulmonary function test results such as FEV1%. Among other things, the problem is complicated by the fact that a particular airway feature may be a relevant predictor only when it pertains to segments of certain generations. The key is therefore an efficient, consistent method for simultaneously conducting group selection (of lung airway feature types) and within-group variable selection (of airway generations), that is, bi-level selection. Here we streamline a comprehensive procedure to process the lung airway data via imputation, normalization, transformation and groupwise principal component analysis, and then adopt a new composite penalized regression approach for bi-level feature selection. As a prototype of composite penalization, the proposed composite bridge regression method is shown to admit an efficient algorithm, enjoy bi-level oracle properties and outperform several existing methods. We analyze the MDCT lung image data from a cohort of 132 subjects with normal lung function. Our results show that lung function in terms of FEV1% is promoted by having a less dense and more homogeneous lung comprising an airway whose segments enjoy more heterogeneity in wall thicknesses, larger mean diameters, lumen areas and branch angles. These data hold the potential of defining more accurately the “normal” subject population with borderline atypical lung functions that are clearly influenced by many genetic and environmental factors.
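The preprocessing steps named in the abstract (imputation, normalization and groupwise principal component analysis) can be illustrated with a minimal sketch. This is not the authors' implementation: the column-mean imputation rule, the function names, the group labels and the toy data are all assumptions made for illustration, and the transformation step is omitted because the abstract does not specify it.

```python
import numpy as np

def impute_column_means(X):
    """Replace NaNs in each column with that column's observed mean
    (an assumed imputation rule; the abstract does not name one)."""
    X = np.array(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def standardize(X):
    """Center each column and scale it to unit sample standard deviation."""
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return Xc / sd

def groupwise_pca(X, groups, n_components):
    """Run PCA separately within each feature group.

    `groups` maps a (hypothetical) feature-type name to its column indices.
    Returns a dict of principal component score matrices, one per group.
    """
    scores = {}
    for name, cols in groups.items():
        Z = standardize(impute_column_means(X[:, cols]))
        # SVD of the standardized block: Z = U diag(s) Vt, scores = U diag(s)
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        k = min(n_components, len(s))
        scores[name] = U[:, :k] * s[:k]
    return scores

# Toy data: 20 subjects, two feature types measured on three segments each.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
X[3, 1] = np.nan  # one missing measurement
groups = {"wall_thickness": [0, 1, 2], "diameter": [3, 4, 5]}
scores = groupwise_pca(X, groups, n_components=2)
```

Each group's score matrix could then serve as one block of predictors in a bi-level selection method, where selecting a block corresponds to selecting a feature type and selecting columns within it corresponds to within-group selection.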

Article information

Ann. Appl. Stat., Volume 10, Number 4 (2016), 1880-1906.

Received: January 2015
Revised: April 2016
First available in Project Euclid: 5 January 2017

Keywords: Bi-level variable selection; composite penalization; feature extraction; lung airway data; pulmonary function tests


Chen, Kun; Hoffman, Eric A.; Seetharaman, Indu; Jiao, Feiran; Lin, Ching-Long; Chan, Kung-Sik. Linking lung airway structure to pulmonary function via composite bridge regression. Ann. Appl. Stat. 10 (2016), no. 4, 1880--1906. doi:10.1214/16-AOAS947.


Supplemental materials

  • Supplement to “Linking lung airway structure to pulmonary function via composite bridge regression.” We provide the technical details in the theoretical investigation of the proposed method and an additional simulation example to investigate the impact of group size.