The Annals of Applied Statistics

Joint mean and covariance modeling of multiple health outcome measures

Xiaoyue Niu and Peter D. Hoff


Health exams determine a patient’s health status by comparing the patient’s measurement with a population reference range, a 95% interval derived from a homogeneous reference population. Similarly, most established relationships among health problems are assumed to hold for the entire population. We use data from the 2009–2010 National Health and Nutrition Examination Survey (NHANES) on four major health problems in the U.S. and apply a joint mean and covariance model to study how the reference ranges and associations of those health outcomes could vary among subpopulations. We discuss guidelines for model selection and evaluation, using standard criteria such as AIC in conjunction with posterior predictive checks. The results from the proposed model can help identify subpopulations in which more data need to be collected to refine the reference range and to study the specific associations among those health problems.
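
To make the notion of a reference range concrete, the following minimal sketch computes an empirical 95% interval (the 2.5th and 97.5th percentiles) for a pooled sample and separately for each subgroup. It is an illustration only and not the joint mean and covariance model of the article: the function names, the group labels "A" and "B", and the synthetic measurements are all assumptions introduced here, and no NHANES data are used.

```python
import numpy as np


def reference_range(values, coverage=0.95):
    """Return the central `coverage` interval of `values` as (lower, upper)."""
    values = np.asarray(values, dtype=float)
    tail = (1.0 - coverage) / 2.0
    lower, upper = np.quantile(values, [tail, 1.0 - tail])
    return float(lower), float(upper)


def reference_ranges_by_group(values, groups, coverage=0.95):
    """Compute one empirical reference range per distinct group label."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    return {
        str(g): reference_range(values[groups == g], coverage)
        for g in np.unique(groups)
    }


# Synthetic measurements (hypothetical): two subgroups that differ in both
# mean and spread, so a single pooled range fits neither group well.
rng = np.random.default_rng(0)
groups = np.repeat(["A", "B"], 500)
values = np.concatenate([
    rng.normal(120.0, 10.0, 500),  # hypothetical subgroup A
    rng.normal(130.0, 15.0, 500),  # hypothetical subgroup B
])

pooled = reference_range(values)
by_group = reference_ranges_by_group(values, groups)

print("pooled:", pooled)
for label, interval in by_group.items():
    print(label, interval)
```

Empirical percentiles within each subgroup become unstable when a subgroup is small, which is one motivation for a model-based approach that lets means and covariances vary smoothly with covariates.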

Article information

Ann. Appl. Stat., Volume 13, Number 1 (2019), 321-339.

Received: July 2015
Revised: January 2018
First available in Project Euclid: 10 April 2019

Keywords: Heterogeneous population; reference range; covariance regression; NHANES


Niu, Xiaoyue; Hoff, Peter D. Joint mean and covariance modeling of multiple health outcome measures. Ann. Appl. Stat. 13 (2019), no. 1, 321–339. doi:10.1214/18-AOAS1187.


Supplemental materials

  • Supplement to “Joint mean and covariance modeling of multiple health outcome measures.” Additional results, tables, and plots mentioned in the text are in the Supplemental Material.