The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 3, Number 2 (2009), 663-690.
Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care
In the field of quality of health care measurement, one approach to assessing patient sickness at admission involves a logistic regression of mortality within 30 days of admission on a fairly large number of sickness indicators (on the order of 100) to construct a sickness scale, employing classical variable selection methods to find an “optimal” subset of 10–20 indicators. Such “benefit-only” methods ignore the considerable differences among the sickness indicators in cost of data collection, an issue that is crucial when admission sickness is used to drive programs (now implemented or under consideration in several countries, including the U.S. and U.K.) that attempt to identify substandard hospitals by comparing observed and expected mortality rates (given admission sickness). When both data-collection cost and accuracy of prediction of 30-day mortality are considered, a large variable-selection problem arises in which costly variables that do not predict well enough should be omitted from the final scale.
In this paper (a) we develop a method for solving this problem based on posterior model odds, arising from a prior distribution that (1) accounts for the cost of each variable and (2) results in a set of posterior model probabilities that corresponds to a generalized cost-adjusted version of the Bayesian information criterion (BIC), and (b) we compare this method with a decision-theoretic cost-benefit approach based on maximizing expected utility. We use reversible-jump Markov chain Monte Carlo (RJMCMC) methods to search the model space, and we check the stability of our findings with two variants of the MCMC model composition (MC^3) algorithm. We find substantial agreement between the decision-theoretic and cost-adjusted-BIC methods; the latter provides a principled approach to performing a cost-benefit trade-off that avoids ambiguities in identification of an appropriate utility structure. Our cost-benefit approach results in a set of models with a noticeable reduction in cost and dimensionality, and only a minor decrease in predictive performance, when compared with models arising from benefit-only analyses.
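As a rough illustration of the cost-benefit trade-off described above, the following Python sketch contrasts the ordinary BIC with one simple cost-weighted variant. The abstract does not state the form of the cost-adjusted criterion, so the penalty used here (each variable's cost divided by a baseline cost, multiplied by log n) is an assumption made for illustration only. The function names and the `baseline_cost` parameter are likewise hypothetical.

```python
import math


def bic(loglik, n_params, n):
    """Ordinary BIC: -2 * log-likelihood + (number of parameters) * log(n)."""
    return -2.0 * loglik + n_params * math.log(n)


def cost_adjusted_bic(loglik, costs, n, baseline_cost):
    """Cost-weighted BIC-style criterion (assumed form, for illustration only).

    Each included variable contributes (cost / baseline_cost) * log(n) to the
    penalty instead of the flat log(n) used by the ordinary BIC, so expensive
    variables must improve the fit more to justify their inclusion.
    """
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    penalty = sum(c / baseline_cost for c in costs) * math.log(n)
    return -2.0 * loglik + penalty


# Hypothetical comparison: a cheap two-variable model versus a model that
# fits slightly better but adds one expensive variable.
n_obs = 2500
cheap_model = cost_adjusted_bic(-820.0, [1.0, 1.5], n_obs, baseline_cost=1.0)
costly_model = cost_adjusted_bic(-815.0, [1.0, 1.5, 7.0], n_obs, baseline_cost=1.0)
preferred = "cheap" if cheap_model < costly_model else "costly"
```

Under this assumed penalty, setting every cost equal to the baseline cost recovers the ordinary BIC, which makes the sketch a generalization of the benefit-only criterion rather than a replacement for it.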
First available in Project Euclid: 22 June 2009
Keywords: Input-output analysis; quality of health care; sickness at hospital admission; cost-benefit analysis; Laplace approximation; reversible-jump Markov chain Monte Carlo (RJMCMC) methods; MCMC model composition (MC^3); Bayesian information criterion (BIC); cost-adjusted BIC
Fouskakis, D.; Ntzoufras, I.; Draper, D. Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care. Ann. Appl. Stat. 3 (2009), no. 2, 663-690. doi:10.1214/08-AOAS207. https://projecteuclid.org/euclid.aoas/1245676190
- Supplementary material: Cost-based prior distributions for variable selection in generalized linear models.