The Annals of Applied Statistics

Bayesian hierarchical rule modeling for predicting medical conditions

Tyler H. McCormick, Cynthia Rudin, and David Madigan



We propose a statistical modeling technique, called the Hierarchical Association Rule Model (HARM), that predicts a patient’s possible future medical conditions given the patient’s current and past history of reported conditions. The core of our technique is a Bayesian hierarchical model for selecting predictive association rules (such as “condition 1 and condition 2 → condition 3”) from a large set of candidate rules. Because this method “borrows strength” using the conditions of many similar patients, it is able to provide predictions specialized to any given patient, even when little information about the patient’s history of conditions is available.
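To make the idea of "borrowing strength" concrete, the following is a minimal sketch of beta-binomial shrinkage for a single association rule. It is a simplified stand-in and not the HARM model itself: the function names, the fixed `prior_strength`, and the toy counts are all assumptions made for illustration. For each patient, `n` is the number of times the rule's left-hand side was observed and `y` is the number of times the right-hand side followed.

```python
def pooled_confidence(counts):
    """Population-level confidence of one rule: total y over total n."""
    total_y = sum(y for y, n in counts)
    total_n = sum(n for y, n in counts)
    return total_y / total_n if total_n else 0.0


def shrunken_confidence(y, n, prior_mean, prior_strength):
    """Posterior mean of a rule's success probability under a Beta prior.

    The prior is Beta(a, b) with a = prior_mean * prior_strength and
    b = (1 - prior_mean) * prior_strength (illustrative parameterization).
    """
    a = prior_mean * prior_strength
    b = (1.0 - prior_mean) * prior_strength
    return (y + a) / (n + a + b)


# Hypothetical (y, n) counts for one rule across four patients.
counts = [(3, 4), (1, 5), (0, 1), (0, 0)]
prior_mean = pooled_confidence(counts)

# Patients with little history are pulled toward the population rate;
# patients with more history keep more of their own observed rate.
estimates = [shrunken_confidence(y, n, prior_mean, prior_strength=5.0)
             for y, n in counts]
```

In this sketch a patient with no observations of the rule receives exactly the pooled estimate, while a patient with several observations is moved only partway toward it. A full hierarchical model would also learn the prior parameters from the data rather than fixing them.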

Article information

Ann. Appl. Stat., Volume 6, Number 2 (2012), 652-668.

First available in Project Euclid: 11 June 2012


Keywords: Association rule mining, healthcare surveillance, hierarchical model, machine learning


McCormick, Tyler H.; Rudin, Cynthia; Madigan, David. Bayesian hierarchical rule modeling for predicting medical conditions. Ann. Appl. Stat. 6 (2012), no. 2, 652–668. doi:10.1214/11-AOAS522.



