The Annals of Statistics

An algorithmic and a geometric characterization of coarsening at random

Richard D. Gill and Peter D. Grünwald

Full-text: Open access

Abstract

We show that the class of conditional distributions satisfying the coarsening at random (CAR) property for discrete data has a simple and robust algorithmic description based on randomized uniform multicovers: combinatorial objects generalizing the notion of partition of a set. However, the complexity of a given CAR mechanism can be large: the maximal “height” of the needed multicovers can be exponential in the number of points in the sample space. The results stem from a geometric interpretation of the set of CAR distributions as a convex polytope and a characterization of its extreme points. The hierarchy of CAR models defined in this way could be useful in parsimonious statistical modeling of CAR mechanisms, though the results also raise doubts in applied work as to the meaningfulness of the CAR assumption in its full generality.
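As a rough illustration of the algorithmic description, the sketch below assumes the following reading of the abstract, which is not spelled out on this page: a uniform multicover of height k is a collection of subsets (repeats allowed) in which every point of the sample space lies in exactly k sets, and the mechanism first draws a multicover at random, independently of the data point x, and then reports one of the k sets containing x, chosen uniformly. The function names `multicover_height`, `coarsening_distribution` and `satisfies_car` are hypothetical, as is the three-point example.

```python
from collections import Counter
from fractions import Fraction


def multicover_height(sets, space):
    """Return the common height k if every point of `space` lies in exactly
    k of the `sets` (counted with multiplicity); otherwise return None."""
    counts = Counter(x for s in sets for x in s)
    heights = {counts[x] for x in space}
    if len(heights) == 1:
        k = heights.pop()
        return k if k > 0 else None
    return None


def coarsening_distribution(weighted_multicovers, space):
    """Assumed mechanism: pick a multicover with the given weight, then pick
    uniformly among its sets that contain x.
    Returns a dict mapping x to a Counter {A: P(Y = A | X = x)}."""
    cond = {x: Counter() for x in space}
    for weight, sets in weighted_multicovers:
        k = multicover_height(sets, space)
        if k is None:
            raise ValueError("not a uniform multicover")
        for x in space:
            for s in sets:
                if x in s:
                    cond[x][s] += Fraction(weight) / k
    return cond


def satisfies_car(cond):
    """CAR check: P(Y = A | X = x) must take the same value for every x in A."""
    observed = {s for dist in cond.values() for s in dist}
    return all(len({cond[x][s] for x in s}) == 1 for s in observed)


# Hypothetical example on a three-point sample space.
space = {1, 2, 3}
partition = [frozenset({1}), frozenset({2, 3})]                       # height 1
triangle = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})]  # height 2
mechanism = [(Fraction(1, 2), partition), (Fraction(1, 2), triangle)]

cond = coarsening_distribution(mechanism, space)
car_ok = satisfies_car(cond)
```

A partition is the height-1 special case, which matches the abstract's remark that multicovers generalize partitions. Exact fractions are used so that the equality test in `satisfies_car` is not affected by floating-point rounding.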

Article information

Source
Ann. Statist., Volume 36, Number 5 (2008), 2409–2422.

Dates
First available in Project Euclid: 13 October 2008

Permanent link to this document
https://projecteuclid.org/euclid.aos/1223908097

Digital Object Identifier
doi:10.1214/07-AOS532

Mathematical Reviews number (MathSciNet)
MR2458192

Zentralblatt MATH identifier
1148.62005

Subjects
Primary: 62A01: Foundations and philosophical topics
Secondary: 62N01: Censored data models

Keywords
Coarsening at random (CAR); ignorability; uniform multicover; Fibonacci numbers

Citation

Gill, Richard D.; Grünwald, Peter D. An algorithmic and a geometric characterization of coarsening at random. Ann. Statist. 36 (2008), no. 5, 2409–2422. doi:10.1214/07-AOS532. https://projecteuclid.org/euclid.aos/1223908097


