The field of distribution-free predictive inference provides tools for provably valid prediction without any assumptions on the distribution of the data; these tools can be paired with any regression algorithm to produce accurate and reliable predictive intervals. The guarantees provided by these methods are typically marginal, meaning that predictive accuracy holds on average over both the training data set and the test point that is queried. However, it may be preferable to obtain a stronger guarantee of training-conditional coverage, which ensures that most draws of the training data set yield predictive intervals with accurate coverage on future test points. This property is known to hold for the split conformal prediction method. In this work, we examine the training-conditional coverage properties of several other distribution-free predictive inference methods, and find that training-conditional coverage is achieved by some methods but, for others, is impossible to guarantee without further assumptions.
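The split conformal prediction method mentioned above can be sketched as follows. This is an illustrative implementation, not the paper's own code: the function name, the even data split, and the use of absolute residuals as the conformity score are all assumed conventions (any score derived from a fitted model could be substituted).

```python
import numpy as np

def split_conformal_interval(X_train, y_train, X_test, fit, alpha=0.1, rng=None):
    """Split conformal prediction with absolute-residual scores.

    `fit` takes (X, y) and returns a prediction function. Returns
    (lower, upper) interval endpoints for each row of X_test.
    """
    rng = np.random.default_rng(rng)
    n = len(y_train)
    idx = rng.permutation(n)
    fit_idx, cal_idx = idx[: n // 2], idx[n // 2 :]

    # Fit the regression model on the first half of the training data.
    predict = fit(X_train[fit_idx], y_train[fit_idx])

    # Conformity scores: absolute residuals on the held-out calibration half.
    resid = np.abs(y_train[cal_idx] - predict(X_train[cal_idx]))
    n_cal = len(resid)

    # Conformal quantile: the ceil((1 - alpha)(n_cal + 1))-th smallest residual,
    # which yields marginal coverage >= 1 - alpha.
    k = int(np.ceil((1 - alpha) * (n_cal + 1)))
    q = np.sort(resid)[min(k, n_cal) - 1]

    mu = predict(X_test)
    return mu - q, mu + q
```

Because the calibration scores are exchangeable with the test point's score, the interval covers the test response with probability at least 1 − α marginally; the training-conditional guarantee discussed in the abstract is the stronger statement that, for most training sets, the coverage conditional on the training data is also close to 1 − α.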
R.F.B. was supported by the National Science Foundation via grants DMS-1654076 and DMS-2023109, and by the Office of Naval Research via grant N00014-20-1-2337.
"Training-conditional coverage for distribution-free predictive inference." Electron. J. Statist. 17(2): 2044–2066, 2023. https://doi.org/10.1214/23-EJS2145