We discuss recently developed methods that quantify the stability and generalizability of statistical findings under distributional changes. In many practical problems, the data are not drawn i.i.d. from the target population. For example, unobserved sampling bias, batch effects, or unknown associations might inflate the variance compared to i.i.d. sampling. For reliable statistical inference, it is thus necessary to account for these types of variation. We discuss and review two methods that quantify distributional stability based on a single dataset. The first method computes the sensitivity of a parameter under worst-case distributional perturbations to understand which types of shift pose a threat to external validity. The second method treats distributional shifts as random, which allows one to assess average robustness instead of worst-case robustness. Based on a stability analysis of multiple estimators on a single dataset, it integrates both sampling and distributional uncertainty into a single confidence interval.
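The second idea can be sketched numerically. The snippet below is an illustrative toy example, not the authors' exact procedure: it mimics random distributional shifts by re-estimating a mean under random exponential reweightings of the observations (a Bayesian-bootstrap-style perturbation), and then widens the usual confidence interval by adding the between-perturbation variance to the i.i.d. sampling variance. All data and the number of perturbations `B` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single observed dataset of outcomes.
y = rng.normal(loc=2.0, scale=1.0, size=500)
n = len(y)

# Standard i.i.d.-based estimate and its sampling variance.
theta_hat = y.mean()
sampling_var = y.var(ddof=1) / n

# Mimic random distributional shifts: re-estimate the parameter
# under B random exponential reweightings of the observations.
B = 200
perturbed = np.empty(B)
for b in range(B):
    w = rng.exponential(scale=1.0, size=n)
    w /= w.sum()                      # normalized random weights
    perturbed[b] = np.sum(w * y)      # weighted mean under the shift

# Spread of the estimator across perturbations proxies
# the instability induced by distributional shifts.
distributional_var = perturbed.var(ddof=1)

# Widened interval combining both sources of uncertainty.
total_se = np.sqrt(sampling_var + distributional_var)
ci = (theta_hat - 1.96 * total_se, theta_hat + 1.96 * total_se)
print(ci)
```

The resulting interval is wider than the i.i.d.-only interval, reflecting that conclusions should remain plausible on average across perturbed versions of the data-generating distribution.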
The research of D. Rothenhäusler was supported by the Stanford Institute for Human-Centered Artificial Intelligence (HAI).
The research of P. Bühlmann was supported by the European Research Council under the Grant Agreement No 786461 (CausalStats—ERC-2017-ADG).
We thank the Guest Editors and the Editor for the opportunity to present our work, and the reviewers for constructive comments. The research was partially conducted during D. Rothenhäusler’s research stay at the Institute for Mathematical Research (FIM) at ETH Zürich.
"Distributionally Robust and Generalizable Inference." Statistical Science 38 (4), 527–542, November 2023. https://doi.org/10.1214/23-STS902