Abstract
The hypothesis of randomness is fundamental in statistical machine learning and in many areas of nonparametric statistics; it states that the observations are independent and come from the same unknown probability distribution. This hypothesis is close, in certain respects, to the hypothesis of exchangeability, which postulates that the distribution of the observations is invariant with respect to their permutations. This paper reviews known methods of testing the two hypotheses, concentrating on the online mode of testing, in which the observations arrive sequentially. All known online methods for testing these hypotheses are based on conformal martingales, which are defined and studied in detail. An important variety of online testing is change detection, where the use of conformal martingales leads to conformal versions of the CUSUM and Shiryaev–Roberts procedures; these versions work in the nonparametric setting where the data are assumed to be IID according to a completely unknown distribution before the change. The paper emphasizes conceptual and practical aspects and states two kinds of results. Validity results limit the probability of a false alarm or, in the case of change detection, the frequency of false alarms for various procedures based on conformal martingales. Efficiency results establish connections between randomness, exchangeability, and conformal martingales.
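As a rough illustration of the kind of procedure the abstract refers to, the following minimal Python sketch computes online conformal p-values and feeds them into a simple betting martingale. It is not taken from the paper: the choice of nonconformity score (the observation itself), the fixed-exponent power betting function, the value of `epsilon`, the function names, and the synthetic data are all assumptions made for illustration. Under exchangeability the smoothed p-values are independent and uniform, so the product of betting functions that integrate to 1 is a nonnegative martingale with initial value 1, and by Ville's inequality it ever reaches a threshold c with probability at most 1/c.

```python
import math
import random


def conformal_p_values(scores, rng):
    """Online smoothed conformal p-values for a list of nonconformity scores.

    Assumption: larger scores mean "stranger" observations.
    """
    p_values = []
    for n, a_n in enumerate(scores, start=1):
        seen = scores[:n]                          # scores observed so far, including a_n
        greater = sum(1 for a in seen if a > a_n)
        equal = sum(1 for a in seen if a == a_n)   # at least 1 (a_n itself)
        theta = 1.0 - rng.random()                 # tie-breaking variable in (0, 1]
        p_values.append((greater + theta * equal) / n)
    return p_values


def log_power_martingale(p_values, epsilon=0.5):
    """Log of the running product of f(p) = epsilon * p**(epsilon - 1).

    f integrates to 1 over [0, 1], so the product is a martingale
    when the p-values are independent and uniform.
    """
    log_s = 0.0
    path = []
    for p in p_values:
        log_s += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        path.append(log_s)
    return path


# Hypothetical data: 200 observations from N(0, 1), then an upward shift to N(4, 1).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
data += [rng.gauss(4.0, 1.0) for _ in range(200)]

p_vals = conformal_p_values(data, rng)
path = log_power_martingale(p_vals)

# Raise an alarm the first time the martingale reaches 100
# (false alarm probability at most 1% under exchangeability).
threshold = math.log(100.0)
alarm_step = next((i + 1 for i, v in enumerate(path) if v >= threshold), None)
print("alarm raised at observation:", alarm_step)
```

With a fixed `epsilon` the martingale tends to lose capital while the data remain exchangeable, which delays detection; mixing over several exponents, or using CUSUM-style or Shiryaev–Roberts-style statistics built from the same p-values, are ways of addressing this.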
Funding Statement
This research has been supported by Amazon, Astra Zeneca, and Stena Line.
Acknowledgements
The original version of this paper was written in support of my poster at ISIPTA 2019 (Eleventh International Symposium on Imprecise Probabilities: Theories and Applications) presented on 5 July 2019. I am grateful to all discussants, including Thomas Dietterich, Wouter Koolen, and Glenn Shafer. The experiments with the tangent metric are based on Daniel Keysers’s [17] implementation in C. Bartels’s test for randomness used in Section 2 is from the R package randtests (Testing randomness in R) by Frederico Caeiro and Ayana Mateus [5]. Comments by three reviewers have been extremely useful for improving the presentation; they led, in particular, to the inclusion of experiments with the Absenteeism dataset and to the explicit discussion of Cournot’s principle.
Citation
Vladimir Vovk. "Testing Randomness Online." Statist. Sci. 36 (4) 595–611, November 2021. https://doi.org/10.1214/20-STS817
Information