Annals of Statistics
- Ann. Statist.
- Volume 47, Number 1 (2019), 382-414.
Asymptotic distribution-free change-point detection for multivariate and non-Euclidean data
We consider the testing and estimation of change-points, locations where the distribution abruptly changes, in a sequence of multivariate or non-Euclidean observations. We study a nonparametric framework that utilizes similarity information among observations, which can be applied to various data types as long as an informative similarity measure on the sample space can be defined. The existing approach along this line has low power and/or biased estimates for change-points under some common scenarios. We address these problems by considering new tests based on similarity information. Simulation studies show that the new approaches exhibit substantial improvements in detecting and estimating change-points. In addition, under some mild conditions, the new test statistics are asymptotically distribution-free under the null hypothesis of no change. Analytic $p$-value approximations to the significance of the new test statistics for the single change-point alternative and changed interval alternative are derived, making the new approaches easy off-the-shelf tools for large datasets. The new approaches are illustrated in an analysis of New York taxi data.
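As a rough illustration of how similarity information among observations can be turned into quantities that vary with a candidate change-point, the sketch below builds a nearest-neighbor similarity graph and, for each split t, counts the edges that fall within the first t observations, within the remaining observations, and between the two groups. The choice of a k-nearest-neighbor graph, the Euclidean distance, and the function names `knn_edges` and `edge_counts` are assumptions made for this example; the abstract does not specify the graph or the form of the new test statistics, and the raw counts here are not standardized and are not the authors' statistics.

```python
import numpy as np

def knn_edges(points, k=1):
    """Undirected edge set of a k-nearest-neighbor graph (Euclidean distance).

    Assumption for illustration: any informative similarity measure could
    replace the Euclidean distance used here.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    edges = set()
    for i in range(n):
        for j in np.argsort(dist[i])[:k]:
            j = int(j)
            edges.add((min(i, j), max(i, j)))  # store each edge as (small, large)
    return edges

def edge_counts(edges, n):
    """For each split t = 1..n-1, count edges within the first t observations (r1),
    within the remaining n - t observations (r2), and between the groups (r0)."""
    r1 = np.zeros(n - 1, dtype=int)
    r2 = np.zeros(n - 1, dtype=int)
    for t in range(1, n):
        for i, j in edges:  # i < j by construction
            if j < t:
                r1[t - 1] += 1
            elif i >= t:
                r2[t - 1] += 1
    r0 = len(edges) - r1 - r2
    return r0, r1, r2

# Toy sequence with an abrupt shift after the third observation.
pts = [[0.0], [1.0], [2.5], [10.0], [11.0], [12.5]]
edges = knn_edges(pts, k=1)
r0, r1, r2 = edge_counts(edges, len(pts))
best_split = int(np.argmin(r0)) + 1  # split with the fewest between-group edges
print(best_split, r0.tolist())
```

In this toy sequence the between-group count is smallest at the split after the third observation, which is where the shift occurs. A usable test would additionally need to standardize such counts under the null hypothesis of no change, which this sketch does not attempt.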
Received: June 2017
Revised: February 2018
First available in Project Euclid: 30 November 2018
Chu, Lynna; Chen, Hao. Asymptotic distribution-free change-point detection for multivariate and non-Euclidean data. Ann. Statist. 47 (2019), no. 1, 382-414. doi:10.1214/18-AOS1691. https://projecteuclid.org/euclid.aos/1543568592
- Supplement to “Asymptotic distribution-free change-point detection for multivariate and non-Euclidean data”. The supplementary material contains the new test statistics for the changed-interval alternative, additional technical results and proofs, more illustrations of the data, additional power and analytical critical value tables and further discussion on the conditions of the graph and the relationship between the new statistics, including an extension of the max-type statistic.