Electronic Journal of Statistics

Multiscale change-point segmentation: beyond step functions

Housen Li, Qinghai Guo, and Axel Munk

Full-text: Open access


Modern multiscale-type segmentation methods are known to detect multiple change-points with high statistical accuracy while allowing for fast computation. The underpinning (minimax) estimation theory has been developed mainly for models that assume the signal to be a piecewise constant function. In this paper, for a large collection of multiscale segmentation methods (including various existing procedures), we extend this theory to certain function classes beyond step functions in a nonparametric regression setting. On the one hand, this broadens the interpretation of such methods; on the other hand, it shows that they are robust to deviations from piecewise constant functions. Our main finding is adaptation over nonlinear approximation classes for a universal thresholding; these classes include bounded variation functions and (piecewise) Hölder functions of smoothness order $0 < \alpha \le 1$ as special cases. From this we derive statistical guarantees on feature detection in terms of jumps and modes. Another key finding is that these multiscale segmentation methods perform nearly (up to a log-factor) as well as the oracle piecewise constant segmentation estimator (with known jump locations) and the best piecewise constant approximants of the (unknown) true signal. The theoretical findings are examined in various numerical simulations.
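To make the benchmark mentioned above concrete, the following is a minimal sketch of an oracle piecewise constant estimator, that is, a fit by segment means when the jump locations are treated as known. It is an illustration under stated assumptions, not the procedure studied in the paper: the function name `oracle_piecewise_constant`, the simulated two-level signal, the Gaussian noise level, and the sample size are all hypothetical choices made here.

```python
import random
from statistics import fmean


def oracle_piecewise_constant(y, change_points):
    """Fit segment means to y, given change-point indices assumed known.

    Assumption: every change-point lies strictly between 0 and len(y),
    so that no segment is empty.
    """
    # Segment boundaries: 0, the sorted change-points, and len(y).
    bounds = [0, *sorted(change_points), len(y)]
    fit = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        level = fmean(y[start:stop])          # least-squares level on the segment
        fit.extend([level] * (stop - start))  # constant fit over the segment
    return fit


# Hypothetical example: a step signal with one jump, observed with Gaussian noise.
random.seed(0)
n = 200
signal = [0.0 if i < 80 else 2.0 for i in range(n)]
y = [s + random.gauss(0.0, 0.5) for s in signal]

fit = oracle_piecewise_constant(y, change_points=[80])
mse = fmean((f - s) ** 2 for f, s in zip(fit, signal))
print(f"mean squared error of the oracle fit: {mse:.4f}")
```

The oracle is not available in practice because it uses the true jump locations; it serves only as a reference against which a data-driven segmentation can be compared.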

Article information

Electron. J. Statist., Volume 13, Number 2 (2019), 3254-3296.

Received: January 2019
First available in Project Euclid: 25 September 2019

Primary: 62G08: Nonparametric regression; 62G20: Asymptotic properties; 62G35: Robustness

Keywords: Change-point regression; adaptive estimation; oracle inequality; jump detection; model misspecification; multiscale inference; approximation spaces; robustness

Creative Commons Attribution 4.0 International License.


Li, Housen; Guo, Qinghai; Munk, Axel. Multiscale change-point segmentation: beyond step functions. Electron. J. Statist. 13 (2019), no. 2, 3254--3296. doi:10.1214/19-EJS1608. https://projecteuclid.org/euclid.ejs/1569377043
