The Annals of Statistics

Multiscale blind source separation

Merle Behr, Chris Holmes, and Axel Munk



We provide a new methodology for statistical recovery of single linear mixtures of piecewise constant signals (sources) with unknown mixing weights and change points in a multiscale fashion. We show exact recovery within an $\varepsilon$-neighborhood of the mixture when the sources take values only in a known finite alphabet. Based on this, we provide the SLAM (Separates Linear Alphabet Mixtures) estimators for the mixing weights and sources. For Gaussian errors, we obtain uniform confidence sets and optimal rates (up to log-factors) for all quantities. SLAM is efficiently computed as a nonconvex optimization problem by a dynamic program tailored to the finite alphabet assumption. Its performance is investigated in a simulation study. Finally, it is applied to assign copy-number aberrations from genetic sequencing data to different clones and to estimate their proportions.
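The observation model behind the abstract can be illustrated with a minimal simulation. This is not the authors' SLAM implementation; it is a hedged sketch (all variable names are our own) of a single linear mixture of two binary piecewise constant sources with Gaussian noise, showing why known finite-alphabet values plus mixing weights make each mixture level uniquely decodable:

```python
import numpy as np

# Sketch only: a single linear mixture of m = 2 piecewise constant
# sources over the binary alphabet {0, 1}, observed with Gaussian noise.
rng = np.random.default_rng(0)
n = 200                       # number of observations
omega = np.array([0.7, 0.3])  # mixing weights (unknown in the paper)

# Two binary step functions (sources) with change points at n/4, n/2, 3n/4.
f1 = np.zeros(n); f1[n // 4:] = 1; f1[3 * n // 4:] = 0
f2 = np.zeros(n); f2[n // 2:] = 1

# Mixture g = omega_1 f_1 + omega_2 f_2 is itself piecewise constant,
# taking values in the finite set {0, 0.3, 0.7, 1.0}.
g = omega[0] * f1 + omega[1] * f2
y = g + 0.05 * rng.standard_normal(n)   # noisy single mixture

# With known weights, each attainable mixture level identifies the source
# values uniquely (the identifiability behind exact recovery): round the
# local mean of y on each constant segment to the nearest level.
levels = np.array([0.0, omega[1], omega[0], omega[0] + omega[1]])
segments = (slice(0, n // 4), slice(n // 4, n // 2),
            slice(n // 2, 3 * n // 4), slice(3 * n // 4, n))
seg_means = np.array([y[s].mean() for s in segments])
decoded = levels[np.argmin(np.abs(seg_means[:, None] - levels), axis=1)]
print(decoded)   # per-segment mixture values recovered from noisy data
```

In the paper the weights, change points, and number of segments are all unknown, and SLAM estimates them jointly via a multiscale criterion and a dynamic program; the sketch above only shows the decoding step once segments and weights are fixed.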

Article information

Ann. Statist., Volume 46, Number 2 (2018), 711–744.

Received: July 2016
Revised: January 2017
First available in Project Euclid: 3 April 2018


Primary: 62G08: Nonparametric regression 62G15: Tolerance and confidence regions
Secondary: 92D10: Genetics

Keywords: multiscale inference; honest confidence sets; change point regression; finite alphabet; linear mixture; exact recovery; genetic sequencing


Behr, Merle; Holmes, Chris; Munk, Axel. Multiscale blind source separation. Ann. Statist. 46 (2018), no. 2, 711–744. doi:10.1214/17-AOS1565.



  • [1] Aïssa-El-Bey, A., Pastor, D., Sbaï, S. M. A. and Fadlallah, Y. (2015). Sparsity-based recovery of finite alphabet solutions to underdetermined linear systems. IEEE Trans. Inform. Theory 61 2008–2018.
  • [2] Arora, S., Ge, R., Kannan, R. and Moitra, A. (2012). Computing a nonnegative matrix factorization—provably. In STOC’12—Proceedings of the 2012 ACM Symposium on Theory of Computing 145–161. ACM, New York.
  • [3] Arora, S., Ge, R., Moitra, A. and Sachdeva, S. (2015). Provable ICA with unknown Gaussian noise, and implications for Gaussian mixtures and autoencoders. Algorithmica 72 215–236.
  • [4] Bai, J. and Perron, P. (1998). Estimating and testing linear models with multiple structural changes. Econometrica 66 47–78.
  • [5] Behr, M., Holmes, C. and Munk, A. (2018). Supplement to “Multiscale blind source separation.” DOI:10.1214/17-AOS1565SUPP.
  • [6] Behr, M. and Munk, A. (2015). Identifiability for blind source separation of multiple finite alphabet linear mixtures. IEEE Trans. Inform. Theory 63 5506–5517.
  • [7] Belkin, M., Rademacher, L. and Voss, J. (2013). Blind signal separation in the presence of Gaussian noise. J. Mach. Learn. Res. Proc. 30 270–287.
  • [8] Beroukhim, R., Mermel, C. H., Porter, D., Wei, G., Raychaudhuri, S., Donovan, J., Barretina, J., Boehm, J. S., Dobson, J., Urashima, M. et al. (2010). The landscape of somatic copy-number alteration across human cancers. Nature 463 899–905.
  • [9] Bioglio, V., Coluccia, G. and Magli, E. (2014). Sparse image recovery using compressed sensing over finite alphabets. IEEE Int. Conf. Image Process. (ICIP) 1287–1291.
  • [10] Bofill, P. and Zibulevsky, M. (2001). Underdetermined blind source separation using sparse representations. Signal Process. 81 2353–2362.
  • [11] Boysen, L., Kempe, A., Liebscher, V., Munk, A. and Wittich, O. (2009). Consistencies and rates of convergence of jump-penalized least squares estimators. Ann. Statist. 37 157–183.
  • [12] Candès, E. J. and Tao, T. (2006). Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inform. Theory 52 5406–5425.
  • [13] Carlstein, E., Müller, H.-G. and Siegmund, D., eds. (1994). Change-Point Problems. Lecture Notes—Monograph Series 23. IMS, Hayward, CA.
  • [14] Carter, S. L., Cibulskis, K., Helman, E., McKenna, A., Shen, H., Zack, T., Laird, P. W., Onofrio, R. C., Winckler, W., Weir, B. A. et al. (2012). Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30 413–421.
  • [15] Chen, H., Xing, H. and Zhang, N. R. (2011). Estimation of parent specific DNA copy number in tumors using high-density genotyping arrays. PLoS Comput. Biol. 7 e1001060.
  • [16] Cheng, M.-Y. and Hall, P. (1999). Mode testing in difficult cases. Ann. Statist. 27 1294–1315.
  • [17] Comon, P. (1994). Independent component analysis, a new concept? Signal Process. 36 287–314.
  • [18] Das, A. K. and Vishwanath, S. (2013). On finite alphabet compressive sensing. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP) 5890–5894.
  • [19] Davies, L., Höhenrieder, C. and Krämer, W. (2012). Recursive computation of piecewise constant volatilities. Comput. Statist. Data Anal. 56 3623–3631.
  • [20] Davies, P. L. and Kovac, A. (2001). Local extremes, runs, strings and multiresolution. Ann. Statist. 29 1–65.
  • [21] Dette, H., Munk, A. and Wagner, T. (1998). Estimating the variance in nonparametric regression—what is a reasonable choice? J. R. Stat. Soc. Ser. B. Stat. Methodol. 60 751–764.
  • [22] Diamantaras, K. I. (2006). A clustering approach for the blind separation of multiple finite alphabet sequences from a single linear mixture. Signal Process. 86 877–891.
  • [23] Ding, L., Wendl, M. C., McMichael, J. F. and Raphael, B. J. (2014). Expanding the computational toolbox for mining cancer genomes. Nat. Rev. Genet. 15 556–570.
  • [24] Donoho, D. and Stodden, V. (2003). When does non-negative matrix factorization give a correct decomposition into parts? Adv. Neural Inf. Process. Syst. 16.
  • [25] Donoho, D. L. (2006). Compressed sensing. IEEE Trans. Inform. Theory 52 1289–1306.
  • [26] Draper, S. C. and Malekpour, S. (2009). Compressed sensing over finite fields. Proceedings of the 2009 IEEE International Symposium on Information Theory 1 669–673.
  • [27] Du, C., Kao, C.-L. M. and Kou, S. C. (2016). Stepwise signal extraction via marginal likelihood. J. Amer. Statist. Assoc. 111 314–330.
  • [28] Dümbgen, L., Piterbarg, V. I. and Zholud, D. (2006). On the limit distribution of multiscale test statistics for nonparametric curve estimation. Math. Methods Statist. 15 20–25.
  • [29] Dümbgen, L. and Spokoiny, V. G. (2001). Multiscale testing of qualitative hypotheses. Ann. Statist. 29 124–152.
  • [30] Dümbgen, L. and Walther, G. (2008). Multiscale inference about a density. Ann. Statist. 36 1758–1785.
  • [31] Fearnhead, P. (2006). Exact and efficient Bayesian inference for multiple changepoint problems. Stat. Comput. 16 203–213.
  • [32] Frick, K., Munk, A. and Sieling, H. (2014). Multiscale change point inference. J. R. Stat. Soc. Ser. B. Stat. Methodol. 76 495–580.
  • [33] Friedrich, F., Kempe, A., Liebscher, V. and Winkler, G. (2008). Complexity penalized $M$-estimation: Fast computation. J. Comput. Graph. Statist. 17 201–224.
  • [34] Fryzlewicz, P. (2014). Wild binary segmentation for multiple change-point detection. Ann. Statist. 42 2243–2281.
  • [35] Futschik, A., Hotz, T., Munk, A. and Sieling, H. (2014). Multiscale DNA partitioning: Statistical evidence for segments. Bioinformatics 30 2255–2262.
  • [36] Greaves, M. and Maley, C. C. (2012). Clonal evolution in cancer. Nature 481 306–313.
  • [37] Gu, F., Zhang, H., Li, N. and Lu, W. (2010). Blind separation of multiple sequences from a single linear mixture using finite alphabet. IEEE Int. Conf. Wirel. Commun. Signal Process. (WCSP) 1–5.
  • [38] Ha, G., Roth, A., Khattra, J., Ho, J., Yap, D., Prentice, L. M., Melnyk, N., McPherson, A., Bashashati, A., Laks, E. et al. (2014). TITAN: Inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data. Genome Res. 24 1881–1893.
  • [39] Hall, P., Kay, J. W. and Titterington, D. M. (1990). Asymptotically optimal difference-based estimation of variance in nonparametric regression. Biometrika 77 521–528.
  • [40] Harchaoui, Z. and Lévy-Leduc, C. (2010). Multiple change-point estimation with a total variation penalty. J. Amer. Statist. Assoc. 105 1480–1493.
  • [41] Jeng, X. J., Cai, T. T. and Li, H. (2010). Optimal sparse segment identification with application in copy number variation analysis. J. Amer. Statist. Assoc. 105 1156–1166.
  • [42] Killick, R., Fearnhead, P. and Eckley, I. A. (2012). Optimal detection of changepoints with a linear computational cost. J. Amer. Statist. Assoc. 107 1590–1598.
  • [43] Kofidis, N., Margaris, A., Diamantaras, K. and Roumeliotis, M. (2008). Blind system identification: Instantaneous mixtures of $n$ sources. Int. J. Comput. Math. 85 1333–1340.
  • [44] Lee, D. D. and Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature 401 788–791.
  • [45] Lee, T. W., Lewicki, M. S., Girolami, M. and Sejnowski, T. J. (1999). Blind source separation of more sources than mixtures using overcomplete representations. Signal Process. Lett. 6 87–90.
  • [46] Li, J., Ray, S. and Lindsay, B. G. (2007). A nonparametric statistical approach to clustering via mode identification. J. Mach. Learn. Res. 8 1687–1723.
  • [47] Li, Y., Amari, S. I., Cichocki, A., Ho, D. W. and Xie, S. (2006). Underdetermined blind source separation based on sparse representation. IEEE Trans. Signal Process. 54 423–437.
  • [48] Liu, B., Morrison, C. D., Johnson, C. S., Trump, D. L., Qin, M., Conroy, J. C., Wang, J. and Liu, S. (2013). Computational methods for detecting copy number variations in cancer genome using next generation sequencing: Principles and challenges. Oncotarget 4 1868.
  • [49] Matteson, D. S. and James, N. A. (2014). A nonparametric approach for multiple change point analysis of multivariate data. J. Amer. Statist. Assoc. 109 334–345.
  • [50] Müller, H.-G. and Stadtmüller, U. (1987). Estimation of heteroscedasticity in regression analysis. Ann. Statist. 15 610–625.
  • [51] Niu, Y. S. and Zhang, H. (2012). The screening and ranking algorithm to detect DNA copy number variations. Ann. Appl. Stat. 6 1306–1326.
  • [52] Olshen, A. B., Venkatraman, E. S., Lucito, R. and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostat. 5 557–572.
  • [53] Ooi, H. (2002). Density visualization and mode hunting using trees. J. Comput. Graph. Statist. 11 328–347.
  • [54] Pajunen, P. (1997). Blind separation of binary sources with less sensors than sources. IEEE Int. Conf. Neural Netw. 3 1994–1997.
  • [55] Polonik, W. (1998). The silhouette, concentration functions and ML-density estimation under order restrictions. Ann. Statist. 26 1857–1877.
  • [56] Proakis, J. G. (1995). Digital Communications. McGraw-Hill, New York.
  • [57] Recht, B., Re, C., Tropp, J. and Bittorf, V. (2012). Factoring nonnegative matrices with linear programs. Adv. Neural Inf. Process. Syst. 25 1214–1222.
  • [58] Rosenberg, A. and Hirschberg, J. (2007). V-measure: A conditional entropy-based external cluster evaluation measure. EMNLP-CoNLL 7 410–420.
  • [59] Rostami, M., Babaie-Zadeh, M., Samadi, S. and Jutten, C. (2011). Blind source separation of discrete finite alphabet sources using a single mixture. IEEE Stat. Signal Process. Workshop (SSP) 709–712.
  • [60] Roth, A., Khattra, J., Yap, D., Wan, A., Laks, E., Biele, J., Ha, G., Aparicio, S., Bouchard-Côté, A. and Shah, S. P. (2014). PyClone: Statistical inference of clonal population structure in cancer. Nat. Methods 11 396–398.
  • [61] Shah, S. P., Roth, A., Goya, R., Oloumi, A., Ha, G., Zhao, Y., Turashvili, G., Ding, J., Tse, K., Haffari, G. et al. (2012). The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature 486 395–399.
  • [62] Siegmund, D. (2013). Change-points: From sequential detection to biology and back. Sequential Anal. 32 2–14.
  • [63] Siegmund, D. and Yakir, B. (2000). Tail probabilities for the null distribution of scanning statistics. Bernoulli 6 191–213.
  • [64] Spielman, D. A., Wang, H. and Wright, J. (2012). Exact recovery of sparsely-used dictionaries. J. Mach. Learn. Res. Proc. 23 37.1–37.18.
  • [65] Spokoiny, V. (2009). Multiscale local change point detection with applications to value-at-risk. Ann. Statist. 37 1405–1436.
  • [66] Talwar, S., Viberg, M. and Paulraj, A. (1996). Blind separation of synchronous co-channel digital signals using an antenna array—Part I. algorithms. IEEE Trans. Signal Process. 44 1184–1197.
  • [67] Tibshirani, R., Walther, G. and Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc. Ser. B. Stat. Methodol. 63 411–423.
  • [68] Tibshirani, R. and Wang, P. (2008). Spatial smoothing and hot spot detection for CGH data using the fused lasso. Biostat. 9 18–29.
  • [69] Verdú, S. (1998). Multiuser Detection. Cambridge University Press, Cambridge.
  • [70] Walther, G. (2010). Optimal and fast detection of spatial clusters with scan statistics. Ann. Statist. 38 1010–1033.
  • [71] Yau, C., Papaspiliopoulos, O., Roberts, G. O. and Holmes, C. (2011). Bayesian non-parametric hidden Markov models with applications in genomics. J. R. Stat. Soc. Ser. B. Stat. Methodol. 73 37–57.
  • [72] Yuanqing, L., Cichocki, A. and Zhang, L. (2003). Blind separation and extraction of binary sources. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 86 580–589.
  • [73] Zhang, N. R. and Siegmund, D. O. (2007). A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. Biometrics 63 22–32.
  • [74] Zhang, N. R. and Siegmund, D. O. (2012). Model selection for high-dimensional, multi-sequence change-point problems. Statist. Sinica 22 1507–1538.

Supplemental materials

  • Supplement to Multiscale Blind Source Separation. Proofs of Theorem 1.4, Theorem 2.5, and Theorem 2.7 (Section S1); additional details on algorithms (Section S2); additional figures and tables from Sections 4 and 5 (Section S3); details on the SST-method (Section S4).