## Electronic Journal of Statistics

### Quantifying identifiability in independent component analysis

#### Abstract

We are interested in consistent estimation of the mixing matrix in the ICA model, when the error distribution is close to (but different from) Gaussian. In particular, we consider $n$ independent samples from the ICA model $X=A\epsilon$, where we assume that the coordinates of $\epsilon$ are independent and identically distributed according to a contaminated Gaussian distribution, and the amount of contamination is allowed to depend on $n$. We then investigate how the ability to consistently estimate the mixing matrix depends on the amount of contamination. Our results suggest that in an asymptotic sense, if the amount of contamination decreases at rate $1/\sqrt{n}$ or faster, then the mixing matrix is only identifiable up to transpose products. These results also have implications for causal inference from linear structural equation models with near-Gaussian additive noise.
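The model in the abstract can be sketched numerically. The following is a minimal simulation, not code from the paper: it draws $n$ samples $X = A\epsilon$ where each coordinate of $\epsilon$ is contaminated Gaussian, i.e. with probability $\delta_n$ it comes from a non-Gaussian contaminating distribution (a Laplace here, as an illustrative choice; the paper does not fix a particular contaminant) and otherwise from a standard Gaussian. The names `sample_contaminated_ica` and `delta_n` are ours.

```python
import numpy as np

def sample_contaminated_ica(A, n, delta_n, seed=None):
    """Draw n samples of X = A @ eps with contaminated-Gaussian noise.

    Each coordinate of eps is, independently, Laplace with probability
    delta_n (the contamination) and standard Gaussian otherwise.
    """
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    gaussian = rng.standard_normal((n, p))
    laplace = rng.laplace(size=(n, p))        # illustrative contaminant
    mask = rng.random((n, p)) < delta_n       # contamination indicators
    eps = np.where(mask, laplace, gaussian)
    return eps @ A.T                          # rows are samples of X

# Contamination shrinking at the critical 1/sqrt(n) rate from the abstract:
n = 10_000
A = np.array([[1.0, 0.5],
              [0.0, 1.0]])
X = sample_contaminated_ica(A, n, delta_n=1.0 / np.sqrt(n), seed=0)
```

At the $1/\sqrt{n}$ rate, only on the order of $\sqrt{n}$ of the $np$ noise draws are contaminated, which gives an intuition for why the model becomes hard to distinguish from the Gaussian (non-identifiable) case.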

#### Article information

**Source:** Electron. J. Statist., Volume 8, Number 1 (2014), 1438–1459.

**Dates:** First available in Project Euclid: 20 August 2014

**Permanent link:** https://projecteuclid.org/euclid.ejs/1408540293

**Digital Object Identifier:** doi:10.1214/14-EJS932

**Mathematical Reviews number (MathSciNet):** MR3263128

**Zentralblatt MATH identifier:** 1298.62045

**Subjects:** Primary: 62F12 (Asymptotic properties of estimators); Secondary: 62F35 (Robustness and adaptive procedures)

#### Citation

Sokol, Alexander; Maathuis, Marloes H.; Falkeborg, Benjamin. Quantifying identifiability in independent component analysis. Electron. J. Statist. 8 (2014), no. 1, 1438–1459. doi:10.1214/14-EJS932. https://projecteuclid.org/euclid.ejs/1408540293

#### References

• [1] Amari, S.-I. and Cardoso, J.-F., Blind source separation – semiparametric statistical approach, IEEE Transactions on Signal Processing 45 (1997), no. 11, 2692–2700.
• [2] Bartlett, M. S., Movellan, J. R., and Sejnowski, T. J., Face recognition by independent component analysis, IEEE Transactions on Neural Networks 13 (2002), no. 6, 1450–1464.
• [3] Beckmann, C. F. and Smith, S. M., Probabilistic independent component analysis for functional magnetic resonance imaging, IEEE Transactions on Medical Imaging 23 (2004), no. 2, 137–152.
• [4] Chen, A. and Bickel, P. J., Efficient independent component analysis, Ann. Statist. 34 (2006), no. 6, 2825–2855.
• [5] Comon, P., Independent component analysis, a new concept? Signal Processing 36 (1994), 287–314.
• [6] Comon, P. and Jutten, C., Handbook of blind source separation: Independent component analysis and applications, Elsevier, Oxford, 2010.
• [7] Csörgő, M. and Horváth, L., A note on strong approximations of multivariate empirical processes, Stochastic Process. Appl. 28 (1988), no. 1, 101–109.
• [8] Dudley, R. M., Weak convergences of probabilities on nonseparable metric spaces and empirical measures on Euclidean spaces, Illinois J. Math. 10 (1966), 109–126.
• [9] Eriksson, J. and Koivunen, V., Identifiability, separability and uniqueness of linear ICA models, IEEE Signal Processing Letters 11 (2004), no. 7, 601–604.
• [10] Hoyer, P. O., Janzing, D., Mooij, J. M., Peters, J., and Schölkopf, B., Nonlinear causal discovery with additive noise models, Advances in Neural Information Processing Systems 21 (NIPS), MIT Press, 2009, pp. 689–696.
• [11] Hyvärinen, A., Fast and robust fixed-point algorithms for independent component analysis, IEEE Transactions on Neural Networks 10 (1999), no. 3, 626–634.
• [12] Hyvärinen, A., Independent component analysis: Recent advances, Phil. Trans. Roy. Soc. Ser. A 371 (2013), 1–19.
• [13] Hyvärinen, A., Karhunen, J., and Oja, E., Independent component analysis, Wiley-Blackwell, New York, 2001.
• [14] Ilmonen, P. and Paindaveine, D., Semiparametrically efficient inference based on signed ranks in symmetric independent component models, Ann. Statist. 39 (2011), no. 5, 2448–2476.
• [15] Jung, T. P., Makeig, S., McKeown, M. J., Bell, A. J., Lee, T.-W., and Sejnowski, T. J., Imaging brain dynamics using independent component analysis, Proceedings of the IEEE 89 (2001), no. 7, 1107–1122.
• [16] Khoshnevisan, D., Multiparameter processes, Springer Monographs in Mathematics, Springer-Verlag, New York, 2002.
• [17] Lifshits, M. A., Gaussian random functions, Mathematics and Its Applications, vol. 322, Kluwer Academic Publishers, Dordrecht, 1995.
• [18] Novey, M. and Adalı, T., Complex ICA by negentropy maximization, IEEE Transactions on Neural Networks 19 (2008), no. 4, 596–609.
• [19] Ollila, E., Kim, H.-J., and Koivunen, V., Compact Cramér-Rao bound expression for independent component analysis, IEEE Transactions on Signal Processing 56 (2008), no. 4, 1421–1428.
• [20] Peters, J. and Bühlmann, P., Identifiability of Gaussian structural equation models with same error variances, Biometrika 101 (2014), 219–228.
• [21] Rudin, W., Real and complex analysis, third ed., McGraw-Hill Book Co., New York, 1987.
• [22] Samworth, R. J. and Yuan, M., Independent component analysis via nonparametric maximum likelihood estimation, Ann. Statist. 40 (2012), 2973–3002.
• [23] Shimizu, S., Hoyer, P. O., Hyvärinen, A., and Kerminen, A., A linear non-Gaussian acyclic model for causal discovery, J. Mach. Learn. Res. 7 (2006), 2003–2030.
• [24] Shimizu, S., Inazumi, T., Sogawa, Y., Hyvärinen, A., Kawahara, Y., Washio, T., Hoyer, P. O., and Bollen, K., DirectLiNGAM: a direct method for learning a linear non-Gaussian structural equation model, J. Mach. Learn. Res. 12 (2011), 1225–1248.
• [25] van der Vaart, A. W. and Wellner, J. A., Weak convergence and empirical processes, Springer Series in Statistics, Springer-Verlag, New York, 1996.
• [26] Vigário, R., Särelä, J., Jousmäki, V., Hämäläinen, M., and Oja, E., Independent component approach to the analysis of EEG and MEG recordings, IEEE Transactions on Biomedical Engineering 47 (2000), no. 5, 589–593.
• [27] Yuan, M., On the identifiability of additive index models, Statist. Sinica 21 (2011), 1901–1911.