Bernoulli

Volume 19, Number 5B (2013), 2524-2556.

Optimal rank-based tests for Common Principal Components

Marc Hallin, Davy Paindaveine, and Thomas Verdebout

Abstract

This paper provides optimal testing procedures for the $m$-sample null hypothesis of Common Principal Components (CPC) under possibly non-Gaussian and heterogeneous elliptical densities. We first establish, under very mild assumptions that do not require finite moments of order four, the local asymptotic normality (LAN) of the model. Based on that result, we show that the pseudo-Gaussian test proposed in Hallin et al. (J. Nonparametr. Stat. 22 (2010) 879–895) is locally and asymptotically optimal under Gaussian densities, and show how to compute its local powers. A numerical evaluation of those powers, however, reveals that, while remaining valid, this test has poor efficiency away from the Gaussian case. Moreover, it still requires finite moments of order four. We therefore propose rank-based procedures that remain valid under any possibly heterogeneous $m$-tuple of elliptical densities, irrespective of the existence of any moments. In elliptical families, indeed, principal components can naturally be based on the scatter matrices characterizing the density contours, and hence do not require finite variances. Those rank-based tests, as usual, involve score functions, which may or may not be associated with a reference density at which they achieve optimality. A major advantage of our rank tests is that they are not only validity-robust, in the sense of surviving arbitrary elliptical population densities: unlike their pseudo-Gaussian counterparts, they are also efficiency-robust, in the sense that their local powers do not deteriorate away from the reference density at which they are optimal. We show, in particular, that in the homokurtic case, their normal-score version uniformly dominates, in the Pitman sense, the aforementioned pseudo-Gaussian generalization of Flury’s test. Theoretical results are obtained via a nonstandard application of Le Cam’s methodology in the context of curved LAN experiments. The finite-sample properties of the proposed tests are investigated via simulations.

Article information

Source
Bernoulli, Volume 19, Number 5B (2013), 2524-2556.

Dates
First available in Project Euclid: 3 December 2013

Permanent link to this document
https://projecteuclid.org/euclid.bj/1386078612

Digital Object Identifier
doi:10.3150/12-BEJ461

Mathematical Reviews number (MathSciNet)
MR3160563

Zentralblatt MATH identifier
06254571

Keywords
Common Principal Components; local asymptotic normality; rank-based methods; robustness

Citation

Hallin, Marc; Paindaveine, Davy; Verdebout, Thomas. Optimal rank-based tests for Common Principal Components. Bernoulli 19 (2013), no. 5B, 2524–2556. doi:10.3150/12-BEJ461. https://projecteuclid.org/euclid.bj/1386078612


References

  • [1] Airoldi, J.P. and Hoffmann, R.S. (1984). Age variation in voles (Microtus californicus and Microtus ochrogaster) and its significance for systematic studies. Occasional Papers of the Museum of the Natural History, University of Kansas, Lawrence 111 1–45.
  • [2] Anderson, T.W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley-Interscience [John Wiley & Sons].
  • [3] Bentler, P.M. and Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, and directions. Annu. Rev. Psych. 47 563–592.
  • [4] Boente, G. and Orellana, L. (2001). A robust approach to common principal components. In Statistics in Genetics and in the Environmental Sciences (Ascona, 1999). Trends Math. (L.T. Fernholz, S. Morgenthaler and W. Stahel, eds.) 117–145. Basel: Birkhäuser.
  • [5] Boente, G. and Orellana, L. (2004). Robust plug-in estimators in proportional scatter models. J. Statist. Plann. Inference 122 95–110.
  • [6] Boente, G., Pires, A.M. and Rodrigues, I.M. (2002). Influence functions and outlier detection under the common principal components model: A robust approach. Biometrika 89 861–875.
  • [7] Boente, G., Pires, A.M. and Rodrigues, I.M. (2009). Robust tests for the common principal components model. J. Statist. Plann. Inference 139 1332–1347.
  • [8] Boik, R.J. (2002). Spectral models for covariance matrices. Biometrika 89 159–182.
  • [9] Browne, M.W. (1984). The decomposition of multitrait-multimethod matrices. British J. Math. Statist. Psych. 37 1–21.
  • [10] Chernoff, H. and Savage, I.R. (1958). Asymptotic normality and efficiency of certain nonparametric test statistics. Ann. Math. Statist. 29 972–994.
  • [11] Flury, B. and Riedwyl, H. (1988). Multivariate Statistics: A Practical Approach. New York: Chapman & Hall.
  • [12] Flury, B.K. (1987). Two generalizations of the common principal component model. Biometrika 74 59–69.
  • [13] Flury, B.N. (1984). Common principal components in $k$ groups. J. Amer. Statist. Assoc. 79 892–898.
  • [14] Flury, B.N. (1986). Asymptotic theory for common principal component analysis. Ann. Statist. 14 418–430.
  • [15] Flury, B.N. and Gautschi, W. (1986). An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM J. Sci. Statist. Comput. 7 169–184.
  • [16] Hallin, M. and Paindaveine, D. (2006). Semiparametrically efficient rank-based inference for shape. I. Optimal rank-based tests for sphericity. Ann. Statist. 34 2707–2756.
  • [17] Hallin, M., Oja, H. and Paindaveine, D. (2006). Semiparametrically efficient rank-based inference for shape. II. Optimal $R$-estimation of shape. Ann. Statist. 34 2757–2789.
  • [18] Hallin, M. and Paindaveine, D. (2006). Parametric and semiparametric inference for shape: The role of the scale functional. Statist. Decisions 24 327–350.
  • [19] Hallin, M. and Paindaveine, D. (2008). A general method for constructing pseudo-Gaussian tests. J. Japan Statist. Soc. 38 27–39.
  • [20] Hallin, M., Paindaveine, D. and Verdebout, T. (2010). Optimal rank-based testing for principal components. Ann. Statist. 38 3245–3299.
  • [21] Hallin, M., Paindaveine, D. and Verdebout, T. (2010). Testing for common principal components under heterokurticity. J. Nonparametr. Stat. 22 879–895.
  • [22] Hallin, M. and Werker, B.J.M. (2003). Semi-parametric efficiency, distribution-freeness and invariance. Bernoulli 9 137–165.
  • [23] Hettmansperger, T.P. and Randles, R.H. (2002). A practical affine equivariant multivariate median. Biometrika 89 851–860.
  • [24] Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. J. Educ. Psych. 24 417–441.
  • [25] Kreiss, J.P. (1987). On adaptive estimation in stationary ARMA processes. Ann. Statist. 15 112–133.
  • [26] Le Cam, L. (1986). Asymptotic Methods in Statistical Decision Theory. Springer Series in Statistics. New York: Springer.
  • [27] Le Cam, L. and Yang, G.L. (2000). Asymptotics in Statistics: Some Basic Concepts, 2nd ed. Springer Series in Statistics. New York: Springer.
  • [28] Muirhead, R.J. and Waternaux, C.M. (1980). Asymptotic distributions in canonical correlation analysis and other multivariate procedures for nonnormal populations. Biometrika 67 31–43.
  • [29] Paindaveine, D. (2006). A Chernoff–Savage result for shape: On the non-admissibility of pseudo-Gaussian methods. J. Multivariate Anal. 97 2206–2220.
  • [30] Paindaveine, D. (2008). A canonical definition of shape. Statist. Probab. Lett. 78 2240–2247.
  • [31] Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philos. Mag. 2 559–572.
  • [32] Rao, C.R. and Mitra, S.K. (1971). Generalized Inverse of Matrices and Its Applications. New York: Wiley.
  • [33] Satorra, A. and Bentler, P.M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis. In Proceedings of the Business and Economic Statistics Section of the American Statistical Association 308–313. Alexandria, VA: American Statistical Association.
  • [34] Shapiro, A. and Browne, M.W. (1987). Analysis of covariance structures under elliptical distributions. J. Amer. Statist. Assoc. 82 1092–1097.
  • [35] Van der Vaart, A.W. (2000). Asymptotic Statistics. Cambridge: Cambridge Univ. Press.
  • [36] Wilks, S.S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Statist. 9 60–62.